[MUSIC PLAYING] DAVID MALAN: This is CS50, and this is the start of week eight. And we're so excited to welcome back, big surprise, CS50's own Ramon Galvan, a rising senior who has been spending the past several months since July in LA, in Hollywood, literally working on a brand new TV show called Colony, the creator of which is actually a Harvard alum himself. And so we're very excited to see this debut on the USA network this January. So stay tuned for that, and for more Ramon for the weeks to come. Know now that the end is near. And what this means is that there's not all that much left of CS50, sad to say. We have just three problem sets left-- there's problem set six-- which is in your hands now or soon will be, due later this week-- is meant to bridge our worlds of the command line, where we've spent most of our time using C, and the world of web programming. Well, you'll see a lot of ideas borrowed from the command line work, but also a lot of new and interesting ideas that are also going to be germane for mobile applications and for technology, more generally, with which you guys are all familiar nowadays on laptops and phones and the like. So you'll implement not a web page, or a website per se, but an actual web server. You will write the rest of a web server written in C, whose purpose in life is to receive HTTP requests, those virtual envelopes we keep talking about, and actually respond either with some static content-- like a dot HTML file, or a dot JPEG or any other number of files, or even a PHP file whereby your web server is going to interpret that PHP code and spit out the results. 
Now, we've provided you with quite a bit of framework for it-- indeed the distribution code for problem set six is over 1,000 lines long, a lot of which is comments, to be fair-- but this is really meant to be an opportunity to get your hands dirty diving into a fairly large project that we've very specifically carved out pieces of for you, so that really when you exit CS50 and enter the real world of programming and want to dabble in any number of projects, you'll have much greater comfort downloading some source code, some open source project on the web, and diving in and making changes that you see fit. Problem set seven is going to be about making your own web-based application that takes dynamic input and produces dynamic output in the form of a etrade.com-like website. And problem set eight will focus on yet another language known as JavaScript. Meanwhile, the final project is on the horizon. The so-called pre-proposal is due a week from today. Pre-proposal-- per the specification, which is on CS50's website-- is a pretty casual opportunity for you to send a pretty succinct email to your teaching fellow just to apprise him or her of what you're thinking, to use him or her as a sounding board. And have a sanity check-- whether you're thinking about biting off too much or maybe too little, or maybe you have no idea whatsoever and want to engage in a conversation. Thereafter is a proposal and status report, the so-called CS50 hackathon here in Cambridge for Harvard and Yale students alike. The final project's implementation is then due. And then a CS50 fair here, in Cambridge, as well as another in New Haven. So the proposal, take a look at the website for those particulars. But more excitingly, too, is an opportunity to get your hands dirty, and your minds open to a whole bunch of topics and tools and techniques that are ancillary to the course's core syllabus, but nonetheless related. 
And also wonderful stepping stones to doing really cool final projects that go well beyond material we've covered formally in problem sets or in lecture. So go to CS50's website for the whole roster of seminars. If you don't register yet, that's fine. Go ahead and sign up still and we will follow up with a live streaming link, the day and time is on the website. And everything will be recorded and put online if you can't make the particular days and times. As to what lies ahead thereafter-- well, of course, there's the CS50 hackathon. This photo, recall, from week zero taken around 4 AM one evening in years past. The CS50 fair, which again will take place in both cities. And then, just to plant the seed, even though we still have a month plus left of semester, if you'd like to join CS50's own teaching staff, and you want to start thinking about becoming a CA, or teaching fellow, know that we'll start talking more about that later this semester. But pictured here is most of this year's team. And so, PHP-- and I was so sad last week that [? Allyse ?] kindly went to the effort of getting us these wonderful props that I didn't end up using, so it really just looked kind of stupid that we had a shovel sitting here all day last Wednesday, and a little spoon. But this was my metaphoric way of trying to paint the picture of why we're transitioning from C to a language like PHP. And the same could be said of any number of languages-- Java, Python, Ruby or bunches of others-- but whereas in C, for instance, writing a program in C might typically be like taking a spoon like this and digging a hole in the ground, in the sand or the dirt. PHP allows you to take much bigger bites out of the problem, writing far less code using a far smaller tool, because there's so much more functionality pieced in. Now, if we were really dramatic, we'd have something to shovel here, but so be it. 
Meanwhile, the other metaphor we came up with is, of course, you could use something like a wrench to hammer in something like a nail. But of course, the right tool to use is going to be not so much the language called C-- and now I just annoyed [? Sanders, ?] probably, we'll fix that later-- so the right tool to use often is not going to be this lowest level tool. And indeed, C is not a language that most of you are ever going to use, or should necessarily use again. And in fact, a little secret-- the only time I use C myself is pretty much between September and December of every fall semester. And that's because we use it as an opportunity to teach the fundamentals of programming, and with it computer science fundamentals, data structures, algorithms and the like-- but very quickly will you see now that the syntax and the ideas underlying C are so wonderfully transferable to more modern higher level languages, like PHP and Python and Perl and Java and Objective-C-- actually, not so much Objective-C-- but Swift, these newer languages that many of you will then dabble with for your final project. So without further ado, let's actually use PHP to solve some problems. Recall that early on, last week, we just used CS50 IDE, we wrote a dinky little program that just said, "Hello world." And then I saved it in a file called hello.php. And then I ran this command. And why? In English, what's going on here? What was I doing when I ran this command? Yeah? AUDIENCE: There's some function PHP that reads what's in-- understands that. DAVID MALAN: Good, there's some function PHP-- and let me be more specific, there's a program called PHP, a.k.a. an interpreter, that understands the contents of hello.php, and interprets it top to bottom, left to right, and does what those commands say. The commands in hello.php, of course, are just source code-- functions and variables and loops and the like, that we ourselves have started writing in PHP. 
But unlike C, which is a compiled language, PHP you just write it, and run it. You skip that middleman step of converting it to zeros and ones, and then running it. And so what is an upside of this? Why are we skipping the step? Why do more modern languages tend to skip this step? What was the benefit? Or just intuitively? Even if we've not written much PHP before, what's beneficial about not compiling your code do you think? No? Not committing? Scratching your head? Yeah. AUDIENCE: More dynamic. DAVID MALAN: More dynamic? What do you mean? AUDIENCE: [INAUDIBLE] DAVID MALAN: OK, good, so depending on the input, you don't have to compile it each time. And it really is as simple as that-- what is the point of continuing to compile your code? This is just a step that's making-- this is requiring, for the past several weeks, twice as many steps as just running your program. It's been useful in seeing that you see some error messages and so forth, but it's still just an annoying step. And so programmers realized over time, why don't we start writing languages that don't need that fairly mechanical step, so that you can just write your code and run it. But what was the price that we saw we paid last week, with one particular example? Yes? Speed. So interpreters are a little slower, in that zeros and ones are nice and fast for a computer to understand, because the Intel CPU, or whatever it is, just understands what's going on with those patterns of bits. Whereas an interpreter is a program that really has to read the ASCII source code that you have written, and convert it, so to speak, or figure out how it converts ultimately to zeros and ones. So it just takes a little bit of a performance hit. So it's a bit of a trade-off. Now if we do this over here, let me go ahead and do an example as follows. If I go in here, new file, I'm going to save this again as hello.php. 
And now I'm going to go ahead and say, "print hello world"-- and recall that I can use print, I don't have to use printf. And now down here, if I do PHP of hello.php, huh-- I don't seem to have interpreted it. What did I do wrong? AUDIENCE: The angle brackets. DAVID MALAN: Yeah, you need that angle bracket up top. So it's kind of annoying, but you get used to it quickly. If I have to write PHP code, I generally need to tell the program, or tell the interpreter, hey PHP, here comes some PHP code. And then for good measure, I would close this not with this, but rather with just question mark angle bracket, so that now down here, if I run this again, now I get the desired result. Now let's do a slight optimization, just so that you've seen it before. This is kind of annoying that I have to run PHP space hello.php, because in the past I could just write dot slash program name, which is kind of nice. It's kind of a better user experience. So it turns out you can do this in PHP with the following-- I can use this fairly cryptic incantation at the top here, which is generally called a shebang, whereby this is a sharp symbol, so to speak, this is a bang or an exclamation point. And this now is the path to a program on a typical Linux system that is called environment, or env. And this line-- long story short-- line one just says, hey computer, find the PHP interpreter for me in the environment, find it in your memory, so to speak. And what's nice now, is that if I go down here, I can do dot slash hello dot php, or-- hmm. Permission denied. Well, you'll see even more of this with problem set seven, if you haven't already, with permissions. It turns out that I need to execute this command called chmod for change mode-- a plus x hello.php. I need [INAUDIBLE] this one additional step which is telling my computer, make hello.php executable. And now watch what happens-- dot slash hello.php, it just runs. I don't need to specify the interpreter anymore. 
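As a rough sketch of the file being described here, the shebang line goes first, followed by the opening PHP tag and the print call. The `greeting` function is a hypothetical helper added only so the text can be reused; the lecture's version presumably just calls print directly.

```php
#!/usr/bin/env php
<?php

// Hypothetical helper (not from the lecture): build the greeting text.
function greeting()
{
    return "hello, world\n";
}

// print needs no format string, unlike C's printf
print(greeting());
```

Assuming the file is saved as `hello.php`, running `chmod a+x hello.php` once makes it executable, after which `./hello.php` should work without naming the interpreter.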
And I can make it even prettier, still, if I rename this thing. If I move hello.php to just Hello-- so notice in the top left, the program's name is indeed now just Hello. Now I can make it look like a C program, even though it's written in PHP-- or frankly any number of other languages. So marginal enhancement, no functional difference. But it's just a little curiosity now, so that you can write programs in any language, and the user doesn't have to know or care what those are. Well, let's look at a more compelling example now that I whipped up in advance. And this is called quote.php. And it's available online. And notice that it's pretty short-- but it's a command line program that's going to look up stock prices for me, which is actually going to be germane to problem set seven. So let's see what I'm doing. At the very top I've got the open bracket question mark PHP. Then I've got this line, whereby I am requiring a file called functions.php-- we're going to see more on this in a bit, but this is like C's version of sharp include, where you want to go include another file. PHP calls it require, though it also has an include function. And it turns out that functions.php is just something I wrote before class. I put it in the same directory, because I wanted to factor out some code that we might want to use elsewhere. Meanwhile, you can probably infer what's going on here. This is a little different from C-- but what do I mean by ensure proper usage? Translate this more technically. Under what circumstances am I quitting the program, or exiting? Yeah? AUDIENCE: When you don't have two command line arguments. DAVID MALAN: When I don't have two command line arguments. And remember that one of those arguments is the program's name itself. And the second is going to be another word I type after the prompt. So just like C, this is my way of checking, did the user cooperate and run the program as I intended? 
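The usage check described here might look roughly like the following sketch. The `usage_error` helper and the exact wording of the message are assumptions for illustration; the lecture's quote.php presumably tests `$argc` inline and exits.

```php
<?php

// Hypothetical helper: return a usage message when the argument count is
// wrong, or null when it is fine. $argc counts the program's own name, so
// "one symbol after the program name" means a count of 2.
function usage_error($count)
{
    if ($count != 2)
    {
        return "Usage: quote symbol\n";
    }
    return null;
}

// $argc is provided by the command-line interpreter; fall back to 1 elsewhere
$count = isset($argc) ? $argc : 1;
$error = usage_error($count);
if ($error !== null)
{
    // A real command-line program would likely exit with a nonzero status here
    print($error);
}
```

Note that there is no main function: the code at the top level of the file simply runs from top to bottom.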
Now, there's something a little different from C-- first of all we have this dollar sign, and what does a dollar sign denote in PHP? Just a variable. That's all-- just a variable followed by whatever you want to actually call it. Notice there is something missing from my PHP program, just like it was missing last week, versus C, which is what? Types, but also something else. There is no something function-- main function. There's no main function. You just start writing your code without having to worry about a fairly arbitrary convention of naming some default function main. So argc is just really a global variable that the interpreter makes available to me. Now, this is interesting. So look up stuff. Dollar sign stock is on the left, that's my variable. On the right hand side, there's apparently a function in PHP called lookup that I'm passing my last command line argument to-- whatever the word is. And we'll see how this works in a moment. And then lastly I'm reporting the price. I'm printing out one share of such and such. And remember, this is the way in PHP-- a way in PHP-- where you don't have to do the percent s anymore. You can just use curly braces and plug in some variable. You don't have to worry about using printf in the same way. And as an aside, when you put a variable inside of double quotes like this, you are using a fancy technique called variable interpolation. It just means plug the variable in here. And as an aside, some of you who come from other programming backgrounds, you may not use single quotes around strings to do this. You must use double quotes for variable interpolation to work. Otherwise you'll literally see those curly braces. So lastly, let's go ahead and run this. Let me make my terminal a little bigger. Let me go ahead and run inside of my quote directory. [? CDsource ?] [? AM ?] [? quote ?] 
PHP quote dot PHP, and I'm going to search for something like GOOG, which is its ticker symbol, and one share of its new name, Alphabet Inc, cost $717, as of today. All right, if we want to run this again, anyone have another stock ticker they want to look up? Microsoft I think is this one, MSFT-- $53. I think Yahoo is maybe that. And Facebook is that. So what is this program doing? The magic seems to be embedded in that lookup function. So let's take a quick look. It turns out that doesn't come with PHP, it's in functions.php. And we won't go through this in great detail, but notice the operative word here is that on line six of functions.php-- I literally say function. I specify the name of my function. I then specify any arguments, or parameters, I want that function to take-- no types. And then I implement it. And I'll wave my hand at the implementation, since it's fairly advanced right now, but we'll see it again actually in a week in problem set seven. But I can clean this up, too. I also included in today's code a version of quote, which has no dot PHP file. Because what is presumably at the top of the program called just quote? That so-called shebang-- the fairly cryptic incantation that says find PHP and then run it on my code here. All right, so that brings us to where we left off last time-- albeit with some more advanced examples. Any questions thus far about PHP or what we're doing? No-- all right. Yeah? AUDIENCE: Inside the HTML files, do you-- [? do you ?] [? just call it ?] a [INAUDIBLE] PHP file? DAVID MALAN: Good question. In a web context, which we're literally about to transition to, you don't use the so-called shebang at the top, because the web server-- often a program called Apache or Microsoft IIS, Internet Information Server, or any number of other web server software, knows that when it sees a dot PHP file, that it should run the interpreter on it. It doesn't look at that first line. 
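To make the function syntax and the variable interpolation from the quote example concrete, here is a minimal sketch. The `lookup` below is only a stand-in with a hard-coded table, and its symbol, name, and price are made-up placeholders; the lecture's real lookup lives in functions.php and is not reproduced here. `describe` is likewise a hypothetical helper.

```php
<?php

// Stand-in for the lecture's lookup(): note the "function" keyword and the
// untyped parameter. Returns an associative array, or false if unknown.
function lookup($symbol)
{
    // Placeholder data only, not real quotes
    $table = [
        "FOO" => ["name" => "Foo Inc.", "price" => 12.5],
    ];
    $symbol = strtoupper($symbol);
    return isset($table[$symbol]) ? $table[$symbol] : false;
}

// Hypothetical helper: double quotes plus curly braces plug values into the string
function describe($stock)
{
    return "One share of {$stock['name']} costs {$stock['price']} dollars.\n";
}

$stock = lookup("foo");
if ($stock !== false)
{
    print(describe($stock));
}
```

With single quotes around the string instead, the curly braces and variable names would be printed literally.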
So this first line trick is just when you're writing command line programs-- which we won't do super often, but it's our way of bridging our C examples to now our PHP. So let's indeed bridge this world from the command line world to the web by doing the following. Let me go ahead and draw over here for just a moment. So if we have a web server, or rather if we have my laptop over here, which I'll draw like this. And here we have the internet in some form. And then over here, we have a server in a building-- this is how the internet works-- and in here is a server with some lights maybe. What's actually going on between these two connections? So in this building is a web server. That's just a computer that's running some operating system, and some web server software-- maybe the free software called Apache, which CS50 IDE is running. So you can actually think of this building as being the building in which CS50 IDE is stored. That's where all of you have accounts, where all of you have your own web server running, all of you have your own unique URLs, as we started to discuss, and you'll see more in p-set six. Here's my laptop somewhere else on the internet. And so when I visit a URL that belongs to me, that internet traffic is going over to the server, the server's receiving an HTTP request-- like a GET index.html-- and it's replying with that web page. So that's the general paradigm. Whereas everything up until now today, everything was happening only in the confines of this building. I was using my laptop, but I was connected to CS50 IDE, so all of those programs I was running were inside of that server, itself. But now, let's start reusing PHP to write some actual programs that are served up by a web server. And to do this, I'm going to go into a whole bunch of examples that introduce this idea here. So this is kind of a fancy way of describing a programming paradigm. 
And in fact, as you exit CS50 or work on final projects, or take some follow on class, you'll start to see that the world-- especially having grown up with languages like C that are super low level-- realize that there's better ways of writing software. There are certain patterns you can follow, certain ways of organizing your files and ways of naming your functions, so that long story short, the world has come up with a whole bunch of acronyms and names for ways of programming. These are just techniques you might use. And one of them is called MVC, for Model View Controller. And this is just, for now, an overly complicated way of saying how you should lay out a PHP-based website, in our case. How do you organize your files, how do you organize your logic, in a way that makes it easier to write more complicated websites? And indeed, we'll quickly get there with p-set seven. So in the world of MVC, you're going to see that our code can generally be characterized as either model code, or controller code, or view code. And I'm going to oversimplify it as follows-- the controller is the brains of your program, it's where all of the interesting logic happens. So everything we've been writing thus far in class, is kind of like controller code-- it's controlling your program, your loops, your conditions, your functions and variables and all that. Views, now, are going to be a little more obvious in the world of the web. A view is the aesthetics of your website. It's what the user sees-- the images, the HTML tables, the HTML tags, and all of that, all of the fluffy aesthetic stuff that isn't that hard to write, but is just what you're generating, is the so-called view, the aesthetics. And model, ultimately, is going to be database stuff-- which we'll start diving into all the more this Wednesday. So controller is the logic, view is the aesthetic stuff, and model is going to be where we store our actual data. So let's look at this more concretely with the following example. 
I'm going to go into my directory here of today's source code-- all of which is available online. And I'm going to go into version zero. And here is-- let's call it the version zero of CS50's website. There's not much here at all. It's a very simple web page that's probably using what HTML tags-- just guess from past examples? What's that? H1-- probably for that big bold title, that logo up top, CS50. And what else is at play? Yeah? AUDIENCE: Unordered list. DAVID MALAN: Unordered list-- so the UL tag and maybe a couple of LI tags. And if you don't remember these, it honestly doesn't matter. These are fluffy sort of implementation details of HTML that you quickly look up and you're back on your way. We'll focus more on the programming ideas that are the juicier pieces. So let's just take a quick look at the HTML-- and indeed if I open up the view source here, yup, that's exactly what's going on here. There's a UL tag. Nested inside of that are two LI tags. And then I borrowed the URL of the actual syllabus here. And then lectures.php is apparently another dynamically generated page that's going to have, let's see-- ah, the first two weeks of lecture. So week zero and week one, let's look at this-- if I view page source, also super simple. These are leading to two pages called week0.php, and week1.php. So consider now what's happening. When I click on week0.php, my laptop is making a request for week0.php. The web server, a.k.a., CS50 IDE, is receiving that virtual envelope. It's seeing a message like, get week0.php. It is then interpreting the file, top to bottom, left to right-- the file called week0.php-- and spitting out the results. So inside of this file, week0.php, must be the controller logic that is generating this HTML, and we'll soon see that. But for now, let me click on week zero, and now we have Wednesday and Friday, and now we have the slides slowly from week zero. And you might recall this from way back when. 
So that's all this website is doing. So let's consider how it's doing this. I'm going to go back into the source code here, in CS50 IDE, and I'm going to open up index.php. At the top of this file is a bunch of comments. And then in the middle of this file, it turns out, is no PHP code whatsoever. Because if you don't have any of the open bracket question mark PHP tags, you're free to just put HTML. Because what the PHP interpreter is supposed to do, is when it reads this file-- top to bottom, left to right-- it only interprets code it sees between those angle brackets question mark. And anything else that it doesn't recognize as PHP, it just spits out. And HTML Is among the stuff it will just spit out. So this file could have been called index.html, but I'm naming everything dot PHP as a stepping stone. Lectures.php-- similarly underwhelming, it's just some HTML. Week0.php, similarly just some HTML. But now let's put on the proverbial engineering hat, and consider how we can improve this. It's not hard to do this, but I kind of devolved into copy and paste. And in fact, if I make week two, you know what I'm probably going to do? I'm going to go to week1.php, I'm going to highlight everything. I'm going to copy it, paste it into a new file called week2.php, tweak some URLs, and be on my way. So based on what we've seen in C already, this doesn't feel right, hopefully. Copy, paste rarely the right solution. So what can we start to do to improve this? Where are the opportunities for better design? By the time I get to week eight, it's going to be really annoying if I want to change the font of every one of my pages, or if I want to change the structure of the layout. So where's the opportunity for better design? Well, let's consider what's shared across all of these files. Here's week one, here's week zero, here's lectures.php, here's index.php-- what is the same and what is different, roughly speaking, in each of these files? Yeah? 
AUDIENCE: [INAUDIBLE] DAVID MALAN: OK, good. So there's a pattern, surely, whereby every time I choose lecture I, I should be generating a very similar looking page. And so perhaps I can leverage the fact that really, we deliberately numerically indexed our lectures-- if I can put even more words in your answer. And what is the only thing, really, that's changing between week one-- and let me scroll down so it's roughly in the same place-- so here is week zero, roughly at the top. Here is week one, week zero, week one, week zero. OK, literally if you know no program whatsoever, this is now just like a pattern matching game. So what's different? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Good, so the title is changing, ever so slightly. Zero is going, of course, to one. Same thing's happening in the H1 tag. And we don't quite see it as easily, because the URLs are a little long. But those URLs are changing slightly. But what's not changing is, dare I say, most of the contents of the page-- the HTML tag's the same, the head is the same, the title is almost the same, the body is the same, and almost everything else is the same except for those little tweaks. So how can we go about factoring some of this out? Well let me propose exactly that in the next version. So here in version one, I have the exact same files, plus a couple of others. Here's index.php-- and even if you've never seen PHP before, what am I probably doing to solve this problem-- based on what you see here? Yeah, is that a slight commitment? No? Yes, go on. AUDIENCE: [INAUDIBLE] DAVID MALAN: Yep. AUDIENCE: [INAUDIBLE] DAVID MALAN: I need you to speak just a little louder. AUDIENCE: [INAUDIBLE] DAVID MALAN: OK, good. And I think-- it was hard to hear you-- but I think what you're getting at is that the tags that were common up top, and the tags that were common on the bottom, have now been factored out, or relegated to what files? 
Header.php and footer.php-- and we're going to make some tweaks to address the concern you just raised about the numbers changing, for instance, if I heard you correctly. But that seems to be the gist of it. If there was a huge amount of redundancy at the top of the page, and a huge amount of redundancy at the bottom, let's literally just highlight and cut that content out, put it in a separate file-- just like the idea of CSS, where we factored out very similar aesthetics, put it in a separate dot PHP file, use the require mechanism-- which is like C's sharp include-- which is essentially like saying go grab the contents of header.php, and copy and paste them here. But what this means is that now in index.php, I have those two lines. In lectures.php, I also have those two lines. In week0.php, I also have those two lines. So now, if I want to change the title of all of my pages, or I want to change the fundamental structure, I can change it now in just one place, or two places-- header and footer, respectively. Now the code's starting to look a little more cryptic, right? But if you think about what the page is doing-- if I'm requesting week0.php, just like on the drawing over here-- when week0.php is requested, what does that mean? Literally, this file is requested by the browser. The web server-- a.k.a. CS50 IDE-- grabs this file, week0.php, and reads it top to bottom, left to right. On line one, it immediately encounters open bracket question mark PHP, require header dot PHP, and so what the PHP interpreter does-- that's built into the web server, because we preconfigured it for you-- it automatically goes into header.php, copies the contents, pastes them here. But then the interpreter encounters question mark close bracket, so it's all done thinking. Now it just blindly spits out lines two through seven, because it's just raw HTML. 
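The require mechanism described here can be sketched in a self-contained way as follows. So that the example runs on its own, it first writes throwaway header.php and footer.php files to a temporary directory; the directory name and the exact HTML are assumptions, and in the lecture these files simply sit next to week0.php.

```php
<?php

// Create placeholder header and footer files (assumed contents) in a temp directory
$dir = sys_get_temp_dir() . "/require_demo_" . getmypid();
if (!is_dir($dir))
{
    mkdir($dir);
}
file_put_contents("$dir/header.php",
    "<!DOCTYPE html>\n<html>\n<head><title>CS50</title></head>\n<body>\n<h1>CS50</h1>\n");
file_put_contents("$dir/footer.php", "</body>\n</html>\n");

// Capture the output in a buffer so it can be inspected before printing
ob_start();
require("$dir/header.php");   // spits out the shared top of the page
print("<ul>\n<li>Wednesday</li>\n<li>Friday</li>\n</ul>\n");
require("$dir/footer.php");   // spits out the shared bottom of the page
$page = ob_get_clean();

print($page);
```

Because the two required files contain no PHP tags, the interpreter just passes their contents through, which is the "copy and paste them here" behavior described above.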
Gets to line eight, and does that same magic again-- opening the file, grabbing the contents, and requiring them or pasting them right then or there. But I just alluded to a bug. This is a partial step backward, because if we look in header.php, I've kind of cut a corner. What feature did I give up in order to gain this arguably better design? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, I kind of cut a nontrivial corner. You pointed out that what was changing was the title, the number in the title, and the number in the H1. So my solution was, OK, let's just rename the page, and not deal with that problem whatsoever. So that's a partial step backwards for sure. But what is noteworthy here is that what I have done is otherwise factored out all the common stuff. And in footer.php, notice I factored out all of that, albeit lesser, common stuff. So I need to somehow now be able to take another step forward, and fix that title issue. So let's do that. Let me go into my second version here, which, again, has the same files except for one new addition. And it's a little more verbose, but let's see if we can tease apart what's going on here. So instead of requiring header.php, and footer.php, I seem to be only requiring one file-- called, of course, helpers.php. And let me stipulate now, what's inside of helpers.php is just a bunch of functions that I wrote, just like before. But I called it helpers.php. Now apparently, in lines three and 10, I'm calling two functions-- render header, render footer. Those don't come with PHP, I wrote those myself. And I put them in helpers.php. Now, we've only seen this syntax once, and it was super brief. But this is apparently an argument to render header, the function. Why do I know that? Well here's a close paren, here's an open paren. And of course, just like in C, anything between those parentheses is an input-- or an argument to the function. What is the data type of this argument, based on what I've highlighted? 
What do those square brackets indicate, based on last week? Yeah, it's an array-- specifically an associative array. And this syntax admittedly is a little funky, but this is just passing in one key value pair. The key is, quote unquote title, and the value is CS50. If we had done this in C, it might instead look more like this, just quote unquote CS50-- or actually it would be curly braces, or something like that in C, where the key is zero, and the value is CS50. But again, in PHP, even though the syntax is, again, a little weird, it allows you to pass in words instead of numbers to associate keys with values. So what does this all mean? If I go into helpers.php, let's look at this function. renderHeader.php, rather renderHeader is my function, and I know that because I see the function keyword here. This is new from C-- it apparently takes an argument called data-- but I could have called this anything, but I called it data, just to be a little clean-- and just take a guess, especially if you've programmed in some other higher level language before, something above C, conceptually. What does equal open bracket square bracket probably mean? Or what might it mean? We've not seen this in C. Yeah? An empty array. Specifically, this means that if the user does not call renderHeader with an argument, I'm still going to have an argument called data, but its default value is going to be an empty array. So it's just a nice convenience. I don't have to yell at the user, or say you used my function wrong. I can just give the user a default value, if I don't particularly care. Now this function, I'm going to wave my hands at. But this extract function allows us to pass these variables in data into header.php in the following way. And this is the last piece, I think, of funky syntax. Here is my new version of header.php-- it used to say, literally, open bracket title CS50, and that was it. And same thing for the H1. Now it apparently says something pretty funky. 
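The three ideas in this passage (an associative array argument, a default value of an empty array, and extract) can be sketched together as below. This is not the lecture's helpers.php: for simplicity the function returns a plain string rather than requiring header.php, and the "Untitled" fallback is an assumption of the sketch.

```php
<?php

// Sketch of a renderHeader-style function. $data defaults to an empty
// array, so calling it with no argument is allowed.
function renderHeader($data = [])
{
    // extract() turns each key into a local variable: "title" becomes $title
    extract($data);

    // Assumed fallback when no title key was passed in
    if (!isset($title))
    {
        $title = "Untitled";
    }
    return "title: {$title}\n";
}

// One key-value pair: the key is "title" and the value is "CS50"
print(renderHeader(["title" => "CS50"]));

// No argument at all: the default empty array is used
print(renderHeader());
```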
And let me simplify this for a moment as follows. This is what I've changed my title to be. However, it's getting a little ugly to constantly open brackets with PHP, and then use the print function. It turns out that PHP has a shorthand notation for this, which is just an equal sign, which is technically a function called echo instead of print, but it's the same thing, effectively. That just looks better. It's just syntactic sugar, if you will, that makes my code look a little better. But it turns out, and we'll see this again before long, we have to call this annoyingly long function called htmlspecialchars in PHP, because it turns out there are certain inputs that the user might give us, or that users might give us, that are going to break our site. But we'll see that next week with JavaScript. But for now, just know that this file, header.php, simply takes the title that I passed in, it makes sure it's safe to be injected into a web page, and it spits it out as my title and as my H1. So if I go into this version now, notice that lectures has its title back, week zero has its title back, and indeed, the HTML I'm generating is identical to what my first version was-- except for my whitespace, because I've started formatting my code a little differently. But I've generated all the code I care about. So let me pause for just a moment and see if there's any questions or confusion I've created. All right, so let's twist a little harder here to see if there's an opportunity for improvement. Helpers.php also had this function, called renderFooter. And what's noteworthy about renderHeader, and renderFooter? And again, for today's purposes, know that the extract function is just my way of passing arguments into header.php and footer.php. Sorry? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, I only changed the require line. So literally, I've committed the sin of copying and pasting, yet again. 
It's not a huge number of lines, but come on-- if I'm copying and pasting everything just to change one little word, and the one little word that Alan points out is footer here, versus header here. Otherwise, everything is identical, except for, of course, the function's names. So what could we do better? Well let me open up this version here, whereby in helpers.php, why don't I just get a little smarter about this? Write slightly more complicated code, but call it render? So what have I fundamentally changed? It takes an argument now-- two arguments, data still. And then what's the first name probably being used for, based on what you're reading here? Even if some of the syntax is still new. What is dollar sign template? Sorry? AUDIENCE: Header or footer. DAVID MALAN: Header or footer. So apparently, I decided that if the only thing that's changing is what template I want to print-- and by template I mean this is a blueprint for code that I want to output, but I want to plug in some values-- so if it's only header or footer, why don't I parameterize that and call the argument dollar sign template? And then this funky syntax allows me to create a path in a variable here. So dollar sign path is a variable. What does this syntax do, if you're familiar? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Exactly. If template is, quote unquote, header, or if template is, quote unquote, footer, that line there that I've highlighted, line eight, is simply taking that name, like header, and concatenating it with dot PHP. So we didn't have this operator in C. This dot operator is an amazing thing in PHP-- if you're familiar with JavaScript or Java, you can use the plus sign to do concatenation. In C, it is a pain in the neck-- and I'm so sorry, in p-set six, you're going to have to do this-- it is a pain in the neck to concatenate strings. Why? Well, because if you've got a string that's this long, and another string that's this long, you can't just plug them together. 
What do you instead have to do in C? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: You have to malloc memory, or use an array on the stack. And you actually have to make that array big enough to fit this plus this, plus the backslash zero. Then concatenate them together using strcat or manually with a for loop, or any number of techniques. And we show you a couple in p-set six. It's a pain in the neck. And this is truly what I mean about this versus this-- like C versus PHP. You just get so much more functionality for free, so that you can focus, ideally, on the fun part of coding, the project you want to solve, rather than the low level minutiae. So this just generates header.php or footer.php based on which one I call. And indeed if I go into index.php, notice all that's changed-- Instead of calling renderHeader or renderFooter, I'm calling render, followed by the name of the template that I want to do. And you'll see this, too, in problem set seven, whereby we allow you to use the same function to make bunches and bunches of different web pages. So rather than dwell too much more on those details-- which you'll see again in problem set seven-- let's look at now the beginning of a solution to a more interesting problem. Thus far, nothing we've done has saved data. In fact, the only time we've ever saved something in this class is when we had a very simple demo awhile back, whereby we used file IO in C, and I think I typed in my name, and Hannah's name, and Maria's name, or maybe Andy's name, and then we saved a CSV file-- comma separated values file. And we used fopen-- I think we used fprintf as I recall, and we saved a file. Now, that is the simplest form of a database. If you want to make a website for the Frosh IMs program, whereby freshmen can register for a sport, you ideally want to do something with that data. Last week, we did nothing with the data-- we just said, you are registered, not really. Or maybe I emailed the proctor, and that was it. 
But it would be nice if I could give that proctor a CSV file, like an Excel file. Or better yet, it would be nice if I could put those users' names and dorm names and all of that into a database that just lives on forever, until I choose to delete the data. A database that allows me to query information. And indeed, that's what a database is. We introduce today, and next week, too, a technology called SQL-- a Structured Query Language, which is another language. It's essentially a programming language, but for databases. And a database for now, just think of as a super fancy version of Microsoft Excel, or Google Spreadsheets, or Apple Numbers. It's generally a program that allows you to store a whole bunch of data in rows and columns, quite like you might in Excel. But what's nice, especially if we're not super familiar with Excel, is that SQL allows you to query this information by writing lines of code, so that even if your database has a million rows in it, you can find things super fast. In fact, Excel is particularly bad at large data sets. And in fact, up to a few years ago, it turned out Excel would only allow you to store up to 65,535 rows of data-- which sounds like a lot, but at the time I was a grad student, and I remember tripping over this because I was generating CSV files for my research and I wanted to analyze them quickly by just opening up in Excel. Of course, my computer just crashed, because I had more than 65,000 rows. But where did the 65,535 come from? What was Microsoft doing, presumably? If you're good with your powers of two? Yeah, they were using a 16-bit value to represent the row number. And two to the 16 is 65,536-- minus one, because you zero index, which means that was the most number of rows I could have. And it was just a design decision. By saving 16 bits, they limited me to roughly 65,000 rows, instead of 4 billion, which I could have had ideally. But for now, we're going to introduce this more in a web context. 
And what's nice about SQL is that even though it's pretty powerful and pretty sophisticated, it really boils down to four key operations, four key functions, if you will-- select, for retrieving data, searching for data; delete, for deleting data; insert, for adding rows to the database; and update, for updating data. So if you have ever used Google Spreadsheets, Apple Numbers, Microsoft Excel, you have executed, most likely, all of these operations as a human by just using your keyboard and mouse-- inserting data, using your eyes to select or search for data, or update data, or delete data. So what does this mean? Well, pre-installed in CS50 IDE is a program called MySQL. It's a free, open-source database that's super popular. Facebook, for instance, uses it to this day, among other tools that they use. And a lot of very popular websites use it in large part because it's fast, and because it's free. Though certainly alternatives exist. And some of you might dabble with alternatives for final projects. This is a screenshot, meanwhile, of a web-based tool called phpMyAdmin. It is a coincidence that this web-based tool is also written in a language, PHP, but what it's meant to do is give us a web-based interface to a database. Because MySQL typically is something, historically, you would interact with only with a command line. And it would be super annoying and arcane to have to type textual commands to select data, insert data, and delete data. So some people on the internet wrote a web-based program that just let us manage the data in our database. It's like double clicking on Excel, and running a web-based version thereof. And what you're going to use this for ultimately, next week and not in p-set six, is to build something called CS50 Finance, which is going to have a database of users, with user names and passwords, dollar amounts that they have in their bank accounts. 
It's going to be something you use to store the symbols and the quantities of stocks that users have bought using virtual dollars that you'll give to them. And it's going to allow users to register for your site, so that even your friends can tune in to your website and actually register, log in, and play around and try to find fault in your code, and try to find bugs in your website. And they'll simply register by adding themselves, effectively, via code you write to your database. For instance, this is a quick screenshot of what a database might look like. This was from one of last year's solutions-- this is like a mini Excel file, stored in our database, stored in this software called MySQL. On the left hand side, I've apparently given every user a unique number. In the second column, I've given everyone a user name-- my own among them. And on the right hand side, I've given them a hash. Now this is actually a password, but it's not a plain text password. It's an encrypted password, if you will, or a hashed password. Which we'll come back to before long. But if you've ever read an article about how your password at some bank or some website might have been compromised, it can generally mean one of two things. So this is just an excerpt of six users. All of you now can figure out via hacking or cracking what our six people's passwords are. But if you've ever gotten an alert or an apology from a company or website saying, sorry, a hacker broke into our database, you should probably change your password, what might that mean? Well, one, it could mean the company has been moronic, and has been storing your password in a column like this, unencrypted. Which means the adversary, who stole the database, literally knows your username and password. That's the worst possible scenario. And as you'll see in p-set seven, it's so easy to avoid. There is absolutely no excuse for that form of stupidity in today's internet. 
Two-- and we'll find some articles to testify to the fact that this still happens, nonetheless-- two, maybe the adversary stole this version of the database. Which is still kind of bad, because now they know that I have six customers, they know the user names of those six customers, and they know the encrypted versions, or the hashed versions, of those six customers' passwords. But any of you who might have done [? Hacker 2 ?] where you cracked passwords, or took a look at that version of the problem set, why is it still a little worrisome if the adversary knows your hashed passwords? AUDIENCE: Because they could enter the whole dictionary into the hash function. And if your password is a dictionary word, [? they can just match-- ?] DAVID MALAN: Exactly, the adversary can just write code, like some of you did for [? Hacker ?] 2, whereby you iterate over all of the words in the dictionary, or all possible combinations of A through Z and zero through nine-- which sounds like a lot, and it is. But for a computer, it's pretty darn fast. And in fact, that was the point of [? Hacker 2, ?] was to take stuff that literally looks like this, and reverse engineer what it actually was. So we'll look at how we can store this more efficiently. Turns out, thankfully in MySQL, there are going to be data types. And one of the fun parts about database design, to be honest, is actually deciding for yourself how should you represent the data? Should you represent a phone number as an int, like a big number, or a long? Or do you actually do it as a sequence of chars? And there can be very non-trivial impacts of this. In fact, one of the earliest, fun germane stories is when Mark Zuckerberg was building Facebook, it was originally written in, and still is largely written in PHP. And one of the biggest challenges they faced early on was scaling. 
When they kept adding school after school after school, to my knowledge, one of the original solutions was essentially to copy and paste some of the databases and some of the code, so that Harvard was running on its own server, and MIT was running on its own server. And this was why, for some of you who might recall, you couldn't have friends in other networks. You probably didn't have friends at MIT or Harvard 10 or so years ago, but you couldn't span networks for partly that reason. And one of the biggest challenges for Mark and for companies like Facebook is actually handling hundreds and thousands and millions of requests per second. So the things we'll start talking about this week are really going to be germane to writing good software, and popularly successful tools that can handle lots of users. So we'll talk about things like indexing and searching, but that is it for today. We will see you for more on Wednesday. [MUSIC - "SEINFELD" THEME] DAVID MALAN: You can add to it, and subtract from it. And you don't have to stick with some pre-determined amount of memory. Well, what's that going to be called? SPEAKER 1: Well, what's going on? SPEAKER 2: What do you mean? He's giving a lecture. DAVID MALAN: And we can use a function called malloc to memory-- SPEAKER 1: Why aren't his arms moving? SPEAKER 2: Well that's-- you know, that's normal. It's just like he has just big sausages hanging there. SPEAKER 1: That's normal? SPEAKER 2: Yeah, I think we just assume he accidentally replaced his deodorant with superglue.