[Week 8] [David J Malan] [Harvard University] [This is CS50.] [CS50.TV] Welcome back. This is CS50, and this is the start of week 8. A couple of opportunities this week, among them this talk here, at which some food will be served. For more details check out the slides that are online. And also another event this week by our own Thomas Carriero. He's one of CS50's former head teaching fellows who is now at Dropbox, and he's the guy who hooked us up with the you know what, so if you want more of that head to their talk this afternoon for Dropbox and more. CS50 lunch is this Friday. Do join us if you are able, 1:15, as usual, at Fire and Ice. And now we dive into something called Seminars. Our CS50 Seminars, recall, are these optional classes led by the teaching fellows and course assistants and friends of the course, folks from a group on campus called ABCD, which is a group of technophiles on campus, as well as a group called HCS, the Harvard Computer Society, undergraduates who are similarly interested in computing. This year's roster of seminars includes seminars on Android and iOS and JavaScript and PHP, Unix, Vim, and more, so realize that these seminars are coming up. If you'd like to RSVP for any of them head to that URL there. We will then post on the course's website the times and places once they are finalized. But know there's 5 years' worth of prior seminars available online, many of which are still very much current in terms of technologies you might want to play with for your final projects, so head there for some available videos thereof. CSS, those of you who are familiar with CSS already, what is it in a nutshell? What is CSS? It's cascading style sheets, and what does that mean? What does that do for us, CSS? All right, let's warm up with an easier one, HTML, hypertext markup language. What does that do for us? Anyone at all? It's getting really awkward asking these questions. HTML, hypertext markup language. Yes? No? 
[inaudible student response] Okay, good, it allows us to mark up text to display in a web browser. It's not a programming language. It's indeed a markup language, which means it instructs the browser how to display information, so the simplest incarnation of this as we've seen is something super simple like boldfacing, open bracket b closed bracket says make this text bold, and that's actually just one of many ways in which we can do that, and indeed, these days a better approach to stylizing your web page, making things bold and italics and centered and justified and the like, is not done via HTML tags alone but rather with a technique called CSS, cascading style sheets. This is a language unto itself. It too is not a programming language but— everyone, this is Dan, who keeps joining us today. Some technical difficulties. Not a problem. CSS allows us to stylize a page by setting what are called properties, so let's take a look at this by way of some basic examples. Let me go into the appliance today. I have the source 8 Monday directory in here, and I'm going to go into a directory called CSS where we have a whole bunch of files waiting for us right here, and in this folder we have, for instance, search0.html from last time. Now, recall with search0 we left on this note by sort of implementing Google or really just the front end for it a week or so ago, and notice that we had some new tags there. We had h1 for a big, bold heading, form, which allowed us to actually have an HTML form for user input. Action, what was the meaning of an action attribute on the HTML form tag? What was the meaning of this, action? I'll just do this today. Action is the destination to where the form is going to be submitted. The fact that that says action = "google.com/search" means that when the user clicks the submit button or the equivalent whatever form fields they filled out are going to be sent not to our server or our appliance but rather to that specific URL at Google. 
And the method it's going to use is called get, and get, for now, is just a technique for passing information along to a web server by way of the URL, so let's take a quick look back at how this works. Notice that there's an input whose name is q whose type is text and then a second input of type submit whose value is CS50 Search, and indeed, if we open up this file here, search0.html, it's a super simple form, and if I search for something like computer science and then hit enter or click on CS50 Search notice that what happens is beyond getting to Google I've specifically ended up at this URL at the top, google.com/search?q=computer+science, and computer science is obviously what I typed in. The + just means that's where a space character was, and it's done by the browser just to make sure that there's no confusion and white space in the actual URL. And then q, of course, is the parameter name. We haven't seen how we, the programmer, can actually access q yet. We can assume that Google knows what to do with this here, but we'll get there in due time today. But let me take a look instead at search1.html, which looks a little different because I decided that this form here was just a little lame. I mean, it's at the top left. There's really no aesthetics to it, and so I want to stylize this a bit more like Google, whose homepage, recall, even though you might not visit it that often, looks like this today on Halloween. If we instead open up version 1 of this file, search 1.html, I've centered it. Still pretty ugly, but at least now I've started to control the aesthetics of this page, not just the marking up thereof. Let's take a look at search 1, and there's really just one difference here, which might jump out at you, or maybe not, but what's the one line or snippet of difference? 
There's this style attribute, so it turns out that in HTML most elements, most tags can have a style attribute on them, and inside of that style attribute is a quoted string, and that quoted string is CSS. You can put cascading style sheet rules in there by specifying a property name followed by a colon followed by a value. This is kind of an unfortunate design decision some years ago that CSS is a language unto itself, but syntactically it's very different from HTML. In this case, we see that inside of my webpage, which is written in HTML, I have CSS inside of these quotes, and the convention for CSS is that you have what's called a property followed by, again, a colon, followed by the value of that property, so there's no equal sign. There are no additional quotes. It's just this colon-separated key-value pair, and text-align does exactly what it says. It aligns the text in the body of the page, which is really the guts of the page, in the center. Okay, the end result then, to be clear, is this. Not all that much sexier, but at least it's centered and a little more like the real Google. But what if I instead open up version 2 of this and point out down here a new tag altogether? Now in the head of my page, which previously only had which tag in all prior examples? It just had this, the title. A moment ago the head tag looked like this. Now instead it has a style tag inside of it, and this too, I apologize, syntactically looks very different from HTML, but you get used to it, whereby inside of the style tag I can now factor out what was a moment ago an attribute, the style attribute, and I can put it at the very top of my page. Why? Well, this is a step toward cleaning things up, much like in writing C code we would sometimes write functions to factor out common functionality. It's just a little cleaner to start factoring out things like the aesthetics to one central location rather than having it all interspersed throughout your HTML. 
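As a rough sketch of the step just described, a page along the lines of search2.html might look like the following. The lecture's actual file is not reproduced here, so the exact markup, the title text, and the full form of the Google URL are assumptions.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>CS50 Search</title>
    <!-- The style that used to live in the body tag's style attribute -->
    <style>
      body {
        text-align: center;
      }
    </style>
  </head>
  <!-- The previous version would instead have used:
       <body style="text-align: center"> -->
  <body>
    <h1>CS50 Search</h1>
    <!-- Submitting sends the field named q to Google via the URL (GET) -->
    <form action="https://www.google.com/search" method="get">
      <input name="q" type="text">
      <input type="submit" value="CS50 Search">
    </form>
  </body>
</html>
```

The page renders the same either way; the only difference is where the one CSS property lives.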
This too does what it says, even though there's a bit of new syntax. This here is a selector, and body just means select the body element and apply the following properties to it. Well, the property is exactly the same. For good measure I've added a semicolon at the end, which tends to be convention, and I've wrapped this whole property in curly braces because I could actually have different things here. I could actually say something like color: blue; Now this too is not going to be a step toward anything all that prettier, but if I now go back to version 2 I've at least now made the body of my page's text all blue. The button stays the same because that's an input. It's not pure text. But everything else that is text, like CS50 Search up top, is in fact blue. Again, all we've done now is remove from the body tag, notice, the style attribute, and we've factored it out here. This isn't a huge improvement, but if we take this one step further notice what we can do in this third version here. In search3.html the webpage is almost identical except for what new tag now? Link, so this one is not very aptly named because you're not linking in the sense of a clickable hyperlink. Rather, you're sort of doing the equivalent of #include in C whereby the link tag with an href attribute and a rel attribute says go ahead and copy paste the contents of a file called search3.css right here, essentially. It doesn't quite do that, but that's the spirit of it. It says go open that file, search3.css, and treat it as though the user had typed it right here in the head of the page just like I did in the previous example. Search3.css, meanwhile, is pretty simple. It really just contains exactly what was a moment ago in the style tag, but I've factored it out here to its own file. 
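A minimal sketch of that last step, assuming the stylesheet holds the same two properties that were in the style tag a moment ago; the file names come from the lecture, while the surrounding markup is an assumption.

```html
<!-- search3.html (sketch): the head now points at an external stylesheet -->
<head>
  <link href="search3.css" rel="stylesheet">
  <title>CS50 Search</title>
</head>

<!-- search3.css would then hold only the rules themselves, with no
     surrounding style tag:

body {
    color: blue;
    text-align: center;
}

-->
```

Any other page on the same site could carry the same link tag and pick up the same styles.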
Even though we haven't spent much time at all in HTML or web programming just intuitively what's the motivation, perhaps, for factoring out this small snippet even of CSS into its own file and then including it with this link tag here? [inaudible student response] Okay, it's easier to read in the sense that you have your CSS in a CSS file. You have your HTML in your HTML file, so it's more readable in that sense. What else might be compelling? Yeah. [inaudible student response] Yeah, so you can include it many times, so right now we're doing these basic examples with individual files, but suppose you're actually making a real website like you will for pset 7 or your final project perhaps, and you want to have multiple webpages, as is certainly common on the actual World Wide Web, and it would be kind of lame to have to copy and paste the same blue color and the same text aligned center in every one of those pages. Rather it makes more sense to factor out, much like we've done in C with the .h file, put it in one central place, in this case search3.css, and then allow any file in your website to actually include that file by way of this tag here in line 16. As is typically the case, we started with version 0, which kind of works but isn't necessarily the best, and with each step, search 1, search 2, and now search 3 we've taken these baby steps toward designs that are a little cleaner and are more preparatory for more complex pages that we might do down the road. Let me open up one last example here just to show an even more stylized page, but first let's look at the HTML. This is search4.html, and notice that structurally it's almost the same except for the introduction of a new tag, div. Div is a tag that introduces a division of the page. You can think of this as an invisible rectangle. It sort of creates a swath of area in the webpage that you can stylize all at once. What I've done here is as follows. 
Inside of my body tag, which has been there all along, I'm saying create a division of the page here via lines 45 through 47, and that means essentially give me an invisible rectangle along the top of the page. Then give me a second rectangle, albeit invisible, below that, and identify it by the name content, and then lastly, give me a third division of the page at the bottom called footer. We'll see why I've done this in just a moment, but conceptually I have a header division. I have a content division, and I have a footer division of the page even though these are just in markup. The user is not going to see 3 rectangles, but sort of structurally there behind the scenes they're actually present. Now, who cares? Why actually do this? Everything else on the page is the same as we've seen before. Here's my form. Here's my input, my input, a line break and so forth. Here's an image, though, so we'll see where this came from in just a moment. Here's a footer, which is new, just because I wanted to introduce some more content here. If we scroll up notice that the ID of this div is header. The ID of this div is content, and the ID of this one is footer. And as the name suggests, when you have an ID attribute in HTML, by definition it must uniquely identify one of the elements, one of the tags in your page. The burden is totally on you to remember that you have a header ID already. You have a footer. You have a content ID already. The computer is not going to figure out what an available ID is for you, so you could accidentally give 2 tags an ID of header, and that would just be wrong. You have to keep in mind what you have created, but once you've done that notice what we can do here. I can now specify in my style tag at the top or equivalently in my CSS file, if I was still using that version, I can say #header, and what that refers to is whatever tag in this webpage has an ID of header, where #, just by human convention, represents ID. 
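The structure just described can be sketched roughly as follows. The three IDs are the ones named in the lecture; the property applied to #header, the placement of the image, and the footer text are assumptions made only for illustration.

```html
<style>
  /* "#header" selects the one element whose id attribute is "header" */
  #header {
    text-align: center;
  }
</style>

<body>
  <!-- Three invisible rectangles, each identified by a unique id -->
  <div id="header">
    <img alt="CS50 Search" src="logo.gif">
  </div>
  <div id="content">
    <form action="https://www.google.com/search" method="get">
      <input name="q" type="text">
      <br>
      <input type="submit" value="CS50 Search">
    </form>
  </div>
  <div id="footer">
    placeholder footer text
  </div>
</body>
```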
The sharp sign or pound sign represents ID. Header is the name that I gave it. This means apply this CSS property to whatever tag in this page bears an ID of header. Same deal here. Apply this property, which happens to be the same, to any element whose ID is content, and then down here notice I got a little fancier with footer. Any element whose ID is footer, of which there can be just one by definition, go ahead and make its font size smaller, its font weight bold, its margin 20 pixels. What does that mean? It's just a margin on the top, the bottom, and the left and the right. This means give me a 20-pixel invisible margin around it just to push everything else away from it a little bit, like you might do in Word, Microsoft Word or Pages or the like. And then text align center. Let's see the end result, and then we'll go back up to the one remaining snippet of CSS there. This is version 4, our last for the search examples, and it's much, much sexier. Now, in fairness, I just Googled "google font logo generator." And that allowed me to create a GIF, an image format, which looks like that there. In fact, you can do this too. We have "google fonts logo generator." Let's see if we can do this. Okay, I think this is the website I used. We can say Ec 10, for instance, and make them their own. You can play with this all day long and then right click on it and then download the actual GIF, which is all that I did. And indeed, that's why in my HTML, recall, over here I had an image tag, which we saw briefly last week whose source is logo.gif. And what again was the motivation for having this alt attribute, this alternative attribute? Yeah. [inaudible student response] Good, so 2 reasons really, if the browser can't pull up the image because you have a slow network connection or the image is corrupted or something like that at least the human can see "CS50 search," and then also for accessibility reasons. 
If you have a user who is blind and is using a screen reader and therefore obviously can't see images they can at least hear text if their computer speaks it to them. In general, this is best practice when it comes to the accessibility of pages so that even users in that situation can hear or see, so to speak, what it is that's on your page. There's one other thing that I did here which is a little interesting, and we'll see more about this in problem set 7 via one of the shorts led by one of the teaching fellows. But #content refers to the tag whose ID is content, but then there's a space character, and then there's the word input. Well, what's interesting about CSS is that you can refer to tags in a page sort of hierarchically, and what this snippet of CSS means is find the tag whose ID is content, and then apply the following properties to all of the input tags that are descendants of content, that is that are nested inside of it. Indentation, again, is only important to the human, not to the computer, but by convention we indent things as we go deeper into a page, so this means apply a margin of 5 pixels to any input element that's somewhere inside of or nested inside of the element whose ID is content. Who does that apply to? Well, there's actually just these 2 guys here. Notice that inside of the form there are 2 inputs, as there have been for all of these examples. But notice that those 2 inputs happen to be nested inside, albeit a little deeply, a couple layers of indentation, inside of the tag whose ID is content. What does this mean? If we go to the browser here you can see ever so slightly, if I zoom in, that there is a bit of space between the button and the text field. Let me temporarily turn that off. Let me go up to my CSS, and let me go ahead and just change this margin from 5 pixels to 0 pixels. Let me go ahead then and save the file, go back to the search engine and reload, and watch the middle of the page. 
Everything got compressed together, and when I first whipped this example up I thought that looked stupid with the text field and then the button immediately below it. I wanted to pad it a little bit, so I introduced margins. What we won't do in lecture is go through the several dozen CSS properties that exist because, again, there are things like font size, font weight, margin, text align, and a few dozen others, and we'll refer you in problem set 7 to various tutorials online and references that allow you to pick these things up. But what's really important at the end of the day is to understand how these things are applied. Again, if we have the style tag inside of which can go the selectors, the sort of identifiers that specify to whom do you want to apply these properties, and then you put the properties as key value pair separated by a colon and then ended with a semicolon, or you can rip all of that out and put it in a separate CSS file unto itself. All right, any questions on the concepts or the big picture of CSS? You'll again see more of it in pset 7, but we'll keep it generally pretty simple. No? All right. It's time for an actual programming language, and we'll come back to a little bit of CSS in the form of an example. PHP is actually a wonderfully accessible language in that it is syntactically almost equivalent to C. In other words, if you know C, you know for the most part PHP, at least syntactically, even though there are some new features and some new concepts we'll have to look at. But for the most part, now that we transition from C to PHP most of the new stuff is really in the big picture, how you use a language to program on the Web as opposed to at the command line or in a blinking prompt as we've been doing thus far. For reference, especially with pset 7 and the final project onward, do take advantage of this URL here if you'd like to read up on the formalities of PHP. 
It's actually like a free online textbook effectively, and you'll also find that what's really nice about PHP is that there are hundreds of functions that come with it, whereas in C you didn't necessarily have access to more functions than were in the math library, the CS50 library. In PHP and a lot of modern languages, Python and Ruby among them, you get access to so many more functions, which means you get to write a lot less code because you can stand on the shoulders of other people who have already written certain things for you. Let's take a quick tour of the syntax of PHP and then write a few examples. What's nice about PHP first and foremost is there's no main function. If you want to write a program in PHP you just start writing code, and you don't have to worry about main. There's no int. There's no return. There's no argv, argc that's required when you write the program. Rather you can just start writing code, and this is in part because PHP is what's called an interpreted language. C was compiled, and it was compiled in the sense that you start with source code, run it through Clang, which is a compiler, and eventually after some number of steps you get object code, 0s and 1s. PHP and Python and Ruby and Perl and others are different types of languages in that you don't compile them. You don't go from source code to 0s and 1s. You just run the source code, and you run the source code by writing in a usual text file, ending in .php in this case instead of .c, and what the program does on your computer is it literally interprets your code line by line by line. In other words, rather than write a program and run the program directly you instead write a program with a file ending in .php. 
Then you run an actual program called php.exe, if you're on Windows, or just PHP if you're on Mac OS or Linux, and you provide as input to the PHP program your own source code, and its purpose in life is to read your code top to bottom, left to right, and do whatever you've told it to do. Let's see what this is going to mean syntactically. In PHP we have conditions. This slide is identical to what you saw back in week 1 because syntactically conditions, ifs and else ifs and else in PHP look exactly like this. When it comes to boolean expressions they're going to look exactly like this. When it comes to anding things together as booleans it's going to look exactly like this. Switches look the same, and you get the added benefit in PHP that switches in C could only switch on a char or an int. You could not switch on a string value. In PHP you can actually have an expression that is a variable whose contents are a string, and you can actually do string comparison in the real intuitive way, not pointer comparison, in order to decide whether to do case i or j or something else. We'll see that potentially before long. Loops too wonderfully are the same. For loops have an initialization, a condition, and some number of updates. While loops also exist in PHP. Do while loops also exist in PHP, and arrays exist in PHP, but here's where the syntax starts to get a little different, but the concepts are the same, and the concepts really are the same as they were in Week 0 with Scratch. First and foremost is the $ sign. This was a design decision in PHP whereby any variable in PHP by design starts with $ sign. There's no more X, Y, Z. It's now $X, $Y, $Z just because. It's something to keep in mind, and now on the right-hand side this looks similar to an array, but we're using square brackets here. 
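A small sketch of the conditions, switches, and loops just described. The function name kind_of and the values it returns are invented for illustration, not taken from the lecture's slides.

```php
<?php

// Hypothetical helper: switch on a string value, which C's switch
// could not do.
function kind_of($language)
{
    switch ($language) {
        case "C":
            return "compiled";
        case "PHP":
        case "Python":
        case "Ruby":
            return "interpreted";
        default:
            return "unknown";
    }
}

// Loops read as they do in C, apart from the $ on variable names.
$total = 0;
for ($i = 0; $i < 3; $i++) {
    $total += $i;
}

$j = 3;
while ($j > 0) {
    $j--;
}

do {
    $j++;
} while ($j < 2);

printf("%s %d %d\n", kind_of("PHP"), $total, $j);
```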
In PHP and in JavaScript, as we'll eventually see, to declare an array you do open square bracket and closed square bracket, and then you have a comma separated list of values, whether ints or strings or chars, whatever you want, inside of that expression there. Now, how did we do something like this in C? What was the syntax for statically declaring an array of known numbers? It was curly braces, so minor difference here, but in both PHP and eventually JavaScript it just uses square brackets, so really the only interesting detail here is the $ sign for the variable name and also the square brackets, and there's one curious thing that's been omitted as well on the left-hand side of the = sign. What's missing that we've been requiring for weeks now? Yeah. [inaudible student response] The size, so there's no mention of the size of the array. Frankly, there's no mention of square brackets on the left side of the = sign, and what else is missing from the line? Yeah. [inaudible student response] The type, so what's interesting in particular about PHP is that it is not a strongly typed language as C is, and that's strongly typed in the sense that you must say char, you must say int, you must say float. Anytime you want a variable you have to tell Clang what its type is. PHP is a little lazier. It's loosely typed in the sense that you can have floats and chars and strings and ints and so forth, but the language itself doesn't really care what you put inside of a variable. You do not have to inform it in advance what data type is going in a variable. It's entirely up to you, so this is nice in that you don't have to worry as much about data typing and worrying what your arguments are and so forth. This also means eventually functions in PHP are going to be able to return an int most of the time, and maybe once in a while they'll return a bool, a boolean false, for instance, to signify that something went wrong. 
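To make the last few points concrete, here is a brief sketch; the variable names are illustrative. It uses PHP's built-in array_search as one real example of a function that returns an int in the usual case and boolean false when something goes wrong, and it assumes a PHP version that supports the square-bracket array syntax.

```php
<?php

// Square brackets declare the array; no size and no type are stated.
$numbers = [1, 2, 3, 4];

// The same variable can hold a string now and an int later.
$s = "hello, world";
$s = 50;

// array_search returns the index where the value was found,
// or boolean false if the value is not in the array.
$found = array_search(3, $numbers);
$missing = array_search(9, $numbers);

// === compares type as well as value, so index 0 is never
// mistaken for false.
if ($missing === false) {
    printf("9 is not in the array\n");
}
printf("3 is at index %d\n", $found);
```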
This gives us some upsides, but it also will make us sort of by design a little bit lazier when it comes to data typing. What else is there to keep in mind here? Variables look quite like this, so $s = "hello, world." That's perhaps inferable from the previous example, and we have another type of loop. This one we'll actually see once in a while since it's quite handy, a foreach construct. In this case, the foreach loop takes inside of its parentheses 3 words typically, $ something first, which is what array do you want to iterate over the members of, then literally the keyword as, and then lastly, another variable name that you get to choose. It can be foo, bar, or element, and what this construct does is if the $array contains 10 elements on every iteration of this array—sorry, on every iteration of this loop the variable called element is going to be updated to be the first element in the array, then the second element in the array, then the third element of the array, thereby obviating the need to do the slightly annoying square bracket notation and $i in order to index into an array. PHP does all of that work for you and on every iteration just hands you the next element from the array without you having to know about or care about its numeric index location. And then lastly, for now, there's one other feature of PHP that's going to be hugely useful, especially when we start programming on the Web, and that's known as an associative array. The arrays that we know thus far as of 20 seconds ago and for the past 8 weeks are numerically indexed arrays, sort of traditional arrays where the indices are ints, 0, 1, 2, all the way on up. Associative arrays are a lot more powerful. They allow you to have arbitrary keys, arbitrary indices and arbitrary values. Whereas in a traditional array it's 0, 1, 2, in an associative array you can have an index or a key of foo whose value is bar. You can then have another key whose name is baz and whose value is qux. 
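The foreach construct and the associative array described above might be sketched like this. The foo, bar, baz, and qux names are the lecture's; the other variable names, and the key-and-value form of foreach in the second loop, are additions for illustration.

```php
<?php

// A numerically indexed array, iterated without any $i or brackets.
$array = [1, 2, 3];
$sum = 0;
foreach ($array as $element) {
    $sum += $element;
}

// An associative array: the keys are strings rather than 0, 1, 2.
$words = ["foo" => "bar", "baz" => "qux"];

// Look up a value by its key.
printf("%s\n", $words["foo"]);

// foreach can also hand back each key along with its value.
foreach ($words as $key => $value) {
    printf("%s => %s\n", $key, $value);
}
```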
Again, stupid computer science generic variable names here, but the point is that this array does not have bracket 0 or bracket 1. It's instead going to have bracket foo and bracket baz. This is a lot more versatile in that we're going to be able to associate words with other words, keys with values completely arbitrarily, and we're going to be able to get those values back in constant time because underneath the hood what an associative array really is is a hash table. Recall that a hash table allows you to put in some input like put in the word David if you want to insert David into some kind of dictionary, and then you get back some value typically. In the case of speller, true or false. David or whatever word is in or is not in the dictionary. An associative array is really just a hash table, but it's a much more user friendly incarnation of it. As we'll see, it's going to allow us to do some things very, very easily. Let's take a look at some basic PHP examples and see what we can do with this language. Let me go ahead and open up in our source directory today a file called hello1.php. This file is more comment than it is actual code, so let me actually remove all of the comments from the file and present to you perhaps the simplest PHP program right here. 5 lines, and some of those are white space, so notice some key differences here. The file is called hello1.php. The very first line, though, is <?php, which marks the start of my PHP code, and the closing ?> means that's it for my PHP code. Let's see how to run this. I'm going to go back to my terminal window here. I'm going to go into my PHP directory. Notice that we have a whole bunch of files, the first one of which is hello1.php. Let me go ahead and run this, hello1.php, enter. Permission denied. Okay. How have we fixed things like this in the past? What's that? [inaudible student response] We need read and write, but let me do ls -l. Remember this somewhat cryptic output whereby hello1 seems to be readable and writable by me but readable by everyone else. 
It turns out this actually isn't a step in the right direction. The difference, again, with an interpreted language is you don't run the program directly. You instead run an interpreter and hand it the code that you've written so it can interpret it line by line. In this case, the interpreter or program I actually want to run is literally called PHP. Somewhere on this hard drive of the appliance there is a program someone else wrote called PHP, or on Windows php.exe. What I'm going to do here is I'm going to actually run PHP but give it as a command line argument the code that I wrote, and then I'll zoom out and hit enter. It runs my program for me, top to bottom, left to right. Let me go ahead and open up a slight variant of this. In hello2.php notice that this too is mostly comments, so let me get rid of those as a distraction, and what's clearly different now about this file? There's this new line, somewhat cryptic at the top. In line 1 it's #!/bin/php. Bin is a convention on Linux and Mac OS for binaries, so /bin means this is a folder containing a bunch of binaries, that is, programs, one of which is PHP. The #! is nicknamed shebang, which is the quick way of saying it, and what this means is that when you run this program now there's a hint at the top of the file that tells the computer what interpreter to use. It would get a little annoying if you had to tell your users and your customers "Hey, we wrote this program called hello1.php. All you have to do is forever run PHP and then the name of this program." Frankly, it would just be nicer to run hello1.php, and indeed, we can if we do the following. Let me go ahead and do ls -l, and notice in hello2 it's still just read write and then read read, so I cannot yet do this, hello2.php. But we introduced this ever so briefly last time, the chmod command. If I do chmod a+x, which means all plus executability, and then hello2.php and then do ls -l again notice what changed. 
One, Linux is showing me the file name in green to convey the idea that it's executable, but more importantly, on the left-hand side notice that the bit representing x for executable has now been set. What this now means is I can run ./hello2.php as usual, hit enter, and because of the shebang at the very top of the file that's a hint, again, to Linux that says use this interpreter to run this file. Don't worry about forcing the user to actually type it. And what's nice now is it's kind of irrelevant to my customers or my friends what language I wrote this program in, so I can go ahead with mv and rename this thing to hello2, for instance. And now if I do ./hello2 and zoom out my program continues to run. These file extensions are a human convention that's necessary for something like Clang and Make, which look for them. But for PHP, I could call this file extension anything I want. I could trick the world into thinking that I'm really good at Ruby, and I could write hello2.rb and then run this, and voila, now I have the Ruby version, which is a complete lie. But the file extensions are meaningless if the file is executable and has this special hint at the top of the file. Now, as an aside, let me show you quickly version 3, which is sort of a useful trick to know. In hello3 I did something slightly wrong, so I'll update the source code online. In version 3 it turns out that on most Linux computers there's a program called env for environment, and what you can do here is if you have no idea where PHP is installed on the local hard drive, because indeed it could vary based on the computer that someone is using, env just says run env, which is on most systems, and figure out where PHP is. Just a common trick so you don't have to worry about finding out where a program is. But if you do care to find out where a program is and you haven't cared thus far you can use the which command. Let me zoom out and type which php, and notice it tells me it's actually in /usr/bin/php. 
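The env trick from version 3 can be sketched as follows. The lecture's hello3 file is not shown, so the exact shebang path and the program body are assumptions; /usr/bin/env is simply the location where env is commonly found.

```php
#!/usr/bin/env php
<?php

// To run this file directly rather than via "php hello3.php":
//   chmod a+x hello3.php
//   ./hello3.php
// env looks up php wherever it is installed, so the script does not
// need to hard-code a path such as /bin/php.
$message = "hello, world\n";
printf("%s", $message);
```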
It's kind of a lie. It's also in /bin. It's just showing me the first hit. If you ever wondered where Clang is, which clang, that's in /usr/bin/clang, which make, /usr/bin/make, and what that means is all this time you could have been typing /usr/bin/clang enter to run Clang, but it's kind of tedious to do that, so some folders like /usr/bin and /bin are assumed to be defaults so the computer knows to look in them for you. Any questions on writing a super, super simple Hello World program in PHP and then running it? Because now we'll start to introduce more compelling syntax. All right, here we go. We've actually seen all of these programs before. If I open up, for instance, let's do beer1.php, we won't go through several versions of this, but what I did was I sat down and ported it, or converted my C code to PHP code here. Most of the top of the file is comments up here. It turns out there's one new function we need called readline. GetString, recall, from Week 0 onward was a CS50 thing. PHP comes with its own user-friendly function called readline that takes 1 argument which specifies the prompt that you want to show to the user, and what readline does is it returns whatever the user types in. In this case, I'm declaring a variable called $n. I'm storing in it the return value of readline after prompting the user with this string. Just to back up, to actually run this thing, let me go ahead and run php beer1.php. How many bottles will there be? Let's just do 2 this time. Enter. That's all. The program is functionally identical to the C version from weeks ago. But syntactically let's see what's different. After I get an int from the user notice that I'm doing some error checking, and if n is less than 1 I quit and I print out a sorry message to the user and exit with 1. This too is a little different. In C what did we do? In C we returned 1. In PHP you exit with 1, which frankly is I think a little more intuitive because you're literally exiting the program.
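As a rough sketch of the readline, error-check, and exit pattern just described, something like the following might do. The helper names parse_bottles and sing are hypothetical and are not taken from beer1.php; the interactive part is left in a comment so the file can be run without waiting for input, and it assumes PHP was built with readline support.

```php
<?php
// Hypothetical helper (not from the lecture's beer1.php): turn the raw text
// that readline() returns into a bottle count, or null if it is unusable.
function parse_bottles($input)
{
    // readline() hands back a string (or false at end of input), not an int.
    if ($input === false || !is_numeric(trim($input))) {
        return null;
    }
    $n = (int) trim($input);
    return ($n < 1) ? null : $n;
}

// Build the verses. Note the $ on every variable and the absence of types.
function sing($n)
{
    $song = "";
    for ($i = $n; $i > 0; $i--) {
        $song .= sprintf("%d bottle(s) of beer on the wall,\n", $i);
        $song .= sprintf("%d bottle(s) of beer,\n", $i);
        $song .= "Take one down, pass it around,\n";
        $song .= sprintf("%d bottle(s) of beer on the wall.\n\n", $i - 1);
    }
    return $song;
}

// Interactive use would look roughly like this:
//
//   $n = parse_bottles(readline("How many bottles will there be? "));
//   if ($n === null) {
//       printf("Sorry, that makes no sense.\n");
//       exit(1);   // where C's main would have said: return 1;
//   }
//   printf("%s", sing($n));
//   printf("Wow, that's annoying.\n");
//   exit(0);

// Non-interactive demonstration with 2 bottles.
printf("%s", sing(2));
```

Keeping the parsing and the singing in separate functions is only a convenience for this sketch; the lecture's version does everything at the top level of the file.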
All right, and then down here the annoying song is identical syntactically except for the variable, so down here in line 24 onward notice my for loop is almost the same, but I have $ in front of i and n, and what is also missing from line 26 that we've had in the past when declaring a variable i? There's no type. It is incorrect in PHP to say int. You simply do not need to do that. The interpreter, PHP, is smart enough to realize that if you put a number in $i it will treat it as a number for you. And then down here we plug in $i, $i, $i - 1. All of that is the same, and then down here we do a "Wow, that's annoying" printf and then exit(0). Again, the takeaway here is that even though we're going to spend relatively little time on PHP, certainly versus what we did on C, it's almost the same, and so what we'll do today and next week and beyond is focus really on some of the new ideas. Just to see that one other thing does translate over from C, this was a super simple program we did in Week 1 or 2 that cubed a value. But what was interesting at the time about this program is that it introduced the notion of a custom written function that we ourselves wrote. The syntax in PHP is almost the same. Here's my program up top. Notice again absent is any notion of main. I start writing code, and this is what's going to get executed by the interpreter. I print out x is now 2, presumably. Then I claim cubing... Then I call the cube function and pass in $x and assign the return value to $x. Then I claim that it's cubed, and then I say this, which hopefully will say x is now 8. The syntax for the function in PHP is ever so slightly different. Again missing is the return type, and also missing is what other type? [inaudible student response] Well, okay, that's good. Let's come back to that in a second.
We don't, for instance, have int here because, again, in PHP you simply don't need to and should not do that, but rather there's this new keyword called function. In PHP it's almost a little clearer because when you want a function you literally say function, you give it a name and then a comma-separated list of its arguments, if any. No need to say void or anything like that, and then return is the same, $a * $a * $a. What is also missing? Sammy pointed this out here. At the top of the file completely absent in PHP also is a prototype. This too is by design. Languages and interpreters like PHP are smarter than C compilers like Clang ever were. Recall that Clang, if you didn't tell it that cube exists, if you didn't tell it that printf exists as with a prototype or with a #include, well, it was going to yell at you and not even compile your code. PHP and more modern languages are a lot smarter when it comes to this. They will take it upon themselves to read through all of your code and then yell at you only if they find cube nowhere. It doesn't matter if cube is at the bottom or the top or even in some separate file. PHP and similar languages are now smart enough to look ahead at everything before deeming you as having made a mistake. Where does that leave us? Let's do one last example here in conditions, and if I open up conditions2.php notice too syntax here is almost the same. I'm using readline instead of GetString, but that line is the same as before, "I'd like an integer please." I then have an if condition, an else if, and then an else, but functionally this program is also identical to what we did weeks ago, so if I run this thing, php of conditions2, and I give it a number like 23, I picked a positive number. If I give it -1 I picked a negative number. If I give it 0 I indeed picked 0. So who cares about all of this?
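To make the two points above concrete, here is a minimal sketch, not the lecture's actual source files. It calls cube before the function is defined, with no prototype and no main, and it mirrors the if, else if, else shape of conditions2.php in a hypothetical helper called describe. The fixed value 23 stands in for what readline would return so that the sketch runs without input.

```php
<?php
// No main and no prototype: execution starts at the top of the file, and
// cube() can be called here even though it is defined further down.
$x = 2;
printf("x is now %d\n", $x);
printf("Cubing...\n");
$x = cube($x);
printf("Cubed!\n");
printf("x is now %d\n", $x);

// describe() is a hypothetical helper; 23 stands in for user input.
printf("%s\n", describe(23));

// The function keyword replaces the return type, and arguments carry no types.
function cube($a)
{
    return $a * $a * $a;
}

// Same if / else if / else structure as in C, just with $ on the variable.
function describe($n)
{
    if ($n > 0) {
        return "You picked a positive number!";
    } else if ($n < 0) {
        return "You picked a negative number!";
    } else {
        return "You picked zero!";
    }
}
```

One caveat worth knowing: calling a function before its definition works for functions declared unconditionally at the top level of a file, as here. A function that lives in a separate file still has to be pulled in, for example with require, before it can be called.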
Well, one of the fun sort of exercises here for me at least was to go back and see how quickly I could implement pset 5, the misspellings pset. Recall that there was this file called speller.c, and there was a file called dictionary.c. What I did was I kind of spent a few minutes and I converted the C code to PHP code, and we won't spend much time on speller because just like in pset 5 you didn't really need to spend much time on speller itself because your attention was on dictionary. Suffice it to say that if you read through speller, this file here, it's pretty much equivalent to the C code we gave you for pset 5. I've just added some $ in places. I've changed certain function names if they didn't exist in PHP. There's one additional thing here, preg_match, which is a little fancier way of doing something, but we'll come back to that eventually. But in short, speller is almost identical, and if you look at the very bottom what it eventually spits out is this here, words misspelled, words in dictionary, words in text. All right, so what's interesting now is the following. At the top of my file I am requiring dictionary.php. Just as C has #include PHP has a special function called require that pretty much does the same thing, require a file called dictionary.php. How can I go about implementing pset 5? Let me go ahead and open up a file here. Let me take a little reference here. And let me create a new file and start calling this dictionary.php. Let me put it in another folder so we can do this live. And now I'll zoom in. I'm going to start my PHP file with open bracket php closed bracket. And then in here there were a few functions I needed to implement for pset 5, so let me start implementing some of those, so function check, which had to take a word in as an argument. We'll do that and come back to it in a moment. There was function load, which took in what as an argument? Dictionary, so the file that I actually wanted to load. 
There was function size, which didn't take any arguments, and there was function... what was the other? Unload, which didn't take any arguments either. These are the 4 functions that I would need to now implement in PHP, and what I'm going to do is go ahead and do this. A lot of you used a hash table in pset 5, so let me go ahead and create a hash table in PHP. Done. That gives me a hash table. Well, why? One, the variable is called $table, just to conjure up the idea of a hash table. The square brackets, though, recall, represent what? An array, but in PHP arrays don't have to be numerically indexed. They can also be associative arrays, which means you can have arbitrary keys and values. Much like in pset 5, those of you who did hash table implementations you probably inserted the word and then inserted it into a chain of linked lists, or you stored the value of true somewhere or something to that effect. You somehow remembered the fact that the word was there. For now, that's going to be my hash table, and so now to go about implementing the check function I just need to look inside of that hash table and see if a word is there. What I'm going to do is I'm going to say if, let's say, isset, which is a PHP function that literally just means is the key set, so isset($table[$word]), and if so return true. That's it. That's pset 5 in PHP. Well, in fairness, okay. Else return false, so it's not there. What's really going on here? Well, if table, or hash table here more generally, is an associative array, that means you can index into it with a word like "word," and you have to get back some value. We're kind of getting one step ahead of ourselves. It would be kind of nice if we actually loaded the file first, so load isn't quite as simple, but let me go ahead and whip up a really quick implementation of load. Let me go ahead and say words gets file dictionary.
The file function in PHP opens a file and returns to you an array of all of the words in that file, just hands them to you. That was a big pain too, wasn't it? Now foreach, this is our new construct, foreach ($words as $word). This loop is going to start iterating over the array words and assign to the $word variable each word in the file from the first to the second to the third to the fourth all the way so I don't have to do the annoying [i] notation and the like. And what I'm simply going to do for each of these words is store it in my table by indexing into table and then assigning true because to remember that a word is in my dictionary all I really have to do is kind of flip a bit and say this word in my hash table is there, true. And if it's not there, I don't have to explicitly put false, otherwise I'd have to put false for all possible words in the universe. It suffices for me just to set an index value to true if a word is actually in my hash table. Now, I'm cutting a couple of corners here that I'll wave my hands at for now, but now the load function is done. I load all the words from the file into an array. I iterate over that array, and for each word in the array I plug it into my hash table with 1 line of code. This is fun. You know how we can implement size now? Well, size is always pretty easy, in fairness. Here we can just do return count of table. That's pretty easy too, count the number of things in the table. That's actually kind of not the most efficient. I should probably have a variable called size so we can do it in constant time, but that's pretty easy. Oh, and then unload, if we really want to be anal here we can say that's how you unload something. You just set the variable equal to an empty array, and it gets rid of everything that was there. No need to call free.
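Pulling those four functions together, a dictionary.php along these lines might look like the sketch below. It is not the lecturer's actual file. The global keyword, the flags passed to file(), the strtolower() call in check, and the boolean return values are all assumptions that fill in the corners the lecture says it cut. In particular, file() returns the lines of a file with their trailing newlines unless told otherwise.

```php
<?php
// Sketch of dictionary.php. The global keyword, the file() flags, and the
// strtolower() call are assumptions, not taken from the lecture's file.

// Associative array used as the hash table: word => true.
$table = [];

// Returns true if $word is in the dictionary, else false.
function check($word)
{
    global $table;
    // Assumes the dictionary holds lowercase words, as in pset 5.
    return isset($table[strtolower($word)]);
}

// Loads the dictionary file, one word per line, into the table.
function load($dictionary)
{
    global $table;
    // FILE_IGNORE_NEW_LINES strips the trailing newline from each line;
    // FILE_SKIP_EMPTY_LINES drops blank lines.
    $words = file($dictionary, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($words === false) {
        return false;
    }
    foreach ($words as $word) {
        $table[$word] = true;
    }
    return true;
}

// Returns the number of words loaded.
function size()
{
    global $table;
    return count($table);
}

// Forgets every word; there is nothing to free by hand.
function unload()
{
    global $table;
    $table = [];
    return true;
}
```

A caller such as speller.php would pull this file in with require, call load with the dictionary's path, and then call check on each word of the text. The global declarations are needed because a PHP function cannot otherwise see a variable defined at the top level of the file.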
Again, I've cut some corners, and I apologize for assigning problem set 5 perhaps in C, but if we now go ahead and run this, I'm going to actually run the version that I wrote in advance just so that I didn't make any syntactical mistakes whatsoever. Let me go ahead and run speller. The usage is the same. Here is a dictionary file which just contains the word foo. Here is a text file which just contains foo bar. Let's spell check this, so speller, using this dictionary file on this text file. There's one misspelled word, bar, and voila. Done with pset 5. Let's take a 5-minute break here, and we'll come back to more on PHP. All right, we are back. Let's do... hate me for a while. Let's now actually see whether implementing this thing in PHP was really a net positive. Granted, it took 45 seconds to implement. But let's go ahead now and run things. Let me go ahead and run a C version of speller, and we'll run it on one of the biggest files, which is the King James Bible. And that here is in... let's go into our C folder, speller on King James the 5th. A lot of misspelled words. Okay, so that's the output you probably got even if the times are a little different, if you got everything working correctly, and so time in total to spell check the King James Bible was .38 seconds, so pretty good using that implementation. Now let me go into the PHP version, which we just wrote. Let me run speller on King James. Whoops, ignore that error. I'm in the wrong directory. Speller on King James the 5th. Almost done. Okay, the astute observer will realize that was more than 3 seconds there. That is the true running time. It turns out that it takes time to spit lots of text out because of buffering issues, but long story short, that was 3.15 seconds of machine time, CPU time, versus what was it a moment ago? Like .3. I mean, it's an order of magnitude slower, so where is that ridiculous slowdown coming from?
Well, as has been the case with most any design decision we've made in the class over the past 9 weeks there's almost always this tradeoff. Sometimes between just space, sometimes between space and time, sometimes among space, time and development effort, and indeed here, even though we saved a huge amount of time, maybe potentially 10-20-30 hours of development time implementing the spell checker by whipping it up in just 45 seconds with this language, the price we pay is that it's an order of magnitude slower as a result, and this is generally the case with most any interpreted language, PHP, Python, Ruby, Perl or others, whereby if you're going to run it through an interpreter and have it read your code line by line, top to bottom, left to right, that middleman is going to take some time of its own, and what you were feeling here in the 3 seconds as opposed to .3 seconds is the fact that there is this middleman who has to literally interpret our code line by line, and God forbid if you're inside of a loop with a huge file containing hundreds of thousands of words. That overhead is going to add up and add up and add up and add up. For a tool like this it's probably not the best language to use for implementing a spell checker if immediacy is of interest to your users and to you. But the luxury we have in a moment is if you use a language like PHP or a lot of interpreted languages in the context of the Web, for that matter, you have the benefit that the internet is a lot slower than most computers. You have a GHz CPU in your computer, 2 GHz, maybe even more these days.
But the reality is on the internet there is a high amount of latency whereby for a browser to talk to a server, even though we saw last week that that's pretty fast, half a millisecond or so, that too adds up, and if you're downloading things like an image or a Facebook photo or getting instant messages over Facebook chat, Gchat or the like, all of these round-trip times between the browser and the server start to add up, which makes your particular choice of language in many cases not all that relevant, so you're fine using a slightly slower language like PHP or Python or Ruby but for which there are huge upsides to you and your colleagues and your friends because you can implement things so, so much faster. And moreover, you have much less risk of certain security flaws. There are no pointers in PHP. There are no seg faults that you can easily induce in the same way you could in C. With C you're super close to the hardware. With PHP and similar languages you're sort of higher level, so to speak, with a lot of defenses between you and what's actually going on inside the machine, and it's just a tradeoff. We have gotten to the point of having these more modern, high level languages like PHP because of the lessons learned in languages like C. But if you don't understand what's been going on underneath the hood all this time you certainly can't make the right design decisions, and certainly when it comes to working at a place like Facebook or Google or any of these places that are increasingly playing with large data sets, even if you go back and do premed and are working with some MD on some large data set involving patients and doctors and the like, using the right tools is hugely compelling because otherwise your analysis of some data set might take seconds, or it might literally take hours.
This is just one example, not to frustrate you with how much more effort it was in C but to help you appreciate that when you do implement something in C you really understand, or in theory, really understand how everything is or should be working, and you have almost full control over what's going on underneath the hood, and with these higher level languages you have to relinquish more control to the people who invented them and are subject more to their design decisions than yours. But if we take for granted that the performance isn't quite as important on the Web because of these other issues, just network speeds are a little slower than CPU speeds anyway, so we can sort of afford to use a slightly slower language if the upsides are we can develop things 10 times faster or even more. Let's see how we can start using this. Let me go into a folder among today's examples called frosh.ims, and this was actually personally motivated by the fact that the very first thing I wrote for the Web years ago after taking CS50 and CS51 was a website for the Frosh IMs program, freshman intramural sports, which at the time this was enough years ago that at the time there was no website for the program, even though there was a Web, and instead there was a proctor in Wigglesworth whereby if you wanted to register for volleyball or soccer or whatever you would fill out a piece of paper. You would then walk across the yard. You would then knock on their door and slide in their door or hand to the proctor a piece of paper with your name on it, whether or not you want to be a team captain, what sport you wanted to do, and what dorm you were in. It was sort of an old school way of doing things, and this was a prime opportunity to automate a lot of this process. You just go to the website. You type something in. You get an email confirmation, and boom, you're done. 
This was the very first thing I did, albeit in a language called Perl, but it's relatively easy to do in PHP, and this is sort of representative of the problems you can start solving when you can express yourself programmatically and don't have to rely on things like Google Sites or Excel or tools that are handed to you. You guys now have the ability to do things like this. This is a super ugly version of a form, but let's just use it for the beginning of the conversation whereby this is roughly what the form looked like years ago for us to enable people on the Web to sign up for Frosh IMs. We asked for a name, a checkbox for whether or not they wanted to be captain, male or female, and then what dorm they were in, and then they would submit this form. Let's first look underneath the hood at the HTML that represents this webpage. Let me go into froshims0, and as an aside, for pset 7 I'm taking for granted the directories and the folders that I'm putting things in. We'll walk you through exactly where stuff has to go in the appliance, which chmod commands you have to run, so don't worry about all of the stupid details sinking in from the get-go here. All right, here's froshims0.php. Let me scroll down, and what's curious here, this is a PHP file, but what's inside of it, clearly? It's a whole lot of HTML, and indeed, PHP's origins really were for being a Web-centric language. A moment ago we used it to implement the beer example, the conditions example, the hello example, and that's fine. You can use PHP as a scripting language where a script is really just the nomenclature given to a quick and dirty program or something that you write in a scripting or more generally an interpreted language. PHP is super useful for that because you've seen how relatively quickly we can whip up programs in PHP. But it was really designed to be used for the Web, and designed for the Web in the sense that notice up here at the top of the file I do begin with