[Section 8 - More Comfortable] [Rob Bowden - Harvard University] [This is CS50. - CS50.TV] This week's section notes are going to be pretty short, so I'm just going to keep talking, you guys are going to keep asking questions, and we'll try to fill up as much time as possible. A lot of people think that this pset isn't necessarily difficult, but it's very long. The pset spec itself takes an hour to read. We give you a lot of the SQL you could possibly need to use. We walk you through a lot of it, so it shouldn't be too bad. Has anyone started or finished? It's the last pset. Oh, my God. Usually there's a JavaScript one after this, but the calendar change makes everything 1 week shorter, and we no longer have a JavaScript pset. I don't know how that affects whether JavaScript is going to appear on the exam or Quiz 1. I imagine it will be something like you need to know high-level things about JavaScript, but I doubt we'd just give you straight JavaScript code since you haven't had a pset in it. But that will be stuff for quiz review next week. Section of Questions. A lot of this stuff is somewhat poorly worded, but we'll discuss why. Unlike C, PHP is a "dynamically-typed" language. What does this mean, you ask? Well, say goodbye to all of those char, float, int, and other keywords you need to use when declaring variables and functions in C. In PHP, a variable's type is determined by the value that it's currently holding. So before we type this code into a file called dynamic.php, PHP is dynamically typed. That is true. I disagree with the fact that that means we're saying goodbye to char, float, int, and other keywords. The exact difference between dynamically typed and the alternative, which is statically typed, is that with dynamic typing all of your type checking happens at run time, whereas with static typing it happens at compile time. The word static in general seems to mean compile-time things. 
I guess there are other uses for it, but in C when you declare a static variable, its storage is allocated at compile time. Here, dynamically typed just means that-- In C if you try to add a string and an integer, when you compile it, it's going to complain because it's going to say that you can't add an int and a pointer. It's just not a valid operation. That is another thing that we'll get to in a second. But that sort of checking, the fact that it complains at compile time, is static type checking. There are languages where you don't need to say char, float, int, and all of those things, but the language can tell from the context of the thing what type it's supposed to be, but it's still statically typed. So if you take 51, OCaml, you never need to use any of these types, but it still will at compile time say you can't do this because you're mixing an int and a string. Dynamically typed just means that sometime during run time you're going to get a complaint. If you have also used Java before, in general, almost any C-type language is going to be statically typed, so C, C++, Java, all of those are generally statically typed. In Java when you compile something and you're saying string s equals new something that isn't a string, that's going to complain because those types just don't match up. That's going to complain at compile time. But it also has some dynamic time things like if you try to cast something to a type that's more specific than its current type, there's nothing it can do at compile time to check whether that cast is going to succeed. Java also has some dynamic type checking that as soon as it gets to that line of code when it's actually executing, it's going to do the cast, check if that cast was valid in the first place, and if it wasn't, then it's going to complain that you have an invalid type. Dynamic type checking. Type this into a file called dynamic.php. Dynamic.php. I'll unzip that formatting. 
We have a variable, we set it to the integer 7, then we're going to print it and %s-- Oh, we're printing the type of it, so gettype is going to return the type of the variable. We're just printing the type over and over again. We just run php dynamic.php. We'll see that it changes from integer to string to Boolean as we go through. In C there is no Boolean data type, there is no string data type. There's char * and Boolean just tends to be int or char or something. In PHP these types do exist, and that's one of the big advantages of PHP over C-- that string operations are infinitely easier in PHP than C. They just work. So we come back here. We ran php dynamic.php. This tells the PHP interpreter, called php, to run the PHP code in dynamic.php. If you have any errors in the file, the interpreter will tell you! The interpreter, this is another big difference between PHP and C. In C you have to compile something and then you run that compiled file. In PHP you never compile anything. So the PHP interpreter is basically just reading this line by line. It hits var = 7 then it hits printf then it hits var then it hits printf and so on. There is a bit of compiling it does, and it caches the results so if you run the script later it can reuse some of that, but basically it's a line by line sort of thing. That means we lose a lot of the optimizations that we get in C, where the compiler can generally do a lot of tricks for you. It can take out unused variables, it can do all of these sorts of things, it can do tail recursion. In PHP you're not going to get that advantage because it's just going to start executing line by line by line, and it doesn't really recognize these things as easily since it's not 1 big compilation pass over the thing and then execution; it's just line by line. So that's the interpreter. Back to our dynamic typing: pretty cool, eh? You definitely couldn't do that in C! Now, see if you can figure out the type of each of the following values. 
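For reference, here is a minimal sketch of what dynamic.php might look like. The exact string and Boolean values are placeholders chosen for illustration; only the integer 7 and the use of gettype come from the walkthrough above.

```php
<?php
// Sketch of dynamic.php: one variable holds three different types over its lifetime.
// The string and boolean values below are assumptions, not the original file's values.
$var = 7;
printf("var is a %s\n", gettype($var));   // var is a integer

$var = "cs50 rocks!";
printf("var is a %s\n", gettype($var));   // var is a string

$var = true;
printf("var is a %s\n", gettype($var));   // var is a boolean
```

Running it with php dynamic.php prints the three type names in turn.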
See this for reference. So 3.50. What type do you think that's going to be? Here are the types we have. We have bools, integers, floating points, strings, arrays, objects, and then resources, which is kind of vague. I think there's actually an example here. Then there's NULL. NULL is a special type. Unlike C where NULL is just a pointer with address 0, in PHP, NULL is its own type where the only valid thing of that type is NULL. This is much more useful for error checking. In C where we had this issue where if you return NULL, does that mean you're returning a NULL pointer or using NULL to signify error or all of that confusion we had at one point. Here, returning NULL generally means error. A lot of things also return false for error. But the point is the NULL type, the only thing of the NULL type is NULL. Then callback is like you can define some anonymous functions. You don't have to give the function a name, but you won't have to deal with that here. Looking at the types that they do expect us to know, what do you think the type of 3.50 is? >>[student] Float. Yeah. So then here, what do you think the type of this is? >>[student] Array. Yeah. The first one was float, the second one is an array. Notice that this array is not like a C array where you have index 0 has some value, index 1 has some value. Here the indices are a, b, and c and the values are 1, 2, and 3. In PHP there is no difference between an associative array and just a regular array as you would think of it in C. There is just this, and underneath the hood a regular array is just an associative array where 0 maps to some value the same way a maps to some value. For this reason, PHP can be pretty bad for really fast code/benchmarking things since in C when you're using an array you know that accessing a member is constant time. In PHP accessing a member is who knows how much time? It's probably constant if it hashes correctly. Who knows what it's really doing underneath the hood? 
You really need to look at the implementation to see how it's going to deal with that. So then fopen. I think here let's just PHP manual fopen to look at the return type. We see here you can look up pretty much any function in the PHP manual and this is sort of the man page of PHP. The return type is going to be resource. That's why I looked it up, because we didn't really define resource. The idea of resource, in C you kind of got a FILE* or whatever; in PHP the resource is your FILE*. It's what you're going to be reading from, it's what you're going to be writing to. It's usually external, so it's a resource you can pull things from and throw things to. And finally, what is the type of NULL? >>[student] NULL. Yeah. So the only thing that is NULL is NULL. NULL is NULL. One feature of PHP's type system (for better or for worse) is its ability to juggle types. When you write a line of PHP code that combines values of different types, PHP will try to do the sensible thing. Try out each of the following lines of PHP code. What's printed out? Is it what you expected? Why or why not? This fact about PHP is what makes it what we call weakly typed. Weakly typed and strongly typed, there are different uses for those terms, but most people use weakly typed and strongly typed to mean this sort of thing where ("1" + 2); that works. In C that would not work. You can imagine this not working. A lot of people mix up dynamic typing and weak typing and static typing and strong typing. Python is another example of a language that's dynamically typed. You can throw around types in variables and it's going to determine at run time any error checkings. In Python it's going to execute this and it will see ("1" + 2); and this will fail because it says you can't add a string and an integer. In PHP, which is just as dynamically typed, this will not fail. Weak typing has to do with the fact that it does things with types that don't really make sense necessarily. 
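The type questions above can be checked directly with gettype. This is only a sketch: the php://memory stream is an assumption used so the example does not depend on any file on disk, and note that gettype reports floats under the name "double".

```php
<?php
// gettype() reports the run-time type of each value from the exercise.
// php://memory is an in-memory stream (an assumption, so no real file is needed).
$handle = fopen("php://memory", "r+");

$types = [
    gettype(3.50),                            // "double" (gettype's name for float)
    gettype(["a" => 1, "b" => 2, "c" => 3]),  // "array"
    gettype($handle),                         // "resource"
    gettype(NULL),                            // "NULL"
];
print_r($types);

fclose($handle);
```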
So ("1" + 2); I can imagine that being the string 12, I can imagine it being the string 3, I can imagine it being the integer 3. It's not necessarily well defined, and we're probably going to see here that when we print ("1" + 2); it's probably going to end up being different than printing (1 + "2"). And this tends to be, in my opinion, for the worse. Here we can try these. Another little trick about PHP is you don't need to actually write the file. It does have run this command mode. So php -r, then we can throw in the command here: "print('1' + 2);" and I'll throw a new line. This printed 3. It looks like it prints 3 and it's the integer 3. So now let's try the other way around: "print(1 + '2'); We get 3, and is it also going to be integer 3? I honestly have no idea. It looks like that is consistent. There is never any chance of it being the string 12 or anything like that because PHP, unlike JavaScript and Java too, has a separate operator for concatenation. Concatenation in PHP is dot. So printing (1 . '2'); is going to give us 12. This tends to lead to confusion where people try to do something like str += some other thing that they want to add on to the end of their string, and that's going to fail. You need to do str .= So don't forget concatenation in PHP is a dot. Other things to try: print("CS" + 50); I've told you that there is no hope of this resulting in CS50 since concatenation is not +. What do you think this is going to end up being? I honestly have absolutely no idea. It looks like it's just 50. It sees the string, and I bet if we put 123CS-- It sees the first string, it tries to read an integer from it or a number from it. In this case it sees 123CS. "That doesn't make sense as an integer, so I'm just going to think of 123." So 123 + 50 is going to be 173. And here it starts reading this as an integer. It doesn't see anything, so it just treats it as 0. So 0 + 50 is going to be 50. This I'm assuming is going to do something similar. 
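To make the split between addition and concatenation concrete, here is a small sketch that sticks to numeric strings (the variable names are just for illustration):

```php
<?php
// + always means numeric addition; . always means string concatenation.
$sum    = '1' + 2;    // the numeric string is converted: int(3)
$same   = 1 + '2';    // operand order does not matter: int(3)
$joined = 1 . '2';    // concatenation: string "12"

$str = "CS";
$str .= 50;           // append with .= (not +=): string "CS50"

var_dump($sum, $same, $joined, $str);
```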
I'm thinking 99. Yeah, because it's going to take the first-- So 99. Here (10 / 7), if this were C, what would that return? [student] 1. >>Yeah, it would be 1 because 10 / 7 is dividing 2 integers. An integer divided by an integer is going to return an integer. It can't return 1 point whatever that would be, so it's just going to return 1. Here printing (10 / 7); it's going to actually compute the fractional result. And this means that if you actually want to do integer rounding and stuff like that, you need to do print(floor(10 / 7)); In C it's probably weird that you can rely on integer truncation regularly, but in PHP you can't because it will automatically turn it into a float. And then (7 + true); what do you think that's going to be? I'm guessing 8 if it's going to interpret true as 1. It looks like it's 8. So anything we've done in the past 10 minutes you should absolutely never do. You will see code that does this. It doesn't have to be as straightforward as this. You could have 2 variables, and 1 variable happens to be a string and the other variable happens to be an int, and then you add these variables together. Since PHP is dynamically typed, it won't do any type checking for you, and since it's weakly typed, it will just automatically throw these things together and everything will just work, so it's difficult to even know that this variable must be a string now, so I shouldn't add it to this variable, which is an integer. Best practice is if a variable is a string, keep it as a string forever. If a variable is an int, keep it as an int forever. If you want to deal with integers and strings, you can use parseInt--that's JavaScript. Intval. I do this all the time. PHP and JavaScript I mix up everything. So intval is going to return the integer value of a variable. If we pass in print(intval('123')); you get 123. Intval itself is not going to do the check for us that it's exclusively an integer. 
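A short sketch of the division and conversion points just made. The (int) cast is an addition of mine, shown as one way to get C-style truncation; it is not something the section itself uses.

```php
<?php
$quotient  = 10 / 7;          // a float, roughly 1.4285714; no silent truncation
$floored   = floor(10 / 7);   // float(1): floor() rounds down but still returns a float
$truncated = (int) (10 / 7);  // int(1): explicit cast (an assumption, closest to C's behavior)
$number    = intval('123');   // int(123)
$odd       = 7 + true;        // int(8): true is treated as 1

var_dump($quotient, $floored, $truncated, $number, $odd);
```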
The PHP manual, there are just so many functions available, so here I think what I would use is is_numeric first. I'm guessing that returned false. That's another thing we have to go over is ===. So is_numeric('123df'), you would not think of that as numeric. In C you would have to iterate over all characters and check to see if each character is a digit or whatever. Here is_numeric is going to do that for us, and it's returning false. So when I printed that, it printed nothing, so here I am comparing it to see, did you happen to be false? And so now it's printing 1. Apparently it prints 1 as true instead of printing true as true. I wonder if I do print_r. No, it still does 1. Going back to ===, == still exists, and if you talk to Tommy he'll say == is perfectly fine. I'm going to say that == is terrible and you should never use ==. The difference is that == compares things where it can be true even if they're not the same type, whereas === compares things and first it checks are they the same type? Yes. Okay, now I'm going to see if they actually compare to be equal. You get weird things like 10 equals--let's see what that says. So ('10' == '1e1'); This returns true. Does anyone have any guesses why this returns true? It isn't just about that. Maybe this is a hint. But if I change that to an f--darn it! I keep using double quotes. The reason the double quotes are yelling at me is because I've put this in double quotes. So I could escape the double quotes in here, but single quotes make it easier. So ('10' == '1f1'); does not print true. ('10' == '1e1'); prints true. [student] Is it hex? >>It's not hex, but it's close that it's like-- 1e1, scientific notation. It recognizes 1e1 as 1 * 10^1 or whatever. Those are equal numbers. If we do === then it's going to be false. I actually have no idea if we do == what about (10 == '10abc');? All right. So that's true. So just like when you did (10 + '10abc'); and it would be 20, here (10 == '10abc'); is true. 
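Here is a sketch of those checks using var_dump, which shows Booleans explicitly instead of printing nothing for false and 1 for true. The helper to_int_or_null is hypothetical, just one way to combine is_numeric with intval.

```php
<?php
var_dump(is_numeric('123'));     // bool(true)
var_dump(is_numeric('123df'));   // bool(false)

var_dump('10' == '1e1');         // bool(true): both are numeric strings, compared as numbers
var_dump('10' === '1e1');        // bool(false): same type, but different strings
var_dump('10' == '1f1');         // bool(false): '1f1' is not numeric

// Hypothetical helper: convert only when the whole string is numeric.
function to_int_or_null($s)
{
    return is_numeric($s) ? intval($s) : null;
}

var_dump(to_int_or_null('123'));    // int(123)
var_dump(to_int_or_null('123df'));  // NULL
```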
Even worse are things like (false == NULL); is true or (false == 0); is true, (false == []); There are weird cases of-- That's one of those weird cases. Notice that (false == []); is true. ('0' == false); is true. ('0' == []); is false. So == is in no way transitive. a can be equal to b and a can be equal to c, but b might not be equal to c. That's an abomination to me, and you should always use ===. [student] Can we do !== as well? >>[Bowden] Yes. The equivalent would be != and !==. This is actually brought up in the pset spec where a lot of functions return-- The PHP manual is good about this. It puts in a big red box, "This will return false if there's an error." But returning 0 is a perfectly reasonable thing to return. Think about any function which is expected to return an integer. Let's say this function is supposed to count the number of lines in a file or something. Under normal circumstances, you pass this function a file and it's going to return an integer which represents the number of lines. So 0 is a perfectly reasonable number if the file is just empty. But what if you pass it an invalid file and the function happens to return false if you pass it an invalid file? If you just do == you're not differentiating the case between invalid file and empty file. Always use ===. That's all of those. In PHP, the array type is different from what you're used to in C. Indeed, you may have already noticed this above when you saw that this is of type array. The bracket syntax is new as of PHP 5.4, which is the newest version of PHP. Before this you always had to write array('a' => 1, 'b' => 2). That was the constructor for an array. Now PHP has finally come around to the nice syntax of just square brackets, which is just so much better than array. But considering PHP 5.4 is the newest version, you may encounter places that don't even have PHP 5.3. 
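The line-counting scenario can be sketched like this. count_lines is a hypothetical function written only for this illustration; it is not a PHP built-in.

```php
<?php
// Hypothetical helper: the number of lines in a file, or false if it cannot be read.
function count_lines($path)
{
    if (!is_readable($path)) {
        return false;
    }
    $lines = file($path);
    return $lines === false ? false : count($lines);
}

$empty   = tempnam(sys_get_temp_dir(), 'cs50');  // tempnam() creates an empty file
$missing = $empty . '.does-not-exist';

var_dump(count_lines($empty) == false);      // bool(true): 0 looks like an error with ==
var_dump(count_lines($empty) === false);     // bool(false): 0 lines is a valid answer
var_dump(count_lines($missing) === false);   // bool(true): a real failure

unlink($empty);
```

With ==, the empty file and the invalid file are indistinguishable; with === only the real failure matches false.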
Over the summer we ran into this issue where PHP 5.3 was what we had on the appliance, but the server that we deployed all our grade book and submit and all that stuff to was PHP 5.4. Not knowing this, we developed in 5.3, pushed to 5.4, and now all of a sudden none of our code works because there happened to have been changes between 5.3 and 5.4 which are not backwards compatible, and we have to go and fix all of our things that don't work for PHP 5.4. For this class, since the appliance does have PHP 5.4, it's perfectly fine to use square brackets. But if you're looking up things around the Internet, if you're looking up some kind of array stuff, most likely you're going to see the spell out array constructor syntax since that's been around since PHP was born and square bracket syntax has been around for the past couple months or whenever 5.4 came around. This is how you index. Just like in C how you would index by square brackets like $array[0], $array[1], $array[2], you index the same way if you happen to have your indices being strings. So $array['a'] and $array['b']. $array[b]. Why would this be wrong? It will probably generate a warning but still work. PHP tends to do that. It tends to just, "I'm going to warn you about this, but I'm just going to keep going "and do whatever I can." It will probably translate this to a string, but it is possible that at some point in the past someone said define b to be 'HELLO WORLD'. So now b could be a constant and $array[b] will actually be doing 'HELLO WORLD'. I think at this point, or at least our PHP settings, if you try to index into an array and that key doesn't exist, it will fail. I don't think it will just warn you. Or at least you can set it so that it doesn't just warn you, it just straight up fails. The way you check to see if there actually is such an index is isset. So isset($array['HELLO WORLD']) will return false. isset($array['b']) will return true. You can mix these syntaxes. 
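A small sketch of the two array syntaxes and of isset, using the same keys as the example above:

```php
<?php
$short = ['a' => 1, 'b' => 2, 'c' => 3];        // bracket syntax (PHP 5.4 and later)
$long  = array('a' => 1, 'b' => 2, 'c' => 3);   // spelled-out constructor syntax
var_dump($short === $long);                     // bool(true): same keys, values, and order

var_dump($short['b']);                          // int(2): string keys go in quotes
var_dump(isset($short['b']));                   // bool(true)
var_dump(isset($short['HELLO WORLD']));         // bool(false): no such key
```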
I'm pretty sure what this array would end up being is-- We can test it out. Oh, I need the opening PHP tag. This is mixing the syntax where you specify what the key is and you don't specify what the key is. So 3 right here is a value. You haven't explicitly said what its key is going to be. What do you think its key is going to be? [student] 0. >>I'm guessing 0 only because it's the first one we haven't specified. We can actually do a couple of these cases. So print_r is print recursive. It will print the entire array. It would print subarrays of the array if there were any. So print_r($array); php test.php. It does look like it gave it 0. There's actually something to keep in mind here, but we'll get back to it in a second. But what if I happen to make this index 1? PHP does not differentiate between string indices and integer indices, so at this point I've just defined an index 1 and I can do both $array[1] and $array['1'] and it will be the same index and the same key. So now what do you think the key of 3 is going to be? >>[student] 2. >>[Bowden] I'm guessing 2. Yeah. It's 2. What if we made this key 10 and this key 4? What do you think the index of 3 is going to be? I'm thinking 11. My guess as to what PHP does--and I think I've seen this before-- is it just keeps track of what the highest numeric index it's used so far is. It's never going to assign a string index to 3. It will always be a numeric index. So it keeps track of the highest one it's assigned so far, which happens to be 10, and it's going to give 11 to 3. What I said before, notice the way it is printing this array. It prints key 10, key 4, key 11, key d. Or even let's do-- I guess I didn't put a 0, but it's printing 1, 2, 3, 4. What if I switch here? Or let's actually switch these 2. Now it prints 2, 1, 3, 4. PHP's arrays aren't just like your regular hash table. It's perfectly reasonable to think of them as hash tables 99% of the time. But in your hash tables there's no sense of the order in which things were inserted. 
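The key-assignment experiments above can be reproduced with a sketch like the following. The values 'x', 'y', and 'z' are placeholders; only the keys matter here.

```php
<?php
// Mixed syntax: some keys explicit, the bare value 3 gets the next integer key.
$mixed = [10 => 'x', 4 => 'y', 3, 'd' => 'z'];
print_r(array_keys($mixed));   // keys in insertion order: 10, 4, 11, d

// String keys that look like integers are the same key as the integer.
$array = [1 => 'one'];
$array['1'] = 'uno';           // overwrites key 1 rather than adding a second key
var_dump(count($array));       // int(1)
var_dump($array[1]);           // string(3) "uno"
```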
As soon as you insert something into your hash table, that order is lost, unless you assume there's a linked list, since within a linked list you could judge which was inserted first. But here we inserted 2 first and it knows when it's printing out this array that 2 comes first. It does not print it out in just any order. The technical data structure that it's using is an ordered map, so it maps keys to values and it remembers the order in which those keys were inserted. Basically it leads to some complications where it's annoying to actually-- Let's say you have an array 0, 1, 2, 3, 4, 5 and you want to take out index 2. One way of doing it, let's see what that looks like. 0, 2, 1, 3, 4. Unset happens to unset both variables and array indices. So unset($array[2]); Now what's this going to look like? 2 is just gone, so that's perfectly fine. More annoying is if you want things to actually be like an array. I'll put random numbers. Now notice my indices. I want it to just be like a C array where it goes from 0 to length - 1 and I can iterate over it as such. But as soon as I unset the second index, what was in index 3 doesn't now become index 2. Instead it just removes that index and now you go 0, 1, 3, 4. This is perfectly reasonable. It's just annoying and you have to do things like array_splice. Yeah. [student] What would happen if you had a for loop and you wanted to go over all the elements? When it hit 2, would it yield ever? Iterating over an array. There are 2 ways you can do it. You can use a regular for loop. This is another intricacy of PHP. Most languages, I would say, have some sort of length or len or something indicating the length of an array. In PHP it's count. So for ($i = 0; $i < count($array); $i++) Let's just print($array[$i]); Notice: Undefined offset: 2. It's just going to fail. This is the reason that, for the most part, you never need to iterate over an array like this. 
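A sketch of the hole that unset leaves behind, plus two standard ways to get C-style indices back. array_values is not mentioned in the section; I'm including it as an alternative to array_splice.

```php
<?php
$array = [123, 12, 13, 23, 213];
unset($array[2]);                    // removes the element but keeps the other keys
print_r(array_keys($array));         // 0, 1, 3, 4 (no key 2 any more)

$reindexed = array_values($array);   // copy with fresh keys 0..3

$spliced = [123, 12, 13, 23, 213];
array_splice($spliced, 2, 1);        // remove 1 element at offset 2 and renumber in place

var_dump($reindexed === $spliced);   // bool(true): both are [123, 12, 23, 213]
```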
It might be an exaggeration, but you never need to iterate over an array like this because PHP provides its foreach syntax where foreach($array as $item). Now if we print($item);--we'll discuss it in a second--that works perfectly fine. The way that foreach is working is the first argument is the array that you're iterating over. And the second argument, item, through each pass of the for loop it's going to take on the next thing in the array. So remember the array has an order. The first time through the for loop, item is going to be 123 then it will be 12 then it will be 13 then it will be 23 then it will be 213. Things get really weird when you do something like foreach. Let's see what happens because you should never do this. What if we unset($array[1]); That was probably expected. You're iterating over this array, and each time you're unsetting the first index. So for index 0, the first thing, item takes on value 0, so it's going to be 123. But inside of the for loop we unset index 1, so that means 12 is gone. So print . PHP_EOL. PHP_EOL is just newline, but it's technically more portable since newlines in Windows is different from newlines on Mac and UNIX. On Windows newline is \r\n, whereas everywhere else it tends just to be \n. PHP_EOL is configured so that it uses whatever the newline of your system is. So print that. Let's not print_r($array) at the end. I had no idea that that would be the behavior. Item still takes on the value 12 even though we unset 12 before we ever got to it from the array. Don't take my word on this, but it looks like foreach creates a copy of the array and then item takes on all values of that copy. So even if you modify the array inside the for loop, it won't care. Item will take on the original values. Let's try unsetting it. What if this is $array[1] = "hello"; Even though we put "hello" into the array, item never takes on that value. There's another syntax to foreach loops where you put 2 variables separated by an arrow. 
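The experiment above, written out as a sketch. The $seen array is only there to record what the loop observed.

```php
<?php
$array = [123, 12, 13, 23, 213];
$seen = [];

foreach ($array as $item) {
    unset($array[1]);     // modify the original array while looping over it
    $seen[] = $item;      // record what the loop actually visits
}

echo implode(' ', $seen) . PHP_EOL;   // 123 12 13 23 213
print_r($array);                      // key 1 is gone from the original afterwards
```

The loop still visits 12 even though it was unset on the first pass, which matches the observation that a by-value foreach works from the array as it was when the loop started.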
This first variable is going to be the key of that value, and this second variable is going to be the same exact item. This is uninteresting here, but if we go back to our original case of 'a' => 1, 'b' => 1, here if we just iterate foreach($array as $item), item is going to be 1 every single time. But if we also want to know the key associated with that item then we do as $key => $item. So now we can do print($key . ': ' . $item); Now it's iterating over and printing each key and its associated value. An additional thing we can do in foreach loops is you might see this syntax. Ampersands before variable names tend to be how PHP does references. References are very similar to pointers, but you do not have pointers, so you never deal with memory directly. But you do have references where 1 variable refers to the same thing as another variable. Inside of here let's do $item. Let's go back to 1, 10. Let's do $item++; That still exists in PHP. You can still do ++. php test.php. I have to print it. print_r($array); We print 2, 11. If I had just done foreach($array as $item) then item will be the value 1 the first time through the loop. It will increment 1 to 2 and then we're done. So then it will go through the second pass of the loop and that item is 10. It increments item to 11, and then that's just thrown away. Then we print_r($array); and let's see that this is just 1, 10. So the increment we did was lost. But foreach($array as &$item) now this item is the same item as this right here. It's the same thing. So $item++ is modifying the array's first element. Basically, you can also do $k => $item and you can do $array[$k]++; So another way of doing that, we are free to modify item, but that will not modify our original array. But if we use k, which is our key, then we can just index into our array using that key and increment that. This more directly modifies our original array. You can even do that if for some reason you wanted the ability to modify-- Actually, this is perfectly reasonable. 
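The three loop forms just described, side by side as a sketch. The unset($item) after the reference loop is a common precaution I've added, since the reference otherwise stays bound to the last element.

```php
<?php
$array = ['a' => 1, 'b' => 10];

foreach ($array as $item) {
    $item++;                 // increments a copy; the array is untouched
}
$after_value = $array;       // ['a' => 1, 'b' => 10]

foreach ($array as &$item) {
    $item++;                 // $item is a reference into the array
}
unset($item);                // drop the lingering reference to the last element
$after_reference = $array;   // ['a' => 2, 'b' => 11]

foreach ($array as $k => $item) {
    $array[$k]++;            // index by key to modify the original directly
}
print_r($array);             // ['a' => 3, 'b' => 12]
```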
You didn't want to have to write $array[$k]++, you just wanted to write $item++ but you still wanted to say if($k === 'a') then increment item and then print our array. So now what do we expect print_r to do? What values should be printed? [student] 2 and 10. >>[Bowden] Only if the key was 'a' do we actually increment that. You probably very rarely, if ever, will need to define functions in PHP, but you might see something similar where you define a function like function whatever. Usually you would say ($foo, $bar) and then define it to be whatever. But if I do this, with an ampersand on the parameter, then that means that for whatever calls baz, the first argument passed to baz can be changed. Let's do $foo++; and inside of here let's do baz($item); Now we are calling a function. The argument is taken by reference, which means that if we modify it we're modifying the thing that was passed in. And printing this we expect--unless I messed up syntax--we got 2, 11, so it was actually incremented. Notice we need references in 2 places. What if I did this? What does this mean? [student] It will change. >>Yeah. Item is just a copy of the value in the array. So item will change to 2, but $array['a'] will still be 1. Or what if I do this? Now item is sent as a copy to baz. So the copy of the argument will be incremented to 2, but item itself was never incremented to 2. And item is the same thing as array bracket whatever, so that array was never incremented. So both those places need it. PHP is usually pretty smart about this. You might think I want to pass by reference-- This was actually a question on one of the psets. It was a questions.txt thing where it said, Why might you want to pass this struct by reference? What was the answer to that? [student] So you don't have to copy something big. >>Yeah. 
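Here is a sketch of why the ampersand is needed in both places. baz follows the section's example; baz_by_value is a hypothetical twin added for contrast.

```php
<?php
function baz(&$foo)          // parameter taken by reference
{
    $foo++;
}

function baz_by_value($foo)  // hypothetical twin: parameter is a copy
{
    $foo++;
}

$both = ['a' => 1, 'b' => 10];
foreach ($both as &$item) {  // reference in the loop AND in the function
    baz($item);
}
unset($item);
print_r($both);              // ['a' => 2, 'b' => 11]

$loop_only = ['a' => 1, 'b' => 10];
foreach ($loop_only as &$item) {
    baz_by_value($item);     // the function increments its own copy
}
unset($item);
print_r($loop_only);         // unchanged: ['a' => 1, 'b' => 10]

$function_only = ['a' => 1, 'b' => 10];
foreach ($function_only as $item) {
    baz($item);              // increments the loop's copy, not the array
}
print_r($function_only);     // unchanged: ['a' => 1, 'b' => 10]
```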
A struct can be arbitrarily large, and when you pass the struct in as an argument it needs to copy that entire struct to pass it to the function, whereas if you just pass the struct by reference then it just needs to copy a 4-byte address as the argument to the function. PHP is a little bit smarter than that. If I have some function and I pass to it an array of 1,000 things, does that mean it's going to have to copy all 1,000 of those things to pass it into the function? It doesn't have to do that immediately. If inside of this function it never actually modifies foo, so if($foo === 'hello') return true; Notice we never actually modified the argument inside of this function, which means that whatever was passed in as foo never needs to be copied because it's not modifying it. So the way PHP works is the arguments are always passed by reference until you actually try to modify them. Now if I say $foo++; it will now make a copy of the original foo and modify the copy. This saves some time. If you're never touching this massive array, you never actually modify it, it doesn't need to make the copy, whereas if we just put this ampersand that means it doesn't even copy it even if you do modify it. This behavior is called copy-on-write. You'll see it in other places, especially if you take an operating systems course. Copy-on-write is a pretty usual pattern where you don't need to make a copy of something unless it's actually changing. Yeah. [student] What if you had the increment inside the test, so only 1 element out of 1,000 would need to be changed? I'm not sure. I think it would copy the entire thing, but it's possible it's smart enough that-- Actually, what I'm thinking is imagine we had an array that looks like this: $array2 = [ Then index 'a' is an array of [1 2 3 4], and index 'b' is an array of whatever. I need commas between all of those. Imagine there are commas. Then 'c' is the value 3. Okay. 
Now let's say we do baz($array2); where baz does not take this by reference. So $foo['c']++; This is an example where we are passing array2 as an argument and then it is modifying a specific index of the array by incrementing it. I honestly have no idea what PHP is going to do. It can easily make a copy of the entire thing, but if it's smart, it will make a copy of these keys where this will have its distinct value but this can still point to the same array 1,2,3,4 and this can still point to the same array. I'll iPad it. We pass in this array where this guy points to 3, this guy points to [1,2,3,4], this guy points to [34,...] Now that we're passing it in to baz, we are modifying this. If PHP is smart, it can just do-- We still had to copy some memory, but if there were these huge nested subarrays we didn't need to copy those. I don't know if that's what it does, but I can imagine it doing that. This is also a pretty big advantage of C over PHP. PHP makes life so much easier for a lot of things, but you kind of have absolutely no idea how well it will perform because I have no idea underneath the hood when it's making these copies of things, oh, is that going to be a constant time copy, is it just going to change 1 pointer, is it going to be a ridiculously difficult linear copy? What if it can't find space? Does it then need to run garbage collection to get some more space? And garbage collection can take arbitrarily long. In C you don't have to worry about these things. Every single line you write you can pretty much reason about how it's going to perform. Let's look back at these. How nice is it that you don't have to deal with hash functions, linked lists, or anything like that? Since working with hash tables is so easy now, here's a fun puzzle to work on. Open up a file called unique.php and in it write a PHP program (also known as a "script"). We tend to call them scripts if they're short things that you run at the command line. 
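What we can check from inside a script is the visible behavior of passing by value, sketched below. This does not reveal whether PHP shares the nested subarrays internally; that part stays a guess. The contents of 'b' are a placeholder, and baz_by_value is a hypothetical name.

```php
<?php
// Hypothetical by-value function: modifies one index of its (copied-on-write) argument.
function baz_by_value($foo)
{
    $foo['c']++;
    return $foo;
}

// The contents of 'b' are a placeholder for "an array of whatever".
$array2 = ['a' => [1, 2, 3, 4], 'b' => [34, 35], 'c' => 3];
$changed = baz_by_value($array2);

var_dump($array2['c']);                     // int(3): the caller's array is untouched
var_dump($changed['c']);                    // int(4): only the function's version changed
var_dump($changed['a'] === $array2['a']);   // bool(true): subarrays hold equal values
```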
Basically, any language that you don't compile but you're going to run the executable at the command line, you can call that executable script. I could just as well write a C program that does this, but I don't call it a script since I first compile it and then run the binary. But this PHP program we're going to call a script. Or if we wrote it in Python or Perl or Node.js or any of those things, we'd call them all scripts because you run them at the command line but we don't compile them. We could do this pretty quickly. We aren't going to use argv. Let's just blow through this. Call it unique, write a program. You can assume that the input will contain one word per line. Actually, argv will be pretty trivial to use. unique.php. First thing first, we want to check if we have been passed 1 command-line argument. Just as you would expect argc and argv in C, we still have those in PHP. So if($argc !== 2) then I won't deal with printing a message or anything. I'll just exit, error code of 1. I could also return 1. Rarely in PHP are you at this state where we're at-- Usually you're in a function called by a function called by a function called by a function. And if something goes wrong and you just want to exit everything entirely, exit just ends the program. This also exists in C. If you're in a function in a function in a function in a function and you want to just kill the program, you can call exit and it will just exit. But in PHP it's even more rare that we are at this top level. Usually we're inside some sort of function, so we call exit so that we don't have to return up 1 thing which then realizes there's an error so that returns up if that recognizes there was an error. We don't want to deal with that, so exit(1); return(1); in this case would be equivalent. Then what we want to open we want to fopen. The arguments are going to look pretty similar. We want to fopen($argv[1], and we want to open it for reading. 
That returns a resource which we're going to call f. This looks pretty similar to how C does it except we don't have to say FILE*. Instead we just say $f. Okay. Actually, I think this even gives us a hint as to a PHP function called file. PHP File. What this is going to do is read an entire file into an array. You don't even need to fopen it. It's going to do that for you. So $lines = file($argv[1]); Now all of the lines of the file are in lines. Now we want to sort the lines. How can we sort the lines? We sort the lines. And now we can print them or whatever. Probably the easiest way is foreach($lines as $line) echo $line; [student] Wouldn't we need to pass lines by reference or something into sort? This is where sort would be defined as function sort(&$array). When you call the function you don't pass it by reference. It's the function that defines it as taking it by reference. This is actually exactly what went wrong when we pushed everything to our servers when we went from 5.3 to 5.4. Up until 5.4, this was perfectly reasonable. A function doesn't expect to take it by reference, but you can pass it by reference so if the function does happen to modify it, it's still modified. As of 5.4, you're not supposed to do this. So now the only way you pass by reference is if the function explicitly does it. If you don't want it to modify it, then you need to do $copy = $lines and pass copy. So now lines will be preserved and copy will be changed. php unique.php. I might have messed something up. Unexpected 'sort'. There's going to be something that does this for us. It's not even there. Notice when you read the manual that the first argument is expected to be an array and it's taken by reference. Why is this complaining to me? Because I have this function sort still in here that I don't want. Okay, php unique.php. I didn't pass it an argument because I don't have a file. It's php unique.php test.php. Here is test.php all printed out in a nice sorted order.
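Pulling the pieces above together, here is a rough sketch of the file-then-sort approach. It assumes the puzzle wants each distinct line printed once; the array_unique call reflects that assumption, since the walkthrough above only sorts. The helper name sorted_unique_lines and the temporary sample file are hypothetical, there only so the sketch runs without a command-line argument.

```php
<?php
// Hypothetical helper: returns the distinct lines of a file in sorted order,
// or false if the file cannot be read.
function sorted_unique_lines($path)
{
    // file() reads the whole file into an array, one element per line.
    // The flags strip trailing newlines and skip empty lines.
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return false;
    }

    // Assumption: "unique" means duplicates should be dropped.
    $lines = array_unique($lines);

    // sort() is declared as taking its array by reference, so it sorts
    // $lines in place and reindexes it from 0.
    sort($lines);

    return $lines;
}

// Sample input written to a temporary file instead of reading $argv[1].
$path = tempnam(sys_get_temp_dir(), 'uniq');
file_put_contents($path, "pear\napple\npear\nbanana\n");

$result = sorted_unique_lines($path);
unlink($path);

foreach ($result as $line) {
    echo $line, "\n";
}
```

In a real unique.php the path would come from $argv[1] after the $argc check, as in the walkthrough.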
Notice that sorted order is kind of weird for a code file because all of our blank lines are going to come first, then come all of our 1 level indentations, then come all of our no indentations. Yeah. >>[student] So for the source code it wasn't passed by reference? Is that generally passed by value? [Bowden] When you call a function, the call never determines whether it was passed by reference. It's the function definition which determines whether it was passed by reference. And looking at the function definition of sort or just looking at this, it takes the argument by reference. So regardless of whether you want it to take it by reference, it does take it by reference. It modifies the array in place. This is just not allowed. You're not allowed to do this. >>[student] Oh, okay. [Bowden] This, sort is going to take lines by reference and modify it. And again, if you didn't want it to do that, you could make a copy of lines. Even in this case, copy isn't actually a copy of lines. It just points to the same thing until it first gets modified, where it's first going to get modified in the sort function, where, because it's copy-on-write, now a copy of copy is going to be made. You can also do this. That's the other place you can see ampersand. You see it in foreach loops, you see it in function declarations, and you see it when just assigning variables. Now we have accomplished nothing by doing this because copy and lines are literally the same thing. You can use lines and copy interchangeably. You can do unset($copy); and that doesn't unset lines, you just lose your reference to the same thing. So as of this point, now lines is the only way you can access lines. Questions? Yeah. [student] Completely off topic, but you don't have to close PHP with the-- >>You do not. Okay. [Bowden] I would go as far as to say it's bad practice to close them. That's probably an exaggeration, especially in a script, but let's see what happens if I do this. That did nothing.
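To make the copy-versus-reference assignment from a moment ago concrete, here is a small sketch. The variable names other than lines and copy are hypothetical, and the sample array is made up.

```php
<?php
$lines = ['b', 'a', 'c'];

// Plain assignment: $copy behaves as an independent copy, so sorting it
// leaves $lines alone.
$copy = $lines;
sort($copy);
$linesAfterSortingCopy = $lines;    // still ['b', 'a', 'c']

// Reference assignment: $alias and $lines are two names for one array,
// so sorting through either name changes both.
$alias = &$lines;
sort($alias);
$linesAfterSortingAlias = $lines;   // now ['a', 'b', 'c']

// unset() removes only the name $alias; the array is still reachable
// through $lines.
unset($alias);
$aliasExists = isset($alias);

var_dump($linesAfterSortingCopy, $linesAfterSortingAlias, $aliasExists);
```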
What if I wanted-- [sighs] I need to pass an argument. Shoot. I called it wrong. So php unique.php with an argument. Now I don't even need this. I'll pass it a valid argument. This printed whatever it's printing. I'm printing copy and copy doesn't exist. So lines. It printed everything, and then notice all this junk down here, because in PHP anything that is outside of PHP tags is just going to be printed literally. That's why with HTML it's so nice that I can do div blah, blah, blah class or whatever, blah, blah, blah and then do some PHP code and then do end div. And now printing this I get my nice div up top, everything that PHP printed, div at bottom. It's disastrous when something like this happens, which is pretty common: just a stray newline at the bottom of a file. You wouldn't think it would be that big of a deal until you consider the fact that with browsers-- How redirects work or basically any headers work, when you make your connection to a website and it sends back all these headers and things like response 200 or response redirect or whatever, headers are only valid until the first byte of data is sent. You can redirect thousands of times, but as soon as the first byte of data is sent you're not supposed to redirect again. If you have a stray newline at the bottom of a file and let's say that you use this function and then you want to-- Let's say it's another file that's index.php and you require_once something-- I can't think of a good example of it. The issue happens when this line at the bottom gets echoed. You don't want anything to have been echoed yet. Even though you didn't intend on anything getting echoed, something did get echoed and so now you're not supposed to send any more headers and you're going to get complaints. You just don't need those closing tags. If you plan on doing something with HTML-- and it's perfectly reasonable to do down here div whatever and then at this point you can or you cannot include them. It doesn't really matter.
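A rough sketch of the stray-newline problem, assuming a hypothetical helper that writes some PHP source to a temporary file, includes it, and captures whatever the include printed. The helper name and the sample sources are made up for illustration.

```php
<?php
// Hypothetical helper: include a temporary file containing $source and
// return everything the include sent to output.
function output_of_included_source($source)
{
    $path = tempnam(sys_get_temp_dir(), 'inc');
    file_put_contents($path, $source);

    ob_start();               // capture output instead of printing it
    include $path;
    $output = ob_get_clean();

    unlink($path);
    return $output;
}

// A file that closes its PHP tag and then has stray blank lines: the text
// after the closing tag is outside PHP, so it gets printed.
$withClosingTag = output_of_included_source("<?php\n\$helper = 1;\n?>\n\n\n");

// The same file without the closing tag: the blank lines are just
// whitespace inside PHP code, so nothing is printed.
$withoutClosingTag = output_of_included_source("<?php\n\$helper = 1;\n\n\n");

var_dump(strlen($withClosingTag) > 0);
var_dump($withoutClosingTag === '');
```

On a web server, that leaked output from an included file is what makes a later header or redirect call complain.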
But in PHP scripts it's rare to close it. When everything is PHP, absolutely everything, you don't really need to close it/you shouldn't close it. Dealing with strings is much nicer than in C. In PHP you can specify a string with single or double quotes. With single quotes you can't use "escape" sequences. Constantly escape, blah, blah, blah. So printf is very rare in PHP. I guess I would use printf if I wanted to do a sort of thing--in pset 5 you used sprintf or whatever. But you want to do 001.jpg and 002.jpg. So for that sort of thing where I actually want to format the text I would use printf. But otherwise I would just use string concatenation. I never really use printf. We're just differentiating the details between single quotes and double quotes. The biggest difference is that with single quotes, it will be printed literally. There is no char data type in PHP, unlike C, so this is equivalent to this. They're both strings. And the nice thing about single quote strings is I could say 'hello world!' blah, blah, blah, $$wooo. What happens when I print this is it will print it literally. Let's get rid of all of our stuff. So echo $str1; It literally printed all of those things: dollar signs, backslash n, which you would think would be newlines-- all of those things it prints literally. The only thing you need to escape are single quotes because otherwise it would think it's closing the single quotes. Double quotes, completely different. We already see the syntax highlighting is cluing us in to what's about to go terribly wrong. php unique.php. Undefined variable: wooo because this is interpreted as a variable called wooo. Double quotes let you insert variables into-- Let's say $name = "Rob"; So echo "Hi, my name is $name!!"; It recognizes this as a variable. When I run that--and I will insert a newline--Hi, my name is Rob!! and hello world!! This is because I never removed the printing of wooo above. There is 1 further step you can do.
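A short sketch of the single-quote versus double-quote behavior described so far, plus the sprintf case for zero-padded filenames. The variable names other than $name are hypothetical.

```php
<?php
$name = 'Rob';

// Single quotes: everything is literal, including $name and the two
// characters backslash and n.
$single = 'Hi, my name is $name!!\n';

// Double quotes: $name is replaced by its value and \n is a real newline.
$double = "Hi, my name is $name!!\n";

// Concatenation with the . operator builds the same string.
$concat = 'Hi, my name is ' . $name . "!!\n";

// Inside single quotes, the single quote itself still needs escaping.
$escaped = 'It\'s literal';

// sprintf is handy when the text really needs formatting, such as
// zero-padding a number to three digits.
$filename = sprintf('%03d.jpg', 1);

echo $single, "\n";
echo $double;
echo $concat;
echo $escaped, "\n";
echo $filename, "\n";
```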
$array = [1, 2, 3]; What if I want to print the first index of array? You do $array[0]. The syntax highlighting is a clue. What is this going to do? php unique.php. Hi, my name is 1!! which is not what I wanted. Syntax highlighting lied to me. Let's try 'a' => 1, 'b' => 2. That's how I would have to write it. Unexpected single quote (T_ENCAPSED blah, blah, blah, blah, blah). The idea is that it's not recognizing this as part of the array. It's not recognizing this as array indexed by letter a. You want to do that surrounded by curly braces, and now whatever is in this curly brace will be interpolated, which is the word we use for magically inserting these variables into the right places. Now doing this, php unique.php, and Hi, my name is 1!! as expected or Hi, my name is Rob!! One thing that's kind of nice about single quotes is that-- There's some cost to interpolating. If you use double quotes, the interpreter has to go over this string, making sure that, "Oh, here's a variable. Now I need to go get that variable and insert it here." Even if you don't use any variables, so nothing inside of these double quotes needs to be interpolated, it will still be slower because it needs to go over the double quotes looking for things that need to be interpolated. So single quotes can be a bit faster if nothing needs to be interpolated, and I tend to even use single quotes for 'Hi, my name is ' . $array['a'] anyway. That's going to be equivalent to what we had before. But it's a matter of preference. If you're using PHP, you probably don't care about the speed difference. There isn't enough of a difference to reason about to begin with. Any final questions? We actually didn't even get through all of it, but this stuff was boring. The last thing that's kind of nice in PHP is when you're dealing with HTML, you'll use it a bit, so there's a nice shortcut syntax for printing a variable. Without putting PHP here, this is called short tags. Officially as of PHP 5.4, this is deprecated.
You are recommended to put php. This is still supported, so short tags with the