>> This is CS50. This is the start of Week 8 and oh my god, have we got a lot of fun in store for the next few weeks. So by the end of today, you will be able to make this ridiculously annoying Internet meme called Hamster Dance, whose soundtrack is available on the course's website under Lectures. But first, a few announcements before we get there. So one of your classmates is in the process of recruiting folks for the Harvard Digital Media Group. This is a group of students who get together, eat pizza, and talk among themselves and with professionals in the industry about digital media. If you would like to partake, go to cs50.net; there's a sign-up link on the course's home page at the moment. Also coming up is Hack Harvard, sponsored by the I3 Competition and the Undergraduate Council. The UC has put this together with the goal of giving you guys an off-ramp from a course like CS50, so that if you tackle your final project over the next couple of months and you feel like, wow, I'd really like to take this to the next level, I'd really like this to be the next Harvard FML or I Saw You Harvard or Facebook, beyond Harvard's campus, this is an opportunity to work during J-term with some of your friends or on your own. You'd have some support from the Undergraduate Council and the I3 Competition, so that if you need a bit of cash, if you need a bit of space, if you need a bit of advice, you have resources available to you well beyond the end of the semester. This is distinct from the CS50 Hackathon, to be clear, which is coming up in December and really is just going to be an opportunity for us as a course to get together from 8pm to 6am and work on final projects, if you so choose to partake. So that's what's on the horizon there. The final project, then. So this is the specification.
There really should be no surprises, since we talked about this in the first week and all the same information is in the syllabus, but do use this as your authoritative resource over the next couple of months as we plan for the final project and in turn the Hackathon and the CS50 Fair. You might be wondering, we haven't even done web programming, we haven't even finished the course, how am I supposed to figure out a project to work on? Well again, the goal of having this in your hands today is just to get you thinking about it. It gets you to start looking around campus, looking around you in life for possible ideas, and it's also a chance to strike up a partnership with one or two friends, since the spec does allow you to work in a small team if you would like. So do read that over the course of this week and take note of the milestones, including a pre-proposal, which is simply an e-mail to your teaching fellow in a couple of weeks that's just meant to get the juices flowing and give you an opportunity for a bit of preliminary feedback on the ideas you might be thinking about. Also, as we enter these last several weeks of the course, the goal is to make sure that you guys have some real-world skills, so that you don't exit the course and then wonder, how in the world am I ever going to write another computer program if I don't have a cloud.cs50.net account? Well, rest assured that most everything we've been using in the course is very much industry standard. You can even download most of the software we've been using onto your own laptops, and as a step toward that, you'll see that problem set 6, the spell checker, tries to give you some exposure to using perhaps a more user-friendly tool. Some of you might be familiar with TextWrangler. It's a free text editor for the Mac. It's similar in spirit to Notepad and those sorts of things. You can download it per problem set 6's directions.
But what's really nice, if you never noticed, is that under its File menu it actually supports opening files from an SFTP server, and recall, that's a protocol you used for problem set 5 to move bitmap and JPEG files back and forth. So if I want to, for instance, start working on problem set 6 in my account, rather than have the file on my own laptop, I can go to Open from SFTP Server, I can type in cloud.cs50.net, I have to check the box for SFTP, lest it be insecure, which won't work, and then my username and password. I then go ahead and click Connect, and what you see is a little graphical rendition of your home directory, and I've got a bunch of junk in there, but right now I want pset6, maybe dictionary.c, and now finally, all these weeks later, I have something that's a little more familiar, a little more versatile than something like Nano. So you can now start working client side, while still, in a separate window, SSH-ing in to run the familiar commands like gdb and gcc, and you can continue compiling your code in that familiar environment. But now you can do things like get C syntax highlighting client side, you can drag and cut and paste as usual, and realize that though I'm using a Mac, there's a free program called Notepad++, which allows you to do something very similar in the PC world. So you don't have to do that, but do try your hand at it. So one comment then about this. So this isn't where I hang out, at least, but this did come to our attention. An excerpt from Harvard FML: So I was in CS50 office hours for three hours last night and never got helped once, FML. So, I mean, frankly, this is discouraging to see. Because the goal of having a team of 80-some-odd teaching fellows in the course is certainly not to have you guys going unhelped in office hours. So realize this is a function of a couple of things.
One, as many people as we are, we certainly have our own scheduling constraints, and we do strive to spread office hours out over the course of the week. But realize too that if you are getting to office hours on Wednesdays and Thursdays in particular and you're finding that it's this crazy assembly-line environment, that's kind of the nature of the beast, and we do take care to have office hours earlier in the week. Fewer of them, on Mondays and Tuesdays and even Sundays. And do realize, if you're finding that you really need a more nurturing environment, frankly, than this crazy Wednesday and Thursday night environment allows for, do figure out a way to start the pset on Friday or on Sunday, so that you can take advantage of the greater attention that, just by nature of the number of students in the class, we can offer earlier in the week. But with that said, we recognize that it's one thing to say, please start earlier, and it doesn't actually happen often. So we will also simultaneously adjust our hours to try to tail-load things on Wednesdays and Thursdays to better handle the load. But do try to meet us halfway, since we are happy to work with you individually. So, how do we get from nothing, an empty text file, to something like this or, hopefully, frankly, something a little more compelling? So there's this thing called the Internet, and today we start talking about it, the Web, but also other tools and languages. So we spent much of the semester focused on C, not because knowing C specifically is a lifelong skill, but rather because that particular language, we think, does provide a very solid foundation for a lot of more commonplace languages these days. I feel like I should probably move this off the screen, lest it become too mesmerizing. So we'll go back to this, which is very stimulating.
So the goal really of the next couple of weeks is to help you realize that once you know one language like C, you can really bootstrap yourself into other languages, other technologies. Again, as I said last week, I formally learned C and C++ and a language called LISP in CS50 and 51, and then that was it. Ever since then I've picked up things as I go. I ask friends lots of questions, you Google around, you read a book; there are so many ways to learn this stuff. So one of the takeaways for this week and next is going to be that you can absolutely pick up new languages quickly once you understand the underlying fundamentals. So even though this week for the pset you're implementing this thing called a hash table, next week you're going to just get to use a hash table, with one line of code, in this language called PHP and another called JavaScript. We'll just give you a hash table whenever you want it. You won't need to implement that yourself from scratch. You can build on the shoulders of others and on years' worth of other technology. So how do we get there? Well, we need the right tool for the task. So today we introduce web programming, and for that we have this language, PHP, and this other language called HTML, the latter of which is not a programming language, because it can't tell the computer to do something, but it's a markup language, in that it can tell the computer how to display something. It's an aesthetic language of sorts. But we need some place to put the code we're going to start writing. Up until now you've been using cloud.cs50.net, and even though you've been SSH-ing to that server, it turns out that it's available via another protocol called HTTP, and this is something you've probably been desensitized to over the years. But in almost every URL that you visit in a browser there is, even if you didn't type it, http:// or https://, and that's Hypertext Transfer Protocol, and this is just a language. It's a protocol.
A set of conventions that web browsers and web servers use to speak to one another in order to let you request a web page and let you get back that web page. And even though we focus today and on Wednesday on the languages involved in this process, realize that there's some really juicy material in there, and if you're liking computer science, realize you can go off in really neat directions in hardware and networking, and we'll really just scratch the surface of some of that this week. But we need an Internet. We need a bunch of computers somehow connected together that will get data from point A to point B, and though this video you're about to see takes a few liberties with, let's say, accuracy, it does nonetheless paint a reasonable picture of what's actually going on on the Internet when data travels from point A to point B. So I give you, Warriors of the Net. [ video ] [ music ] >> He came with a message. [ music ] With a protocol. [Inaudible]. [ music ] He came to a world of cruel firewalls, uncaring routers and dangers far worse than death. He's fast, he's strong, he's TCP/IP and he's got your address. Warriors of the Net. >> So that's just the teaser trailer for what's actually a longer video that discusses how the Internet works. But for today, we're just going to take it at face value. That that is how the Internet works. All right? [ laughter ] So there is this protocol, this language called TCP/IP, and this is again just a set of standards that computers speak. Client computers like your laptops and server computers like the cloud speak it in order to move data from point A to point B. And what's neat about the Internet and networks in general is that there are years' worth of interesting layering going on. We won't spend time on this in this class, but in CS143: Networks, we do spend more time on this. There's this layer called Ethernet.
You're all probably familiar with the idea of an Ethernet cable, even if you don't use them anymore, and that's pretty low-level technology. Because it pretty much boils down to electricity flowing through a wire, actually transmitting what represent 0's and 1's, and just to flash back to Week 0 and 1, well, how do you actually send bits on a wire, bits on a network? In its simplest form, you can imagine turning electricity on, so electrons are flowing to represent a 1, and then turning it off, and now you've got a 0. It's more sophisticated than that, but at least you can encode 0's and 1's using an electrical signal. So we'll take that for granted. Then on top of Ethernet you have something called IP, Internet Protocol, and this is a protocol, a set of standards, that says every machine on the Internet needs a unique address. A numeric address, generally known as an IP address. And this is just a number that's something of the form number.number.number.number. If you've maintained your home router, it probably looked a little something like this, because there are constraints on what numbers can be there. As an aside, each of those hash symbols represents an 8-bit value. So that means each hash symbol is 0 through 255, with a few more constraints. The implication is that if you have 8 bits plus 8 bits plus 8 bits plus 8 bits, that's 32, which gives us how many possible IP addresses? [ laughter ] It gives us 32 bits' worth of address space, which means it gives us roughly 4 billion. Right? 2 to the 32 is roughly 4 billion possible IP addresses, and scary thought, the world is running out. So there are a lot of doomsayers saying how things will start breaking down before long because we are running out of IP addresses, but thankfully there's a version 6 of IP, which uses 128-bit addresses, which will hopefully save the day when people start using it.
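To make the arithmetic concrete, here is a minimal Python sketch of how four 8-bit numbers pack into one 32-bit address, and how many addresses 32 bits and 128 bits allow. The function name `dotted_quad_to_int` is made up for illustration; the sample address is the Harvard one that shows up in the traceroute output in this lecture.

```python
def dotted_quad_to_int(address: str) -> int:
    """Pack a dotted-quad IPv4 address into a single 32-bit integer."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated numbers")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("each number must fit in 8 bits (0 through 255)")
        # Shift what we have so far left by 8 bits and append the next octet.
        value = (value << 8) | octet
    return value


# Four 8-bit fields give 32 bits; IPv6 uses 128 bits instead.
TOTAL_IPV4_ADDRESSES = 2 ** 32   # 4,294,967,296, i.e. "roughly 4 billion"
TOTAL_IPV6_ADDRESSES = 2 ** 128

if __name__ == "__main__":
    print(TOTAL_IPV4_ADDRESSES)
    print(dotted_quad_to_int("140.247.89.130"))
```

The "few more constraints" mentioned in lecture (reserved and private ranges, for example) are not modeled here, so the usable count is somewhat smaller than 2 to the 32.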
But for now, all we care about is that there's Ethernet, which gets data physically from A to B, there's IP, which assigns an address to every computer on the Internet, much like the US Postal Service puts a unique address on your mailbox so mail can get from point A to point B, and then there are other mechanisms on top. One of them is called TCP. Hence the TCP/IP protocol. TCP is just a protocol, a set of conventions that says, not only do I want data to flow from point A to point B, I want to make sure that data gets from point A to point B. So TCP's purpose in life is to keep track of all the data that flows across the Internet between points A and B, and if any packets get lost or corrupted or dropped by a firewall, if they just disappear somewhere into the ether, so to speak, TCP's purpose in life is to tell my computer, A, to resend that same data to point B. And so this is why you either get an e-mail or you don't. You either download a file or you don't, or at least part of a file. You don't necessarily get bits and pieces of an e-mail. It's mostly an all-or-nothing thing, and that's largely the result of TCP. But there are other protocols that you can actually use. So we have Ethernet, we have IP, we have TCP, and then finally we have HTTP. And so this layering is deliberate. The people who invented HTTP just wanted to assume that all this stuff on the bottom actually worked and got data from point A to point B, so we'll be focusing this week and next on using this highest level, HTTP, and in turn this language called HTML. But just so you have a sense of what's really going on underneath the hood, I've gone ahead here and I've SSH-ed to a server. Let me go ahead and log back in, and it turns out on a lot of Linux and Unix systems, there's a command called traceroute, which will trace the route between points A and B.
What's actually pretty neat about this is that it helps us see, for diagnostic or pedagogical purposes, exactly what's going on between points A and B. So I obviously do not have a physical connection to, say, Stanford University. It's all the way over on the west coast. But there is this Internet, and the Internet is composed of all of these things called routers, which are fancy computers that sit in data centers, whose purpose in life is to take data in here, look at that data, look specifically at the IP address and realize, oh, this IP address lives on the west coast. So a router then routes that packet that way, to the west coast, and it might then reach another router, which makes the same decision. So routers, as kind of depicted by that little video there, move data in different directions. Left, right, up, down, depending on where they're destined geographically. So you can actually see these things called routers, not when you send an e-mail or pull up a web page, but if you do a bit of poking around. So I'm just at a command prompt here on a Linux system that's owned by the Harvard Computer Society, one of the student groups on campus. I'm using them because most of FAS's systems don't let you do this, but theirs do, which is a useful trick for this class. So I'm going to go ahead and run traceroute www.stanford.edu and Enter, and I see a whole bunch of output. Let me actually shrink this and rerun it. A whole bunch of output, whereby every row in this output is numbered, and this first row, 1, represents the first router on the Internet that my data, from point A, reaches on its way to point B, which is Stanford. So it looks like this first router doesn't have a name, because all I see are numbers in row 1 here. But it has an IP address, 140.247.89.130, that belongs presumably to Harvard. Because Harvard owns thousands of IP addresses, all of which start with 140.247.
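The decision a router makes, looking at the destination IP address and picking a direction, can be sketched with Python's standard `ipaddress` module. This is only a toy under stated assumptions: the two-entry table and its "next hop" labels are hypothetical, and real routers hold far larger tables. The one fact taken from lecture is that Harvard addresses start with 140.247, written here as the prefix 140.247.0.0/16.

```python
import ipaddress

# Hypothetical forwarding table: (destination prefix, label for the next hop).
# 0.0.0.0/0 is the catch-all "default route" that matches every IPv4 address.
FORWARDING_TABLE = [
    (ipaddress.ip_network("140.247.0.0/16"), "stay on campus"),
    (ipaddress.ip_network("0.0.0.0/0"), "hand off to the border gateway"),
]


def next_hop(destination: str) -> str:
    """Pick the matching table entry with the longest (most specific) prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in FORWARDING_TABLE if addr in net]
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


if __name__ == "__main__":
    print(next_hop("140.247.89.130"))  # an address inside Harvard's range
    print(next_hop("8.8.8.8"))         # anything else falls to the default
```

Each router along the way repeats this kind of lookup on its own table, which is why a traceroute shows a chain of hops rather than one direct connection.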
Now how long did it take to get to that router, which is probably in the Science Center or somewhere proximal? Looks like it took about 2 or 3 milliseconds, and the fact that you see three numbers there just means this program tried three times, just to give me a sense of an average in case there's an anomaly. Where does it go next? Well, apparently this router, wherever it is, is connected to CoreSC1GWDL412, whatever that is. Well, GW is shorthand notation for gateway, which is synonymous with router, so it looks like Harvard has a second router, somewhere in the SC, Science Center, to which that first router has some kind of connection. The next router is 128.103. something; that's also a prefix that Harvard owns. Any IPs in that range belong to Harvard. It doesn't have a name because the humans decided not to name it, but that's fine. But then we get to row 4, bdrgw, which feels like Border Gateway; they're just being a little succinct with the names here. Gateway again is a router, border means it's on the periphery of Harvard's campus, and sure enough, it's still in harvard.edu. But then it goes to row 5, nox300gw, and this is actually something called Northern Crossroads, which is a super big data center in the northeast, where lots of Internet traffic comes into and then goes out of. So Harvard has a physical connection to this Northern Crossroads setup. Those guys in turn have a whole bunch of routers inside their network. But this is kind of interesting. Once we get from row 7 to row 8, where do we seem to be ending up? It looks like Kansas. So humans who run servers tend to name routers based on the geography or the nearest airport code, so KANS, I'm going to suppose, is, like, Kansas. The next one is probably Houston. The other one, LOS, Los Angeles probably. And then after that, yes, LAX denotes Los Angeles.
So what's happened, which is really neat here, between rows, let's say, 7, which we know to be in the northeast, and definitely row 10 or 11, where we're in LA, is that there is some kind of connection. Some really long wire between those two routers. Because notice how long it takes to get to row 11 here, 80 milliseconds, which is up from 2 milliseconds when it was actually on Harvard's campus. So you can infer from the amount of time it's taking for the data to go from A to B roughly how far apart these geographies are, assuming you're on super fast connections. These asterisks here just mean the routers in between points A and B stopped paying attention to us. They started ignoring us at one point, and that's generally for privacy or security reasons. But let's try another one. Let's go ahead and do traceroute gmail.com. When you send an e-mail to someone at Gmail, looks like it's pretty close. Only 10 hops away, ten routers, and it's a little harder to infer where this, oh, LGA. Looks like Google has one or more routers in New York. So it looks like Google has a data center somewhere on the east coast in New York, which makes intuitive sense, and we get there in only 7 milliseconds. So that's pretty fast. What about CNN? traceroute cnn.com. All right, here too we seem to be jumping around states. We're going from Harvard, to Boston, to New York, to DC, to Atlanta, to who knows where, but somewhere in there are CNN's actual servers. But let's do one other trick, cnn.co.jp, which is the domain name for the Japanese version of CNN's website, and what's kind of neat here, if it cooperates, is we again go from Harvard, to Boston, to New York, New York and then wow, notice what's between rows 10 and 14. An ocean, frankly. Right? So. [ laughter ] Right. We go from 7 milliseconds to 226 milliseconds, and in fact, Tokyo is embedded here in the domain name.
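The inference from time to distance can be put in rough numbers. The sketch below assumes two things that lecture does not state: that the times traceroute prints are round-trip times, and that signals in optical fiber travel at roughly 200,000 km per second (about two thirds of the speed of light in a vacuum). It yields only an upper bound on distance, since routers also add queuing and processing delay.

```python
# Assumption: signal speed in optical fiber, about two thirds of c.
SPEED_IN_FIBER_KM_PER_S = 200_000


def max_one_way_km(round_trip_ms: float) -> float:
    """Upper bound on one-way distance implied by a round-trip time."""
    one_way_seconds = (round_trip_ms / 1000) / 2
    return one_way_seconds * SPEED_IN_FIBER_KM_PER_S


if __name__ == "__main__":
    # Round-trip times quoted in lecture: on campus, Los Angeles, Tokyo.
    for label, rtt_ms in [("campus", 2), ("Los Angeles", 80), ("Tokyo", 226)]:
        print(label, round(max_one_way_km(rtt_ms)), "km at most")
```

By this estimate an 80 millisecond round trip is consistent with a cable of up to about 8,000 km, comfortably more than the distance from Boston to Los Angeles, which fits the detour through Kansas and Houston.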
So it looks like we're probably going over the Pacific Ocean, assuming that's what these stars are kind of hiding from us. Because I see no mention of Europe. Maybe if you go the long way, but in fact it might have hopped off the west coast. And you can play with this all day long. The routes can certainly change. This is what's nice about the Internet. It was originally designed with militaristic goals in mind, so it's supposed to be resilient against failures. If you take out one or more of these routers, hopefully it's a nice mesh, as that video briefly showed, so that data can go off in different directions, and this name, the World Wide Web, derives from this idea that there isn't just one connection between A and B; there is this web of connections that's fairly resilient these days, even when servers go down. So that's as much time as we'll spend on the underlying implementation of the Internet. Henceforth we will take for granted that there is this thing called an Internet. So how do we actually go about using it? Well, when you pull up a web browser, you are using a client, so this thing on the left, and a client is just the name given to any machine that's, frankly, not a server. The client, much like in a restaurant, requests information of the server, and the server, the waiter or waitress, responds with that information or with that food. So it's the same kind of relationship. So here we just have a quick summary of what's going on when a web browser connects to a web server, and you can talk about machines being clients, my laptop is a client, but you can also talk about software on a machine being a client. So a web browser is a client. It really reduces not to the physical thing but to what role they play in some two-way relationship. So a client requests data of a server. What does it mean to request data? Well, even though a normal person like us just goes to, like, cnn.com and then hits Enter, there's a little more technicality going on underneath the hood.
What's really going on is that your browser is sending a message to cnn.com, or rather to cnn.com's IP address, which you can figure out by asking something called a DNS server. So quick aside, there are these things called DNS, Domain Name System, servers, whose purpose in life is just to answer queries of the form, what is the IP address for cnn.com, or, here's an IP address, what is its domain name? So they map those two things to one another. So once my browser knows the IP address, it sends a message, frankly, as simple as this: GET / and then the language with which it wants to get it, so generally HTTP/1.1. What does the slash denote? Well, as you've probably noticed in a URL, cnn.com almost always gets changed by your browser to cnn.com/, and if you go to cs50.net, it changes to cs50.net/, because /, like on a local hard drive, generally denotes the root of the file system, much like on a Linux system. It's similar in spirit to C:\ on a Windows PC. So that's the message that this client here on the left-hand side sends to the server. The server then responds with a whole bunch of text, with a text file containing a language called HTML. So HTTP is the protocol. The set of standards that says, when you want to request a web page, send a message like this from client to server. HTML is the language that just so happens to be embedded in the response. But as you know from using the Web, embedded in the response can be a graphic, a sound, a video. So HTTP doesn't care about what content's coming back, just how to request it and how to hand it back, and we'll see that in more detail in just a moment. But let's see what kind of stuff is coming back. Well, let's go to thecrimson.com and not focus so much on the content on the page, but the content underneath the hood. So I just went to Safari's View menu. You can do this with any browser, and I'm going to scroll past this fast, because the specifics aren't that interesting.
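As a rough illustration of the request just described, here is a Python sketch that builds the text a client sends and, optionally, sends it over a plain TCP connection. The function names `build_get_request` and `fetch` are made up for this example. The `Host` line is included because HTTP/1.1 requires it; `Connection: close` is added only so that the server hangs up when it is done. Calling `fetch` needs a network connection and a server listening on port 80, so it is defined but not called here.

```python
import socket


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the raw text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, path, and protocol version
        f"Host: {host}",         # which site we want on that server
        "Connection: close",     # ask the server to close when finished
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str = "/") -> bytes:
    """Send the request over TCP port 80 and return the raw response."""
    # create_connection looks up the host's IP address (via DNS) for us.
    with socket.create_connection((host, 80), timeout=10) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    print(build_get_request("cnn.com").decode("ascii"))
```

A real browser sends more headers than this, but the first line has exactly this shape.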
But this gibberish or this Greek, or however you want to view it, is something called HTML, and in fact, it looks like there are some links on the Crimson's website to categories like food and drink, football, for the moment, game recaps, and they're all embedded in what looks like a fairly consistent structure. You've got an angle bracket, like a sideways triangle, the letter a, then the word class equals something, then href, and it turns out href is going to mean hypertext reference, and then a short URL. It's not a full URL, because there's no mention of HTTP, but the fact that these hrefs start with a slash here just means that these URLs are relative to wherever the user currently is, on the current server. It doesn't have to bounce the user to another server. We can do this for any website. If you go to cs50.net, oops, and view source, you'll see that our page, too, is composed of the same kind of text. And this is the neat thing. It's a little scary at first perhaps, but the neat thing about the web is that it's so easy to learn new tricks and learn how other sites are implemented, because you can just look at their source code. And generally speaking, people don't consider HTML to be intellectual property. Right? There's really not much intellectual ownership in the HTML you're marking up. There might be copyright of the data, but it's very much an environment where you can learn from the work of others, and frankly, this is sort of the nature of the technology. If you are requesting data, as a web browser, from a server, the server obviously has to hand it to you, so that you know actually what to show the user on the screen. Unlike programs written in C, where you have this opportunity to compile them into 0's and 1's, HTML is what we'll call an interpreted language, where it does not get turned into 0's and 1's on the server; rather, it just gets sent raw.
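How a browser turns a short href into a full URL can be tried out with `urljoin` from Python's standard library. The base URL and paths below are placeholders on example.com, not the Crimson's actual links.

```python
from urllib.parse import urljoin

# Hypothetical page the user is currently viewing.
CURRENT_PAGE = "http://www.example.com/section/news/story.html"

# An href starting with a slash is resolved against the root of the same server.
from_root = urljoin(CURRENT_PAGE, "/section/sports/")

# An href with no leading slash is resolved against the current directory.
from_here = urljoin(CURRENT_PAGE, "photo.html")

# A full URL, scheme and all, replaces the base entirely.
elsewhere = urljoin(CURRENT_PAGE, "http://www.example.org/")

if __name__ == "__main__":
    print(from_root)
    print(from_here)
    print(elsewhere)
```

In all three cases the scheme and host come from the current page unless the href supplies its own, which is why a link like the ones described never needs to mention http:// to stay on the same server.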
So when you're writing HTML source code and then request your web page with a browser from a server, what you get back is your HTML source code, and that's going to be true for JavaScript, and it's kind of going to be true for PHP, although thankfully with PHP, that language, we'll see, is executed on the server. It's not converted into 0's and 1's per se, but it is analyzed and executed on the server, so that what the user sees is just some boring HTML output, not the intellectual property that is my PHP code. So in short, for now, HTML is a markup language. It's an aesthetic language that says how to display things on the screen, and we'll start doing this ourselves in a moment. PHP is a true programming language, with for loops and while loops and functions and variables and all of that, with which you can implement logic and actually take input from a user and produce output. So let's actually see what's going on when I visit a site like, oh, we can do the Hamster Dance. Let me go ahead and open up Firefox. So even though the course doesn't really care what browser you like to use, you can use any one you want, Firefox frankly has a really nice set of development tools. So what you'll see in Problem Set 7 later this week is that we recommend that you use some freely available tools. Because it really makes it easier, if not a little more fun, to develop websites. Because you can see a lot more of what's going on underneath the hood. Specifically, I've installed something called Firebug, which literally puts a bug in the bottom right-hand corner of my screen, and then if I click it I get this menu that actually allows me to look at what's going on underneath the hood with various websites, without having to look at the mess that I just showed you, the raw source code. So let's go ahead to, I think it was called WebHamster.com? Yes, that's it. Let me pull up Firebug, and it turns out this web page is actually pretty simple. I'll show you the raw source code.
This is sort of like websites 1990s style. It's pretty darn simple, even though you might not understand all the syntax yet, but if I go over to this tab here, the Net tab, notice what's about to happen. Right now it's blank. I'm going to reload the page, and in a moment, I'm going to see, line by line, a record of every file that's requested via HTTP. So in other words, I can see what I predicted verbally goes on any time you visit a web page. So here we go. Reload. And it looks like, sure enough, to download this web page, it was necessary to make six HTTP requests. Each of those requests is from browser to server, asking the server for a different file. Let's look at the very first one. It looks like the very first one is the original one that I typed, GET and the whole URL, WebHamster.com, and now notice what comes back is this. Let me actually zoom out a little bit. Let me go ahead and show you this. View, got to click a little, tiny link here, view source. So it's a little small, so I'm going to zoom in. I kind of lied a moment ago when I said all the browser sends is GET / HTTP/ and a version number. There are a few more details, but most of them are uninteresting, in my defense. And so this first line is literally what my browser sent to the server WebHamster.com, after figuring out its IP address, and that is just shorthand notation for, give me the default web page, give me the so-called home page that lives by convention at just slash. But the browser did send some additional information that I, the human, did not provide, and this information is sent any time you visit a web page. Specifically, the web browser reminds the server what host name or what domain name the user actually typed into the browser. This is useful, as an aside, because in the real world you can pay for, you can rent, web hosting space. Storage space. Like a cloud account for websites. But they're on shared servers.
Where a company might have ten or a hundred other customers, all with their own domain names. Thankfully, anytime a browser requests a web page, it's supposed to tell the server, I want slash, but specifically for this web site, WebHamster.com, and this way the server can distinguish my request from another customer's request. That's called shared web hosting, which might pertain to some of your final projects if you guys decide to try to commercialize thereafter or just make them live on past the course. Then there's this stuff, User-Agent. You might have heard that most websites know what browser you're using, what operating system you're using, what plug-ins you have installed. That's because your browser is very willingly giving up that information every time you request a web page. Most any web site can tell you what version of Mac OS or Windows you're running, what browser you're running, as well as your IP address, obviously, otherwise the data couldn't get back to you. So there's a lot of juicy information websites know about you. Some of this stuff is technical and not all that interesting. But it relates to how the page is encoded, whether or not it's compressed, and a few other details. But what comes back, ultimately, is the response. So if I actually scroll back up here and look at this Response tab, you see exactly the source code that came back. So when you request a URL, what you're really getting back is a file that might be a GIF, it might be a JPEG, or it might be a text file whose file extension is usually, by convention, .html, and inside that text file is just a bunch of stuff like this. But it follows a common structure.
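To see how a shared server might use that Host line, here is a small Python sketch that parses the headers out of a raw request and picks a folder per customer. Everything specific in it is hypothetical: the `SITES` table, the folder paths, and the `ExampleBrowser/1.0` user agent string are invented for illustration, and a real server would also handle ports, repeated headers, and malformed input.

```python
# Hypothetical mapping from domain name to that customer's folder.
SITES = {
    "webhamster.com": "/srv/customers/webhamster",
    "example.org": "/srv/customers/example",
}


def parse_request(raw: str):
    """Split a raw HTTP request into its request line parts and headers."""
    head, _, _ = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


def document_root(raw: str):
    """Choose which customer's files to serve, based on the Host header."""
    _, _, _, headers = parse_request(raw)
    return SITES.get(headers.get("host", "").lower())


SAMPLE_REQUEST = (
    "GET / HTTP/1.1\r\n"
    "Host: WebHamster.com\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "\r\n"
)

if __name__ == "__main__":
    print(parse_request(SAMPLE_REQUEST))
    print(document_root(SAMPLE_REQUEST))
```

The same parsed dictionary is where a site would read the User-Agent value, which is how it learns what browser and operating system a visitor is using.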
Let me look at the very top of the server's response and we'll see that almost every website out there does in fact start with an open bracket, html, closed bracket. Then inside of that, or next to that, is open bracket head, for the head of the web page, followed by open bracket title and then the actual title and then this curiosity. It turns out that HTML is very pedantic. When you tell the browser, start doing something, you almost always have to tell the browser later, stop doing something. So if I reinterpret what I'm seeing here, this open bracket html, this pretty much is the page telling the browser, here comes some HTML. Get ready to display it in the familiar way. Open bracket head closed bracket says to the browser, here comes the head of the web page, get ready to display the title and some other stuff. Here comes the title, with open bracket title closed bracket. The title is to be Hamster Dance, and those are the words that you see at the very top of the window. Now, anything between brackets, angled brackets, is called a tag. Anytime a tag starts with a forward slash and then ends with the same word that the previous tag started with, that's kind of like the opposite. It says, stop displaying the title. So this is start the title, stop the title, and so accordingly this is generally called a start tag, the title tag, and this is called an end tag, or an open tag and a close tag. So long as you have some notion of symmetry, it doesn't really matter what you call them. Then we have this thing. The body tag. So there's mainly two big parts to a web page. There's the head, in which very little tends to go, like the title, and then we'll see some stuff called Cascading Style Sheets and some other stuff called JavaScript, but for the most part the head is very small.
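The start-tag and end-tag symmetry described above can be watched directly with Python's standard `html.parser` module. This is only an illustrative sketch: the `TagLogger` class is a made-up name, and the page string is a stripped-down stand-in for what WebHamster.com actually returns.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record each start tag, end tag and piece of text in the order seen."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data.strip()))


# A simplified page with the same skeleton the lecture walks through.
page = "<html><head><title>Hamster Dance</title></head><body></body></html>"
logger = TagLogger()
logger.feed(page)
logger.close()
```

After this runs, `logger.events` lists a start for each of `html`, `head`, `title` and `body`, and a matching end for each one, with the title text sitting between the two `title` events.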
In the web page, it's the body where really 99% of the content usually is, so the body tag says, here comes the body of the web page. But notice this interesting thing. It turns out that tags, kind of like functions in C, can be parameterized. You can modify their default behavior by passing in something similar in spirit to arguments or parameters, but in this world they're called attributes, and any attribute is a keyword, like bgcolor, equals, quote unquote, and then some value. So now you can probably guess, bgcolor is background color. Why? Just some people, ten to fifteen years ago, decided this is how we'll denote the background color of a page, bgcolor for short, and then there's this quote unquote. This is the value. Turns out you could write hard-coded words like quote unquote white, quote unquote black, quote unquote red. There's a few colors that are just so common that every browser knows what they are. But if you want to be a little more sophisticated, you can actually use hexadecimal notation, and FFFFFF happens to represent white. That's a lot of red, that's a lot of green, that's a lot of blue, which together, like the spectra of light, gives you the color white. And if by contrast you did 000000, you'd get black, and just like pset5, if you did FF0000, you would get a lot of red. So you can very quickly make some pretty hideous web pages, but you can use these so-called hexadecimal codes to be precise as to what they look like. So I'm not going to focus on too much more of this WebHamster, because they're actually using a slightly older syntax. We're going to be using in the course, or at least promoting in the course, the latest and greatest version of HTML, which is called HTML5. This is what Steve Jobs and others have kind of been stomping their feet about people using. The nice thing about HTML5 is it's a little simpler than HTML4, which is sort of last year's version.
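The hexadecimal color codes mentioned above are just three bytes, one each for red, green and blue. A small Python sketch, with a made-up helper name `hex_to_rgb`, shows the arithmetic:

```python
def hex_to_rgb(code):
    """Split a six-digit hex color like 'FF0000' into (red, green, blue)."""
    code = code.lstrip("#")
    if len(code) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Each pair of hex digits is one byte, 0 through 255.
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))
```

So `hex_to_rgb("FFFFFF")` is all three channels at their maximum, `hex_to_rgb("000000")` is all three at zero, and upper or lower case makes no difference to `int(..., 16)`.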
But I should disclaim that HTML5, as a language, isn't quite finalized. The world tends to take like five years before they ever decide on the new version of anything in the computer world. But if you've ever heard that iPhones don't display Flash video but they do display HTML5, what that means is that websites that support the iPhone and these other devices have been designed with a certain version of the language. But the nice thing is it's by no means Apple specific, this language, HTML5. It just happens that Apple decided, we're going to support this one. And it is to be sort of the next iteration. So we will teach you the latest and greatest. And with it you can actually do some neat things. So even though we're talking for the moment about a web browser, this here, not to sound like an Apple fanboy, is an iPad, but using some fairly simple HTML and, as we'll see next week, a language called JavaScript. This is not a native application. This isn't something I downloaded from the App Store, it's just a website that I'm visiting with Safari. And if my network cooperates and I hit Play here, what we did is we went ahead and implemented the same CS50, come on, the same CS50 video interface, come on, don't embarrass me. Try one more reload. Network is a little spotty. Come on. Play. Come on. There we go. OK, so if you've ever used CS50's videos that have the slide-synchronized transcripts to the side, well, we packaged this all up using the same language, HTML5, and this other language for next week, namely JavaScript. >> This is CS50. My name is David Malan. >> So that was some eight weeks ago. So in short, even though we'll focus conversationally mostly on web browsers, realize that these exact same technologies now are being used to design sites for Android, for iPhones, for BlackBerry. This again is very distracting. Sorry about that. And so realize that what's really cool about the Web these days is just how universal it is.
You don't have to worry about compiling your code for a Mac or for a PC. We're really now able to develop applications that are pretty much independent of the operating system and the hardware that the user actually has, which is pretty darn exciting. So let's make our simplest of web pages in this language called HTML. So we have set up the cloud in such a way that, using your same current cloud accounts, you can start making websites and you can develop Problem Sets 7 and 8 there and also your final projects. And know now too, if you would like to go out and get, for $10 or $20, sort of a fun domain name like isawyouharvard.com or harvardfml.com, you can buy these things fairly inexpensively these days on a yearly basis. Realize that for final projects, we'll let you and we'll show you how to map that domain name to your cloud account, so that when people visit mywebsite.com, it's actually being hosted on the cloud, even though the world knows your website under whatever favorite name you've chosen. The upside of this is that if you decide to continue your projects after the course or commercialize them, you can maintain that same branding and just move the website off of the cloud eventually to your own commercial web host, and we'll show you how to do all of that. It's actually relatively easy. So I'm going to go ahead and do this. It turns out that on, typically, a Linux system, you can create a special directory. A directory whose name is public_html, and the server is usually set up in such a way that anything inside of that folder is accessible on the World Wide Web. So I'm going to go ahead and do mkdir public_html, and in next week's pset, Problem Set 7, we'll walk you through these steps, so today we'll go quickly. I'm going to go ahead and hit Enter. That will create for me a public_html directory, which I can then cd into, change directory, and now I'm in my public_html directory.
There might be a few other commands you'll have to run, and we'll walk you through those in the problem set, but now I have some storage space for my personal web page. So I'm going to go ahead and make my first web page, or rather I'm going to go ahead and open Nano. You could do this in TextWrangler, but just so that we're not Mac biased, I'll go ahead and use Nano for today. So I'm going to go ahead and create a file called hello.html, .html being the world's convention for the file extension for web pages, and I'm going to start the page as follows: DOCTYPE html. So this is kind of a stupid habit you should get into, which just tells the browser, this is the version of HTML that I'm using. In this case here, the fact that I've not mentioned a version number is going to imply to the browser, here comes some HTML5. But now that doesn't mean here comes a web page, even though the syntax looks similar. Humans just did not make very good choices here. To tell the browser here comes a web page, I need to actually say html, and then I'm going to get a little ahead of myself and I'm going to preemptively close that tag. So this web page right now is completely uninteresting. Because it's saying, here comes a web page, there goes the web page. There's no content actually there, but it's a good habit to get into. Just like you might put an open curly brace and a close curly brace in your C code and then go back and fill in the blank, same idea here. So let me go ahead and set up the head of this web page. I'm going to similarly preemptively open a tag and close it. Again, just like in C, it's a good habit, but not technologically required, to have white space and indentation, but do do it for the sake of style. I'm going to make a web page whose title says hello world. And now I'm done with my title, so I kind of do the opposite. Open bracket, slash, title, with no spaces in there. It's all one big string. All right, that's the head of my web page.
Now let me go ahead and make the body. Let me close the body, and now inside of the body I'm just going to say again, hello world, welcome to my homepage. Just something silly like that, and that's it. That is all that's required to make a web page on the World Wide Web. And to be honest, as kind of a testament to this, I first learned HTML in like 1995, '96. I was in Math 20. My math CA, who I got friendly with, we went to the basement of the Science Center one day. He was like, hey, you want to learn. [ laughter ] [ applause ] Can I finish the story? [ laughter ] We go to the computer lab in the basement of the Science Center and he's like, hey, you want to learn HTML? I'm like, sure, I'd love to learn HTML. And so we logged into one of the. [ laughter ] We logged into one of the lab computers and, no joke, like 20 minutes later I had the most horrific looking website on the Internet. But that was testament to just how easy this stuff actually was. So with that said, let's actually recreate what my math CA and I did that day, here on stage. [ laughter ] I'm going to go ahead and save this file. So Control O, or rather Control X. So let's get back to the command line. I've got hello.html. If I do ls, notice that I do have a file, hello.html. I also preemptively put a lectures directory there, but never mind that. So I'm now going to go to my web browser and I'm going to open a new page and I'm going to go to http://cloud.cs50.net, and this is a slightly stupid convention syntactically, but the world decided that ~username would represent username's homepage on this particular server. So my username here today is cs50. So I'm going to go to http://cloud.cs50.net/~cs50/ and then hit Enter, and unfortunately I get this weird looking listing. Well, this is just the contents of that directory. Turns out that I made this directory world readable, which probably wasn't smart. Because now the whole world can see whatever files I happen to have in this directory.
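For reference, the page dictated above can be written out in full. The Python sketch below just holds the page in a string and offers a hypothetical `save_page` helper that writes it into a folder. The exact whitespace and capitalization are my guess at what was typed on stage; the tags and the wording are as described in the lecture.

```python
from pathlib import Path

# The page built up step by step above: doctype, html, head with a title,
# and a body with one line of text.
HELLO_PAGE = """<!DOCTYPE html>
<html>
  <head>
    <title>hello, world</title>
  </head>
  <body>
    hello, world! welcome to my homepage.
  </body>
</html>
"""


def save_page(folder, name="hello.html"):
    """Write the page into folder (for example a public_html directory)."""
    path = Path(folder) / name
    path.write_text(HELLO_PAGE)
    return path
```

Every start tag in the string has its end tag, and the indentation is for human readers only; the browser does not need it.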
Maybe not a big deal, because you shouldn't be putting private files in a public directory, but this is not the behavior I want. But wait a minute, I have to remember, the file I created was hello.html, so we need to include that in the URL. So I'm going to go to /hello.html, Enter. Now, problem number two. So you don't have permission to access hello.html on this server. And notice some other curiosities. I'm being reminded of the name of the server. Statement of the obvious. But port 80. You actually might have seen this number in that little Warriors of the Net video a moment ago. Turns out that there are of course all these services on the Internet. The Web, e-mail, instant messaging, VoIP services, all these fun applications that run on top of the Internet, and I'm consciously putting my hand up here to distinguish it from Ethernet and IP and TCP and all that. On top of the Internet run all of these services, and the only way a computer knows which service you actually want is that, even though we didn't see it yet, any time a client connects to a server, it doesn't just connect to that IP address, it connects to a specific number. Something called a port number, that's just a little hint that says, not only do I want to connect to cloud.cs50.net or to cnn.com or thecrimson.com, I specifically want the service that lives at port number 80, and port 80 by convention is the Web. So 80 equals HTTP. SSH, you might have noticed, is what port number? Twenty-two. So here already we see how a server like the cloud can have a web server running on port 80 and an SSH server running on port 22. Because clients like PuTTY and Terminal, and now Firefox and Safari, know to send these additional numbers, 80 for Web, 22 for SSH, the server knows when it gets a request, oh, this should be handled by the web server, oh, this should be handled by the SSH server.
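The port-number idea above can be sketched in a few lines of Python. The `DEFAULT_PORTS` and `SERVICES` tables and both function names are invented for this illustration; the two port numbers, 80 for the Web and 22 for SSH, are the ones named in the lecture, and `urlsplit` is from the standard library.

```python
from urllib.parse import urlsplit

# Conventional ports for the two services mentioned above.
DEFAULT_PORTS = {"http": 80, "ssh": 22}
# What a server like the cloud might have listening on each port.
SERVICES = {80: "web server", 22: "SSH server"}


def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port             # the user spelled it out, e.g. :80
    return DEFAULT_PORTS[parts.scheme]  # otherwise it is just assumed


def route(port):
    """Decide which service on the machine should handle a connection."""
    return SERVICES.get(port, "no service listening")
```

With these definitions, a URL with and without an explicit `:80` yields the same port, which is why leaving it off changes nothing.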
It's in this way that servers multiplex among multiple requests that might be coming in on the same IP address. So that's why we see 80 there. Thankfully the world decided that in a web browser, if you don't mention port 80, it's just assumed. Otherwise our URLs would be unnecessarily ugly and a little longer, but I can be pedantic here and I can say, after .net, colon 80, and then hit Enter, and that's actually going to lead to the same place. It quickly disappears because it's inferred, but it is in fact the same. All right, still got a problem. I'm really just stalling because I'm not sure what to do, but wait a minute, I need to go back over to the server and realize it's not enough to just say this folder is called public_html, I actually have to make sure that the files within are world readable. By default, I think we said in one of the earliest lectures, when you create a file, it's owned by only you. It's readable only by you, and this is a good thing, just for privacy's sake. You don't want to create some file and then have anyone on the server be able to access it. So here we see, recall from an earlier pset, the output of ls -l, and notice on the left hand side here, rw means read, write. Because it's rw all the way on the left, that means only the owner, me, can read and write this file. So I actually need to run a command, and there's a few commands I can run, but generally they involve chmod, C H M O D, for change mode, and then I'm going to go ahead and say a+r. This means all, the whole world, plus r, read: let the whole world read the file called hello.html. I hit Enter. I then redo ls -l and now notice, sure enough, I get read, write access and then everyone else in the world gets r, for read.
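What chmod a+r does to the permission bits shown by ls -l can be reproduced with Python's standard `stat` module. This is a sketch on plain numbers, not on a real file; `add_read_for_all` is a made-up helper name, and the starting mode of `0o600` is my assumption for "readable and writable only by the owner".

```python
import stat

# The three read bits: owner, group, and everyone else.
ALL_READ = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH


def add_read_for_all(mode):
    """Do to a mode number what `chmod a+r` does: switch on every read bit."""
    return mode | ALL_READ


# A regular file that only its owner can read and write.
private = stat.S_IFREG | 0o600
public = add_read_for_all(private)

# stat.filemode renders a mode the way the left column of `ls -l` does.
before = stat.filemode(private)
after = stat.filemode(public)
```

To apply the same change to an actual file, `os.chmod(path, new_mode)` takes a number like the one computed here.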
So now I'm going to go back to the website, I'm going to go ahead and click reload and voilà, my very first web page on the Internet. Well, let's see how quickly we can butcher this and make it my first ugliest web page on the Internet. Well, that's just some text, but notice at the very top. Here's the title. So when I had the title tag inside my page, that's where it ends up going, above the address bar. Let's use bgcolor, this thing called an attribute. So I'm going to re-open hello.html. I'm going to add some space there and say, bgcolor equals quote unquote, let's say, white. Save. OK. Kind of pointless. All right, so let's go back. Let's say black. Save with Control O. OK, pointless for a different reason. Right? So, you know, highlight the text, but now I have a secret web page. All right. [ laughter ] So, now I can do something like this. Even though we saw all capital F's before, it doesn't matter, just be consistent. All lowercase or all capitals. This should give me what color? All right, so this is red. So again, even though in the bitmap world Microsoft decided that an RGB triple would actually be implemented as BGR, for really no good compelling reason, in the real world when you say RGB, you mean R, G, B. So FF is R, 0000 is G and B. So let's reload. All right, it's getting pretty hideous. Let's go ahead and try something else. Let's say I want to make something bigger. So it turns out I can use special tags called h1. So h stands for heading, 1 means the biggest heading. I have to close it on the other side to be symmetric. I'm going to save this. And so now the text is going to stay the same, but the browsers that support this language called HTML have been designed by their programmers to understand that open bracket h1 means start making the following text big and bold, usually. So if I reload the page now, in fact it is bigger and bolder. If I want to put a second line of text, let me go like this, h5.
h5, counterintuitively, is smaller, so I'm going to say now, Goodbye World! Close h5. Save that. Reload here. And sure enough, it's smaller. So I really haven't done anything terribly sexy, but using just this idea of a tag, open and closed, and an attribute, we can start now to control the structure of a web page. So now why don't we go ahead and take a five minute break. So before we continue our look at web programming, this video is actually apropos of the problem set that you just finished or are still finishing up. This is a video by some researchers who presented novel techniques for resizing images. But resizing images intelligently, whereby you don't just take an image that's too big and then crop it by cutting out uninteresting parts of the photo over here, and you don't just scale it down, thereby shrinking everything. You try to remove some useless information, like extra sky or extra grass, that in terms of information doesn't necessarily convey more meaning. But doing this with a computer, and doing this dynamically with an arbitrary photograph, is nontrivial. You can certainly do it with Photoshop and you can do it by hand, touching things up, airbrushing things, but there are algorithms, which they presented at a very well known research conference, for doing this dynamically and actually reducing images by throwing away content that, again, does not contribute much. So it's just about 4 minutes and it gets increasingly neat, I think, some of the demonstrations that they show. So one of your classmates passed this along to us. [ video ] >> Seam carving for content-aware image resizing. Digital media today has the ability to support dynamic page layouts. By changing the window or display size, we can change the layout of a document. However, images embedded in a document typically remain rigid in size and shape. Resizing is important to fit images into different displays.
Current techniques, including cropping or scaling, are limited in their abilities to capture the image content. We present a method for content-aware resizing of images. Our technique enables resizing while adapting the image content and layout. This is sometimes called retargeting. We also define the flexible multi-size representation for images that supports continuous resizing. An image can be shrunk in a non-uniform manner while preserving its features, but it can also be extended beyond its original size. Instead of scrolling through images that do not fit in a given display, we gracefully resize them to fit inside the window. For example, assume that we need to change the width of an image. The simplest way to do this is to remove columns of pixels from the image. The best column to remove would be the least noticeable or least important column. We can search for this column by defining an importance or energy function on the image. In this example we used the gradient magnitude of the image. Next we look for the column which contains the least energy and remove it. However, using such an approach quickly leads to serious artifacts. Therefore, instead of using rigid columns, we search for connected paths of pixels, or seams, from one side of the image to the other that contain the least energy. This can be done using a simple dynamic programming algorithm, as described in the paper, in both vertical and horizontal directions. Here is another example of an image, its energy function and the least noticeable horizontal and vertical seams. By successively removing horizontal and vertical seams, the image can be resized in a non-uniform manner. The order of seam removal in an image defines an order on the pixels of the image. By storing this simple indexing information we create content-aware multi-size images. In this image we color all pixels in order of their seam removal, from blue to red.
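The dynamic programming step the video mentions can be sketched in plain Python. This is a minimal illustration of the general idea, not the authors' code: it assumes the energy of every pixel has already been computed (the video uses gradient magnitude) and is handed in as a list of rows, and the function names are my own.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the lowest-energy connected seam."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest total energy of any seam from the top row to (r, c).
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            # A seam may only step to the pixel above-left, above, or above-right.
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Start from the cheapest bottom pixel and walk back up.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete the seam's pixel from every row, making the image one column narrower."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


# A tiny 3x3 energy grid where the low values zigzag down the middle.
energy = [
    [9, 1, 9],
    [9, 9, 1],
    [9, 1, 9],
]
seam = find_vertical_seam(energy)
```

Unlike removing a rigid column, the seam here is free to shift by one pixel per row, which is what lets it follow the low-energy zigzag.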
To enlarge an image we first calculate seams as if we were to shrink the image. Instead of removing these seams, we insert new seams in these locations. The pixels of the new seams are the average of the pixels alongside of the new seam positions. This enables a smooth transition between reducing and enlarging the image size in multi-size images. We have tested different possible energy functions for retargeting, such as entropy, saliency, histogram of gradient directions and eye movement measurements. The results depend on the given image, but simple gradient magnitude often gives satisfactory results. For certain contents such as faces, where the relations between features are important [inaudible]. [ laughter ] A simple user interface enables the designer to protect these image areas. The application is also used as an authoring tool for creating multi-size images. Note that in this specific example, automatic face detection can be used to identify the areas that need protection. To illustrate another application for this tool, we show a simple object removal procedure. By adding negative weights to the energy of an image, the user can attract the seams to pass through specific areas first. By reducing the size of the image. [ laughter ] These areas are removed first, in effect erasing them. To retain its original size, the image is enlarged by using seam insertion. Note that this technique changes the whole image and does not simply erase the objects marked. [ laughter ] More examples can be found in the paper and the supplemental materials. [ laughter ] Thank you for your attention. [ applause ] >> So it really is cool what's on the horizon. A friend of mine is a photography expert and he was recently called in by some lawyers to testify in a case which involved answering the question, was this videotape doctored or not.
I'd argue that it's probably going to become increasingly difficult over the years to actually distinguish photographs and videos that were actually taken of reality from ones that were in fact airbrushed or something much more sophisticated. So some neat technologies are on the horizon. So with that said, we're going to deal with images more akin to this thing today. Perhaps a little, oops, a little simpler. We have Hamster, so these are. [ music ] Those are images called GIFs. GIFs, you might recall from Problem Set 5, are one of the things we asked you about, and these are, if you're wondering what it means to be animated, animated GIFs. Essentially an animated GIF is kind of like a mini movie, where it's just multiple images all packed inside of the same file, and they just play according to some schedule in looping fashion. And here too there's an audio file called a WAV file, W A V, that is simply a song that's played on loop on this particular website. So how do you actually include these kinds of images though? Well, it's just HTML. Again, just to spoil what's underneath the hood here, if I do look at the raw source, it turns out that in addition to the body tag, the head tag, the title tag, there's also an image tag. Img. Now what you'll see is, even though people will say image or longer words, generally a lot of these tags are abbreviated just for efficiency's sake. So let's go ahead and use such an image. Let's not choose necessarily a hamster, but let's go ahead to, let's say, google.com, and let's go to Google Images and let's choose, well, let's do hamster. Bigger hamster. Let's use this guy. He's pretty cute. Let's go ahead and right click or control click. There's different ways to do this. Using SFTP, of course, you could download images for which you have the appropriate copyright and such to download. We're just going to borrow the hamster for just a moment here. If I go into my web page now, recall that I have hello world and goodbye world.
Let me go ahead down below this and really, for no compelling reason, add a hamster. Image. [ laughter ] Source. So src is the special attribute that I know, just from having read the documentation, modifies the behavior of the image tag, specifically telling it what image to display. All attributes' values have to be quoted, either with double quotes or with single quotes. Just be consistent throughout your files. I'm going to go ahead now and paste that URL. It kind of wraps around, but it in fact is, that's one hell of a URL. HowToTakeCareOfAHamster.com. Guess it was available. So let's go ahead and save this, hello.html, let's reload this, and there he is. So now he's inside of our web page. Now I'd argue this page is pretty ugly, and it's ugly because it's not centered. Right? So I want to go ahead and center all of the text here. [ laughter ] So how do you go about centering? Well, one of the things you should be mindful of when teaching yourself more about HTML is that we'll really just scratch the surface in lectures and even in sections. It gets very boring quickly to enumerate dozens of tags and attributes that exist. That stuff is much more easily picked up just from a little reference that's online. We've linked to several on cs50.net/resources. It turns out there's a number of ways to accomplish different tasks. We're going to promote, at least in sections and lectures, some of the simplest approaches, some of the most common approaches. But realize, because HTML has been around now for years, in version 1, version 2, version 5, you'll see people taking different approaches. Don't get confused by it. Thankfully there's a tool that I'll show you in a moment with which you can validate your HTML. It will tell you whether your code as written adheres to current standards and it will point out syntax errors, similar in spirit to what gcc or even Valgrind might do for you. So it's a nice debugging tool of sorts. I'm going to go ahead now and specify this.
Centering, back in the day, on WebHamster, you would just say this. So this, for various reasons, has kind of fallen out of favor in the web community, and so these days you have to be a little more verbose. What you can do now is center this content by putting it in a div tag. So two of the most common tags in web development today are something called a div tag and something called a span tag. These are just structural tags. Whereas h1 and h5 say, make this big and bold, div and span just say, here comes some content. Content that I want to style in some way. So it allows you to group together related material, not unlike, in C, the idea of a struct. It's not quite the same. But just as a struct allows you to encapsulate some similar information, the idea of a div, at least, is to encapsulate some similar information. So here the information I want is the following lines of text and the image. So I can actually do something like this, div, and now I'm going to just indent this, just to be proper, and then down here I'm going to go and close the div tag. Now this is useless. Notice that if I save the file, reload, nothing has actually changed, but div stands for division. So I'm going to temporarily do this. It turns out that most tags support an attribute called style that, as the name implies, allows you to style them, aesthetically. There's all sorts of stylization that you can add to a tag, including color and font size and margins and background colors and borders. A lot of purely aesthetic details. One of them, though, is called background-color. So even though a moment ago I used bgcolor, that is kind of the old school way of doing it, using this attribute called bgcolor. Most people these days style almost all of their content using this style attribute, so that's a habit I'll now get into. Background-color allows me to change just the color of this division of the page. So let's just go for hideous and say yellow.
So this div, which you can think of as literally a division of the page. They're always rectangular. So the following rectangle that's fitting all of the following content, the h1, the h5 and the image, should have a background color of yellow. Let's save that. Reload, and sure enough, there it is. Right? So this is not too far off from what I made in 1995, frankly. So notice there's some curiosities here. At the top there's some margins, and these are frankly some stupid headaches when it comes to browser development. If we open this web page not only in Firefox but in Safari, Internet Explorer and Chrome, odds are you would see stupid, minute, little differences that nonetheless might tug at the nit-picky side of you, whereby on Firefox it feels like we have like a centimeter or so of red pixels, a margin around this. Well, in Internet Explorer, Microsoft might have decided, not going to be a centimeter, going to be half a centimeter. So one of the headaches, frankly, early on in web development is just understanding some of the stupid differences that the various browser manufacturers have made when it comes to implementing these tags. But for the most part, it's minor details. Though we'll see in JavaScript that sometimes it matters just a bit more. Thankfully there exist libraries, like those we've discussed in the context of C, that get rid of almost all of these cross-platform headaches. So for now the point is just that I've made a division of the page, but I haven't aligned it. It turns out I want to say, text-align. So if you want to have multiple properties, well, what I have just done here with background-color, colon, yellow, this is a CSS, Cascading Style Sheets, property, and that simply says, make the div's background color yellow. If I want another such property, I can simply say semicolon, and then I can say something like text-align, colon, center, and this will align now all of the text inside of that div as centered. So let's save this. Reload, and sure enough, now things are centered.
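The value of a style attribute, as described above, is a list of property, colon, value pairs separated by semicolons. A short Python sketch with a made-up `parse_style` function shows how such a string breaks apart; real browsers handle many more cases than this.

```python
def parse_style(style):
    """Split a style attribute's value into a dict of CSS properties."""
    props = {}
    for declaration in style.split(";"):       # semicolons separate properties
        name, sep, value = declaration.partition(":")  # colon splits name from value
        if sep and name.strip():
            props[name.strip().lower()] = value.strip()
    return props


# The two properties applied to the div in the lecture.
div_style = parse_style("background-color: yellow; text-align: center")
```

Here `div_style` ends up with two entries, one per property, and a trailing semicolon or extra spaces in the original string would not change the result.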
And again, there's different ways to do this, but the point for today is really to take away these ideas of tags, and we've been looking at some tags already, and attributes, even though some of them have started to fall out of favor, in favor of the style attribute. And then as for these properties, well, there's so many different properties out there, and it's actually nice because almost always you can accomplish some goal with a property whose name kind of says what it does. But to pick up these kinds of things, again, be mindful of resources like this. If I go to the course's home page, slash resources, notice that we have under here, like, a tutorial on CSS. Again, CSS only refers, at the moment, to these things inside of quotes, inside of the style attribute's value. And then we can go down here to HTML, where we have a bunch of tutorials on HTML and tags thereof. Just to give you a sense of this, let me go to HTML and let me pull up, let's see, let's just give you a whirlwind tour. So it turns out you can make tables. Here is a silly little table for apples, bananas, oranges and other, but you can have some tabular structures in HTML with the right tags. You can have these familiar lists. You can have an automatically numbered list, you can have a bulleted list, here on the right hand side, and all of these are fairly easy. So just for the sake of some arbitrary examples, let's do a couple more of these tags. My page again. I'm not shooting for pretty today, but let's suppose that I actually want goodbye world to take me away from this web page. I want it to be a so-called hyperlink. Well, turns out making a hyperlink is actually pretty easy. So I can simply do something like this. I'm going to get rid of the h5, though I could leave it, but I just want to keep this simple. I'm going to get rid of h5 and instead I'm going to say, you know what, the following should be an anchor. A for anchor, href for hyper reference. Where do I want to go?
Well, anytime you leave a site it seems to go to like, let's go to Disney.com. All right, so now let's scroll over here, close bracket, that's not actually true. So now when I click this text, goodbye world, this should take me to the value of the href attribute, which is this thing here. So I seem to have implemented the very familiar idea of a hyperlink. So let me go back to my web page. Reload. And hmm, that was not what's intended. So notice, the goodbye world, it moved, but it seems to now be over here. So why is this? It looks like structurally I haven't changed the web page, but a web browser is kind of stupid. It only does what you tell it to do with these things called tags. And so in fact, it's implied by the h1 tag, which is a heading tag, that this is a heading, so therefore give it its own row, give it its own line. Most tags don't get their own line, and this is good because otherwise anytime you tried to commingle tags in a web page, everything would be on its own line. So that's not good behavior. So if I really want there to be a line break in between that tag and that image, I have to tell the browser, put a line break here. Open bracket, br for break, close bracket. That says put a line break here, and now the browser will interpret this code literally. If I reload, now sure enough, I have the link that's just above the hamster, and if I click this I'll be whisked away to Disney.com, which currently looks like this apparently. Anything interesting? OK. No, all right, so. So what are the takeaways here? So now we have another tag, this idea of this anchor, which gives you the familiar hyperlink. Well, what are some of the other familiar tags? Any other feature of HTML you'd like to see? These are kind of the biggies, right, this is kind of amazing frankly, what the world has constructed with some fairly simple ones. What was that? Fonts. OK, good one actually.
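The anchor and the line break described above might look roughly like the sketch below. The exact Disney URL, the heading text, and the image file name are assumptions; only the a, href, and br pieces come from the walkthrough.

```html
<div style="background-color: yellow; text-align: center">
  <h1>hello, world</h1>    <!-- a heading gets its own line automatically -->
  <!-- An anchor does not get its own line, so by itself it would sit beside the image. -->
  <a href="http://www.disney.com/">goodbye, world</a>
  <br>    <!-- explicit line break: whatever follows starts on the next line -->
  <img src="hamster.gif" alt="dancing hamster">    <!-- hypothetical file name -->
</div>
```

Removing the br should reproduce the surprise from the demo, with the link and the image sharing one line.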
Right now I seem to be preferring Times New Roman everywhere, but surely we have the ability to express other fonts, and actually that reduces to CSS. Let me just pull up the reference for CSS that we recommend on the site, though there are plenty of resources online. Let me go ahead to CSS, CSS tutorial, and let me choose styling fonts. What you'll see is you have the ability per this tutorial, which lets you actually do things rather than just talk about them, you have the ability to specify the font face, the font family that you want to display text in, the font size, the font color and a bunch of other details as well. But there is a gotcha. Just because you might have a font called Myriad Pro on your computer, that doesn't mean that your millions of users are similarly going to have the font. So you're actually somewhat limited these days to choosing, frankly, ugly fonts like Times New Roman or, you know, old common fonts like Times New Roman, but you at least have precision in that you can specify a list of fonts that the browser should try to use in order, and it will use the first one that it actually finds. In fact, if we look at, let's say, Facebook.com, you'll see that this site actually looks a little different on Macs and PCs, also depending on what fonts you have installed. Because if I can find this, let me see if I can make web development all the more real here by reloading this page, clicking CSS, let me try this. I'm going to try a little trick that we'll spend more time on to come, but it looks like Facebook uses, hopefully I can prove my point, yes. Thankfully, Mark has chosen the default font to be Lucida Grande, followed by Tahoma if you don't have that, followed by Verdana, followed by Arial, followed by the very generic sans-serif, and sans-serif just means without serifs, which are the cute little curves that a font might have.
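A fallback list like the one read off Facebook's stylesheet could be written inline as in the sketch below. The paragraph tag and its text are assumptions for illustration; the font names are the ones listed in the lecture.

```html
<!-- The browser tries each family from left to right and uses the first one it has installed. -->
<!-- A name containing a space is quoted; sans-serif is a generic keyword, so it is not. -->
<p style="font-family: 'Lucida Grande', Tahoma, Verdana, Arial, sans-serif">
  This text is rendered in the first font from the list that this computer has.
</p>
```

Ending the list with a generic family such as sans-serif means the browser always has something reasonable to fall back on.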
So this is Facebook's comma-separated list of all of the fonts the browser should try to use in order to render their website, and of course if you don't have some of those fonts, you might see a slightly different variant of this web page than someone else. But actually the fact that I pulled up Facebook here is a good teaser for one of the most useful features of websites. You don't just have hyperlinks, you don't just have images and other markup, you actually have the ability to take input from the user. For instance, here we have what appear to be text boxes, up here we have what we call a check box, down here we have a select menu or a drop-down menu. And so these are very familiar concepts. We use them every day of the week most likely, interacting with websites, but with these very basic building blocks, the world has built up some really neat capabilities. So first the font, then the capabilities. Suppose I really want to make, let's say, goodbye, bigger. So it turns out, I can go into this anchor tag and I can add another attribute. If you want to have multiple attributes you just separate them by spaces. So attribute equals quote unquote value, space, attribute equals quote unquote value, space. So if I want to style this tag, I say style equals quote unquote, and then in the quotes I need to say something like font-family: and let's do Tahoma, whatever that looks like, and then semicolon, font-size: let's say 96 points. All right, save, and again it's just wrapping with Nano here, but that is in fact all one line. All right, I guess my font's a little big. Let me go back to this hideous website, click reload and it's gotten even bigger. So again, you can do some real damage quickly with this, but it also suggests that you have really precise control over, it's actually not a bad web page now. So now. [ laughter ] Now you have, we'll save this one.
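The anchor with two attributes might look something like this sketch. The exact Disney URL is an assumption, and 96 points is written as 96pt, which is how CSS spells that unit.

```html
<!-- Two attributes on one tag, separated by a space: href and style. -->
<!-- Inside style, two CSS properties are separated by a semicolon. -->
<a href="http://www.disney.com/" style="font-family: Tahoma; font-size: 96pt">goodbye, world</a>
```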
All right, so now you have really precise control over the rendition of the page, and so even though it feels a little messy, frankly it is a little messy what we're building up here, because we have HTML, we also have CSS, and the CSS is just inside of the quote marks. These are two separate languages, HTML and CSS, that are getting commingled. But realize that we can now actually get some input from the user. So literally, let me go to, let's see, lectures/8/source. So literally the first thing that you might want to do is get input from the user. So here's how we might want to reimplement Google ourselves, right. This is pretty much what Google looked like back in 1999 or so. Let's go ahead and take a look at this web page's source code. And let's see how this site was implemented. As an aside, before I forget, I see that I accidentally capitalized doctype on all of these printouts; I will go back and make them consistent, because I'm using lowercase for those doctypes up top. So FYI, I will fix that in the printouts. But for now, notice how simple it is to implement Google. Right? So we have the head at the top of the web page, we then have a title called, Fake Google, just so I don't upset them. Now I have body, and so in the body what did I do, div style="text-align: center", so this just means here comes a band of content, a division of content. Make it all centered, and here is where there is this annoying semantic inconsistency. I'm saying text-align, but even things like the button you just saw will also be centered as a result. So there's some, you know, there's some quirks in the languages this year. But text-align: center means make everything centered. I have an h1 tag which just means give me some big bold text to say, Fake Google, and then this new tag. So here is yet another tag for us today.
There's a form tag, open bracket form, and the action attribute, according to the manual, says, give me the URL of a program, a web-based program to which I should submit this user's form. So apparently I kind of cut a corner here. I didn't actually implement all of Google, I'm actually going to submit this form to the real Google and leverage their back end. So here I have form action equals quote unquote this, all right, so now close bracket. Notice that the close form tag is down here. So notice inside of this form is a bunch of input elements, and this is where we have these basic user interface mechanisms. We have an input tag, with a name I'm going to give of q, q just meaning query, and type="text", which means it's going to be a text box and not a button or check box or something like that. Then I have a line break which just says, put the next thing on the next line, and what's the next thing? There's another input type that I found in the manual called type="submit", and it turns out you can label this submit input, which happens to be rendered as a button, with a value, namely, Fake Google search. So the end result is that I have a form not unlike Google version 1.0. I can type in something like hamster here, click Fake Google search, and let's actually see what happens. Well, let me pull up Firebug, just so I can sniff my own network traffic and see what's going on. I'm going to go ahead and click on Fake Google search, and notice what happens up top to the URL, because of that action attribute. Fake Google search. Voilà, I have implemented Google, right, I have my own front end now to Google, but notice how this worked. What's interesting is that I indeed ended up at google.com, no surprise there. But notice what the URL looks like, http://www.google.com/search, that was in my action attribute, ?q=hamster.
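Pieced together from the description above, the Fake Google page might look roughly like this sketch. The doctype line and the indentation are assumptions (the lecture's own file may use a different doctype); the tags, attributes, and values are the ones read aloud.

```html
<!DOCTYPE html>    <!-- assumed; the lecture's file may declare a different doctype -->
<html>
  <head>
    <title>Fake Google</title>
  </head>
  <body>
    <!-- text-align: center centers everything in the div, including the button -->
    <div style="text-align: center">
      <h1>Fake Google</h1>
      <!-- action is the URL of the program that receives the form: the real Google -->
      <form action="http://www.google.com/search">
        <input name="q" type="text">    <!-- text box whose value is sent as q -->
        <br>
        <input type="submit" value="Fake Google search">    <!-- rendered as a button -->
      </form>
    </div>
  </body>
</html>
```

Because the form tag names no method, the browser uses its default, GET, which is why the typed text shows up in the URL as ?q=hamster.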
So it turns out, when we've been using this special keyword GET, GET /, well, you don't have to just GET /, you can GET /search, and in fact, if you add a question mark to almost any URL, you can then pass in an ampersand-separated list of what are called parameters. So a parameter is again just another word for an argument, but in the context of the web, a question mark means to the browser or to the server, here come some parameters, here come some arguments. What's the first argument? q. What's its value? Well, it equals hamster. And so this is the way that a web browser, which is just a fairly dumb program sitting somewhere on the Internet on my little old computer, passes input into a web server. Because web servers, meanwhile, are designed to look at the incoming requests and they see, oh, GET /search?q=hamster. So the program that Google wrote, which is probably written in Python or a similar language, sees, oh, here's a parameter called q, q is hamster, let me search my database for this keyword hamster, find a big list, a linked list or an array of all of the web pages that match, and then you know what I want to do, I want to actually return to the user a web page, little white lie I'm about to tell, a web page that contains a whole bunch of HTML with which to render that there web page. Now it's a little white lie because Google, as you might have noticed, now has instant search and all these fancy features, where you actually don't download the HTML all at once when you first visit the page; rather, they get it using a technology called Ajax, whereby it gets the data behind the scenes and then immediately integrates it into the website. But if I do this old school, and turn off instant search and actually now re-search for hamster and reload my page's source code, you'll notice that, oh, that's kind of a long thing. Let's go and put this in to here.
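To make the ampersand-separated list concrete, here is a small sketch of a form that sends two parameters. The /search path, the lang field, and its value are all hypothetical; only q comes from the lecture.

```html
<!-- Hypothetical endpoint: /search on whichever server hosts this page. -->
<form action="/search" method="get">
  <input name="q" type="text" value="hamster">    <!-- first parameter: q -->
  <input name="lang" type="hidden" value="en">    <!-- second parameter, not shown to the user -->
  <input type="submit" value="Search">    <!-- no name attribute, so it sends nothing -->
</form>
<!-- Submitting the form unchanged requests: /search?q=hamster&lang=en -->
```

Each named input becomes one name=value pair, the pairs are joined with ampersands, and the whole list follows the question mark.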
This is the code, which is a mingling of HTML and JavaScript, that I got back from Google's website when I asked for all web pages related to hamsters. It's cryptic. In fact, for someone like Google, when you get a billion hits per day, if every hit, this is actually a neat curiosity. With someone like Google, if a programmer accidentally, or just to be nice and pretty, hits the space bar just to indent something, that's one additional character, one additional byte, magnified by a billion visits per day, that's 1 gigabyte worth of downloads that Google now has to incur and then pay for. And so these days, super fancy websites like Google and Facebook, which have a ridiculous amount of traffic, for which every little penny and every little bit counts, do what's called minifying their source code. So it's a little harder to learn from these websites sometimes unless you use a tool like Firebug. If I go to Firebug and view this same web page down here, notice one of the powers of Firebug. No matter how big a mess someone's web page is, notice what Firebug does is it cleans it up for you. So this is in fact how Google is implemented, and as promised, notice that they make ample use of these things called div tags. It looks like they have a few other attributes like id and class, but we've really just scratched the surface. What we'll tease you with today is this last thing here. Literally, the very first thing I learned how to do with regard to web programming was something that looked like this. Back in the day, you would walk across the yard to Wigglesworth and hand in a piece of paper into a proctor's drop box if you wanted to register for a freshman intramural sport. In 1996 we put the Freshman Intramural Sports program onto the Internet and it looked a little something like this. Literally. Although, I think I had like big pictures of soccer balls and volleyballs and an image in the background just to make it cool. It wasn't so cool.
But I'll find that image for Wednesday. But using these same building blocks, input type=text, input type=submit, you can actually then take input from the user, let them hit submit, get an email to yourself, get their name added to the database, all the sorts of things you might want to do for final projects. Some more on that on Wednesday.
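As a teaser only, a registration form built from those same blocks might look like the sketch below. Everything in it is hypothetical: the /register endpoint, the field names, and the choice of sports are stand-ins, not the actual 1996 form.

```html
<!-- /register is a hypothetical program on the server that would handle the submission. -->
<form action="/register" method="get">
  Name: <input name="name" type="text">    <!-- text box -->
  <br>
  <input name="captain" type="checkbox"> Captain?    <!-- check box -->
  <br>
  Sport:
  <select name="sport">    <!-- drop-down menu -->
    <option value="soccer">Soccer</option>
    <option value="volleyball">Volleyball</option>
  </select>
  <br>
  <input type="submit" value="Register">
</form>
```

The HTML only collects the input; sending an email or adding a name to a database would be the job of the program behind the action URL.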