= Week 7
:author: Anna Whitney
:v: GUtPQIDSwrA

[t=0m0s]
== Introduction

* In Problem Set 4, you were challenged to find as many as possible of the staff from the photos you recovered and take selfies with them. Ken, a photographer on the CS50 staff, got selfies with 24 staff members! Among actual students, the winners were Lance, with 15 photos, and Bonnie, with 14 photos, some of which contained multiple staff. Their sections will get a catered lunch.
* *CS50 Lunch* on Friday, as usual; RSVP at http://cs50.harvard.edu/rsvp[cs50.harvard.edu/rsvp].
* *CS50 Seminars* are coming up!
** We're getting to the point of the semester where you should start to think about final project ideas.
** To this end, *pre-proposals* are due soon, in which you'll give your TF some ideas of what you're thinking about for a final project.
** Many students end up doing web projects, and since we're just now getting into web programming, don't worry if you don't yet know exactly how you'd implement your ideas (your TF will help you figure out what's feasible).
** CS50 Seminars give you a chance to explore tools and languages that aren't covered in the course itself, many of which might be of use on a final project.
** To see the list of seminars and register for any of them, go to http://cs50.harvard.edu/register[cs50.harvard.edu/register].

[t=4m11s]
== HTTP

* We left off last time with `GET` requests, the messages that are actually passed in the "envelope", so to speak, from computer to computer.
** A `GET` request looks something like this:
+
[source]
----
GET / HTTP/1.1
Host: www.google.com
...
----
*** This first line, `GET / HTTP/1.1`, means that we're requesting the main page of the website - the index, or default webpage - using the `HTTP/1.1` protocol.
** The server would respond with something like the following:
+
[source]
----
HTTP/1.1 200 OK
Content-Type: text/html
...
----
*** We rarely see the status code `200 OK` as humans, because it means everything went fine, so the browser just shows us the webpage content.
*** The most common content type is `text/html`, which we'll discuss in more detail now.

[t=5m35s]
== HTML

* In the coming weeks, you can use any browser, but we recommend Chrome because it has lots of useful tools for developers.
* In Chrome, you can right-click/Ctrl-click on a webpage and select "inspect element" from the menu that appears to open Chrome's Inspector, a debugging tool that lets you see under the hood of what's happening on a webpage.
** The Inspector has lots of features, but let's look at the *Network* tab.
*** We press the recording button so it's red, and check "preserve log", so that everything that happens with the network will be recorded. Then we go to www.facebook.com:
+
image::network.png[alt="Network tab in Chrome inspector", width=800]
*** When you visit facebook.com, you don't send just one `GET` request; you actually send many `GET` requests to load different parts of the page.
*** The one we care about, though, is the request for the main Facebook page itself. Let's look at the actual header that Chrome sent to www.facebook.com:
+
image::request.png[alt="Request header sent to Facebook"]
*** Although this isn't formatted quite the way we wrote it before, we can see that it's the same kind of request, with the path `/` again representing the main page of the website, and `HTTP/1.1` being the protocol used.
*** The Inspector also lets us look at the response headers that Facebook returned to our browser.
*** These headers are composed of many key-value pairs (including the one we're familiar with from before: `Content-Type: text/html`).
* We can also right-click on the page and select "View Page Source" to see the HTML source of the webpage (this is not limited to Chrome - Firefox, IE, etc. all let you do this as well, although the menu options might look a little different).
** The HTML that we see this way is sent in the response from Facebook, coming after those headers we saw in the Inspector.
* Recall our simple webpage from last lecture, with this HTML source:
+
[source, html, numbered]
----
<!DOCTYPE html>

<html>
    <head>
        <title>hello, world</title>
    </head>
    <body>
        hello, world
    </body>
</html>
----
* Our code looks much prettier than the Facebook source code - we use indentation to make our HTML more readable, but the Facebook source is all strung together without whitespace in between. This is because whitespace takes up space! Each whitespace character costs a byte, so the file without the whitespace is smaller than the file with it.
** If just one extra space character were included in the HTML for Facebook's homepage, and they have a billion users who each load the homepage once, that's an extra _gigabyte_ of data that needs to be sent from Facebook's servers!
** For this reason, it's common in web development to *minify* your code, or remove any superfluous characters to make the file sizes as small as possible.
** As we're learning web programming, we'll start out by writing our code in a more human-readable fashion. (As an aside, nobody writes code by hand that looks like what we saw in the Facebook source - you write code that's human-readable, and once you've got it working, you run it through a program called a minifier that turns it into the uglier, but smaller, version that actually gets sent across the network.)
* If we look in the Elements tab of the Chrome Inspector, we can see the *pretty-printed* version of the HTML source - Chrome has conveniently de-obfuscated it for us.
** We can start to see the hierarchy of the webpage and click through it in this view.
* So as we did last time, let's type our simple HTML page into a text editor and save it as `hello.html`:
+
[source, html, numbered]
----
<!DOCTYPE html>

<html>
    <head>
        <title>hello, world</title>
    </head>
    <body>
        hello, world
    </body>
</html>
----
** Then we can double-click this file and visit this webpage in Chrome.
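
To make the whitespace argument above concrete, here is a minimal Python sketch (not from lecture). The sample page assumes the usual `hello, world` markup, and `naive_minify` and `total_extra_bytes` are hypothetical helpers written only for illustration; real minifiers are much more careful about where whitespace is significant.

```python
import re

# Assumed sample page: a human-readable "hello, world" document.
PRETTY = """<!DOCTYPE html>

<html>
    <head>
        <title>hello, world</title>
    </head>
    <body>
        hello, world
    </body>
</html>
"""


def naive_minify(html: str) -> str:
    """Hypothetical, deliberately simple minifier (illustration only)."""
    # Drop whitespace that sits purely between two tags.
    html = re.sub(r">\s+<", "><", html.strip())
    # Collapse every remaining run of whitespace to a single space.
    return re.sub(r"\s+", " ", html)


def total_extra_bytes(extra_bytes_per_page: int, page_loads: int) -> int:
    """Total bytes sent if every page load carries some extra bytes."""
    return extra_bytes_per_page * page_loads


if __name__ == "__main__":
    minified = naive_minify(PRETTY)
    saved = len(PRETTY.encode("utf-8")) - len(minified.encode("utf-8"))
    print(minified)
    print(f"bytes saved per page load: {saved}")
    # One extra byte on each of a billion page loads is 10**9 bytes (a gigabyte).
    print(total_extra_bytes(1, 1_000_000_000))
```

Running the script prints the one-line version of the page, the number of bytes saved on this tiny file, and the billion-byte total from the Facebook example. The savings per page are small here, but they are multiplied by every single page load.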

[t=13m32s]
== Web Servers

* The problem with this webpage we've just created is that we're the only ones who can see it, because it's just a file on David's Mac. It's not accessible outside of his computer (or your computer, if you follow the above steps on your own computer).
* Let's look at Cloud9, the service that hosts the CS50 IDE. It has all of our workspaces hosted somewhere on the Internet, so any files we create in the IDE already live on the Internet rather than on our own computer.
* If we copy our same HTML document above into a new file in the CS50 IDE and save it as `hello.html`, how can we open it as a webpage?
* Built into the IDE, in addition to the debugger and all the other tools available to you, is a full-fledged *web server*.
** A web server is just a program whose job it is to serve up web pages - to listen for requests from other computers and respond with the virtual envelopes containing the headers and page content.
** Our IDE's web server is an https://en.wikipedia.org/wiki/Open-source_software[open-source] server called *Apache*, for which we've written a more usable interface called `apache50`.
* In the IDE's terminal, we can type the following:
+
[source]
----
jharvard@ide50:~/workspace $ apache50 start .
Setting Apache's document root to /home/ubuntu/workspace ...
 * Starting web server apache2
 *
Apache started successfully!
Your site is now available at https://ide50-jharvard.c9.io
jharvard@ide50:~/workspace $
----
* Our web server is now listening on the web at the address `ide50-jharvard.c9.io` (your address will be a little different depending on your username), on TCP port 80.
** You can see what this address will be in the upper-right corner of your IDE.
* If we click on the URL in question, we see a pretty ugly index - just a directory listing. But we can see `hello.html` saved in this directory, and if we click on it, we can see the same "hello, world" page as before.
** Note that we're looking at this webpage inside the IDE not because it's only accessible there, as when `hello.html` was a file on David's computer (our web server makes it visible to the whole world), but because, in addition to the web server, the IDE also provides a simple web browser.
** We can also copy-paste this URL into the regular Chrome browser bar, and we'll see the exact same thing! (You could even have a friend on an entirely different computer go to your IDE's address while you're running your web server, and they'll be able to see the same thing.)
* For the purposes of the course, you have your own unique address, a *subdomain* of the course's overarching domain, cs50.io (AKA ide50.c9.io), represented as `ide50-username.c9.io`.

[t=18m20s]
== Working with HTML

* Let's look at a slightly more complicated HTML file, http://cdn.cs50.net/2015/fall/lectures/7/m/src7m/paragraphs.html[`paragraphs.html`] (note that if you click on that link, you'll likely have to view source to see the actual HTML code, since your browser knows how to interpret HTML and so does it for you rather than showing you the raw source):
+
[source, html, numbered]
----
<!DOCTYPE html>

<!-- ... -->

<html>
    <head>
        <title>paragraphs</title>
    </head>
    <body>
        <p>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam in tincidunt augue. Duis imperdiet, justo ac iaculis rhoncus, erat elit dignissim mi, eu interdum velit sapien nec risus. Praesent ullamcorper nibh at volutpat aliquam. Nam sed aliquam risus. Nulla rutrum nunc augue, in varius lacus commodo in. Ut tincidunt nisi a convallis consequat. Fusce sed pulvinar nulla.
        </p>
        <p>
            Ut tempus rutrum arcu eget condimentum. Morbi elit ipsum, gravida faucibus sodales quis, varius at mi. Suspendisse id viverra lectus. Etiam dignissim interdum felis quis faucibus. Integer et vestibulum eros, non malesuada felis. Pellentesque porttitor eleifend laoreet. Duis sit amet pellentesque nisi. Aenean ligula mauris, volutpat sed luctus in, consectetur id turpis. Phasellus mattis dui ac metus blandit volutpat. Donec lorem arcu, sollicitudin in risus a, imperdiet condimentum augue. Ut at facilisis mauris. Curabitur sagittis augue in dictum gravida. Integer sed sem sed justo tempus ultrices eu non magna. Phasellus semper eros erat, a posuere nisi auctor et. Praesent dignissim orci aliquam laoreet scelerisque.
        </p>
        <p>
            Mauris eget erat arcu. Maecenas ac ante vel ipsum bibendum varius. Nunc tristique nulla eget tincidunt molestie. Morbi sed mauris eu lectus vehicula iaculis ac id lacus. Etiam sit amet magna massa. In pulvinar sapien ac mi ultrices, quis consequat nisl hendrerit. Aliquam pharetra nec sem non vehicula. In et risus leo. Ut tristique ornare nisl et lacinia.
        </p>
    </body>
</html>
----
** First, notice the comment near the top of the file (lines 3-12 of the linked source, abbreviated to `...` here). Just as in our C code, we can write comments, but the format is a little different: comments are opened with `pass:c[<!--]` and closed with `pass:c[-->]`, instead of being opened with `/pass:[*]` and closed with `pass:[*]/`.
** We've introduced a new tag - `<p>`, or paragraph.
* If we restart our web server using today's `src7m` directory as the root directory, we'll be able to see this and all the other source files:
+
[source]
----
jharvard@ide50:~/workspace $ apache50 stop
 * Stopping web server apache2
 *
jharvard@ide50:~/workspace $ apache50 start src7m/
Setting Apache's document root to /home/ubuntu/workspace/src7m ...
 * Starting web server apache2
 *
Apache started successfully!
Your site is now available at https://ide50-jharvard.c9.io
jharvard@ide50:~/workspace $
----
* Now when we go to `ide50-jharvard.c9.io`, we see a much longer list of files (the files in the `src7m` directory), among them `paragraphs.html`.
* If we click on `paragraphs.html`, we can see that those `<p>` tags make line breaks between our pseudo-Latin paragraphs.
** If we replace the `<p>` tags with actual line breaks in our HTML source, the paragraphs all run together, because the browser just ignores extra whitespace - it only does exactly what it's told to do.
* HTML syntax seems to consist of opening a tag (e.g., `<p>`) and later closing it (e.g., `</p>`). Start tags just consist of a tag name in angle brackets, and end tags are similar, but the tag name is prefixed with a forward slash.
* What if we want more than just plain, 12pt Times New Roman text? We can make our text bigger and bold, as in http://cdn.cs50.net/2015/fall/lectures/7/m/src7m/headings.html[`headings.html`]:
+
[source, html]
----
<!DOCTYPE html>

<html>
    <head>
        <title>headings</title>
    </head>
    <body>
        <h1>One</h1>
        <h2>Two</h2>
        <h3>Three</h3>
        <h4>Four</h4>
        <h5>Five</h5>
        <h6>Six</h6>
    </body>
</html>
----
** We're not directly bolding our text or increasing the size of our text, but we're using *heading* tags, `<h1>` through `<h6>`, and our browser knows that each heading should be bold and a slightly different size.
* We can also make lists, as in http://cdn.cs50.net/2015/fall/lectures/7/m/src7m/list.html[`list.html`]:
+
[source, html]
----
list
----
** This displays a bulleted list of houses on the Quad. The `