1 00:00:00,000 --> 00:00:02,405 >> [MUSIC PLAYING] 2 00:00:02,405 --> 00:00:10,439 3 00:00:10,439 --> 00:00:11,980 DAVID MALAN: All right, this is CS50. 4 00:00:11,980 --> 00:00:13,740 This is the end of week eight. 5 00:00:13,740 --> 00:00:15,887 And today, we start to fill in some pieces 6 00:00:15,887 --> 00:00:17,720 when it comes to building things on the web. 7 00:00:17,720 --> 00:00:20,020 So, recall that on Monday we spend a lot more time 8 00:00:20,020 --> 00:00:22,530 on PHP, which is this dynamic programming language that 9 00:00:22,530 --> 00:00:26,872 lets us output, among other things, HTML and other such content 10 00:00:26,872 --> 00:00:27,830 that we'll want to see. 11 00:00:27,830 --> 00:00:30,871 But we haven't really looked at how we're going to store any information. 12 00:00:30,871 --> 00:00:34,477 Indeed, almost any of that the super interesting websites you visit today 13 00:00:34,477 --> 00:00:36,560 have some kind of database on the back end, right? 14 00:00:36,560 --> 00:00:39,540 Facebook certainly stores lots of data about all of us and Gmail stores all 15 00:00:39,540 --> 00:00:40,210 of your emails. 16 00:00:40,210 --> 00:00:44,150 >> And so, many other sites aren't just static content that's informational. 17 00:00:44,150 --> 00:00:45,640 It's actually dynamic in some way. 18 00:00:45,640 --> 00:00:48,480 You provide input, it updates the pages for other people. 19 00:00:48,480 --> 00:00:50,620 You get messages, you send messages, and so forth. 20 00:00:50,620 --> 00:00:54,250 So today, we look closer at the underpinnings of a project 21 00:00:54,250 --> 00:00:57,330 that you'll dive into next week, CS50 Finance, which 22 00:00:57,330 --> 00:01:00,509 is actually going to have you build something not in C, but in PHP. 23 00:01:00,509 --> 00:01:02,550 A website that looks a little something like this 24 00:01:02,550 --> 00:01:05,810 that allows to buy and sell stocks that are actually 25 00:01:05,810 --> 00:01:09,044 going to draw upon real time stock data from Yahoo Finance. 26 00:01:09,044 --> 00:01:11,960 And so ultimately, you'll have the illusion for yourself and for users 27 00:01:11,960 --> 00:01:14,550 that you're actually buying and selling stocks and getting nearly real time 28 00:01:14,550 --> 00:01:16,800 updates, managing a portfolio, all of which 29 00:01:16,800 --> 00:01:20,310 is going to require having, ultimately, a database of users. 30 00:01:20,310 --> 00:01:23,330 >> So, in your own words, especially if you're not 31 00:01:23,330 --> 00:01:25,670 super familiar with computer science or databases, what 32 00:01:25,670 --> 00:01:30,790 do you know a database to be right now, in nontechnical terms? 33 00:01:30,790 --> 00:01:32,300 What is it? 34 00:01:32,300 --> 00:01:36,882 How would you describe it to a roommate or a friend? 35 00:01:36,882 --> 00:01:40,100 >> AUDIENCE: [INAUDIBLE] information [INAUDIBLE] 36 00:01:40,100 --> 00:01:44,430 >> DAVID MALAN: So, a list of information, or a store-- a list of information 37 00:01:44,430 --> 00:01:47,160 that you might want to store about something, like a user. 38 00:01:47,160 --> 00:01:50,190 And what do users have associated with them? 39 00:01:50,190 --> 00:01:53,160 If you're a user on Facebook or Gmail, what are the characteristics 40 00:01:53,160 --> 00:01:54,940 that all of us users have? 41 00:01:54,940 --> 00:01:58,530 Like, what might be some of the columns in the spreadsheet to which we 42 00:01:58,530 --> 00:01:59,390 alluded last time? 43 00:01:59,390 --> 00:02:01,140 Because again, you can think of a database 44 00:02:01,140 --> 00:02:05,810 really as a fancy Excel file or Google Spreadsheet or Apple Numbers file. 45 00:02:05,810 --> 00:02:08,280 >> So, what do you think of when you think of a user? 46 00:02:08,280 --> 00:02:11,290 What do they have? 47 00:02:11,290 --> 00:02:11,790 What's that? 48 00:02:11,790 --> 00:02:12,470 >> AUDIENCE: A name. 49 00:02:12,470 --> 00:02:13,303 >> DAVID MALAN: A name. 50 00:02:13,303 --> 00:02:16,840 So if name, like, David Malan would be the name of some user. 51 00:02:16,840 --> 00:02:17,980 What else does a user have? 52 00:02:17,980 --> 00:02:18,770 >> AUDIENCE: An ID. 53 00:02:18,770 --> 00:02:19,561 >> DAVID MALAN: An ID. 54 00:02:19,561 --> 00:02:23,320 So, like an ID number, like your Harvard ID or your Yale Net ID or the like. 55 00:02:23,320 --> 00:02:24,923 What else might a user have? 56 00:02:24,923 --> 00:02:25,890 >> AUDIENCE: Password. 57 00:02:25,890 --> 00:02:29,240 >> DAVID MALAN: A password, maybe an address, maybe a phone number, maybe 58 00:02:29,240 --> 00:02:30,050 an email address. 59 00:02:30,050 --> 00:02:32,640 So, there's bunches of fields and this could sort of spiral out of control 60 00:02:32,640 --> 00:02:34,760 quickly as soon as you start realizing, oh, let's store this 61 00:02:34,760 --> 00:02:36,190 and let's store this and that. 62 00:02:36,190 --> 00:02:37,657 >> But how do we actually do that? 63 00:02:37,657 --> 00:02:39,740 So again, the mental model to have for today as we 64 00:02:39,740 --> 00:02:42,320 dive into actual SQL, Structured Query Language, 65 00:02:42,320 --> 00:02:44,186 is a database that looks like this. 66 00:02:44,186 --> 00:02:45,310 It's just rows and columns. 67 00:02:45,310 --> 00:02:48,309 And you can imagine Google Spreadsheets or any number of other programs. 68 00:02:48,309 --> 00:02:52,130 But what's key about MySQL, which is the database software we're going to use, 69 00:02:52,130 --> 00:02:54,920 the freely openly available-- Facebook uses 70 00:02:54,920 --> 00:02:59,200 it and any number of other websites-- database stores things relationally. 71 00:02:59,200 --> 00:03:01,770 And a relational database just means one that literally 72 00:03:01,770 --> 00:03:03,672 stores its data in rows and columns. 73 00:03:03,672 --> 00:03:04,630 It's as simple as that. 74 00:03:04,630 --> 00:03:07,230 >> So, even something like Oracle that you might have generally heard of 75 00:03:07,230 --> 00:03:08,271 is a relational database. 76 00:03:08,271 --> 00:03:10,929 And underneath the hood, it stores data in rows and columns. 77 00:03:10,929 --> 00:03:12,970 And Oracle charges you a lot of money to do that, 78 00:03:12,970 --> 00:03:15,550 whereas MySQL charges you nothing for the same. 79 00:03:15,550 --> 00:03:18,790 So, SQL is going to give us at least four operations. 80 00:03:18,790 --> 00:03:23,190 The ability to select data, like read data, insert, delete, and update data. 81 00:03:23,190 --> 00:03:25,525 In other words, those are really the four key operations 82 00:03:25,525 --> 00:03:28,950 that are going to allow us to change stuff in those rows and columns. 83 00:03:28,950 --> 00:03:33,250 >> The tool that we'll use today especially to learn SQL and to play with it 84 00:03:33,250 --> 00:03:34,627 is again called PHP MyAdmin. 85 00:03:34,627 --> 00:03:35,460 It's web based tool. 86 00:03:35,460 --> 00:03:38,200 Total coincidence that it's written in PHP. 87 00:03:38,200 --> 00:03:42,400 But it's going to give us a graphical user interface so that we can actually 88 00:03:42,400 --> 00:03:46,054 create these rows and columns and then talk to them via code. 89 00:03:46,054 --> 00:03:47,970 So, let's now start to what I think is frankly 90 00:03:47,970 --> 00:03:51,000 kind of the fun process of building the back end of websites, 91 00:03:51,000 --> 00:03:54,580 the parts that users don't see, but surely do care about, 92 00:03:54,580 --> 00:03:56,170 because that's rather data's going. 93 00:03:56,170 --> 00:03:59,570 So, similar to C and a little less like PHP, 94 00:03:59,570 --> 00:04:04,954 SQL, or a database that supports SQL, has at least these data types 95 00:04:04,954 --> 00:04:05,870 and bunches of others. 96 00:04:05,870 --> 00:04:08,107 CHAR, VARCHAR, INT, BIGINT, DECIMAL, and DATETIME. 97 00:04:08,107 --> 00:04:09,940 And there's a whole bunch of other features, 98 00:04:09,940 --> 00:04:11,940 but let's do this by way of actual example. 99 00:04:11,940 --> 00:04:16,450 >> I'm going to go into CS50 IDE where, in advance, I've logged in 100 00:04:16,450 --> 00:04:19,372 and I've also visited a URL for this tool called PHP MyAdmin. 101 00:04:19,372 --> 00:04:22,580 And in problem set seven, we'll tell you exactly how to get to this interface 102 00:04:22,580 --> 00:04:23,200 as well. 103 00:04:23,200 --> 00:04:25,640 On the top left hand corner, notice it says lecture. 104 00:04:25,640 --> 00:04:27,610 And that just means that in advance, I created 105 00:04:27,610 --> 00:04:31,360 an empty database called lecture that has no spreadsheets in it yet. 106 00:04:31,360 --> 00:04:32,600 There's no rows and columns. 107 00:04:32,600 --> 00:04:34,308 Because the first thing we're going to do 108 00:04:34,308 --> 00:04:37,100 is start to create a table that's going to store our users. 109 00:04:37,100 --> 00:04:39,100 >> So, literally over here to the right, I'm 110 00:04:39,100 --> 00:04:42,070 going to tell the database I want a table called Users. 111 00:04:42,070 --> 00:04:44,845 So, this is like the file that I want to store all of my data in. 112 00:04:44,845 --> 00:04:45,720 And how many columns? 113 00:04:45,720 --> 00:04:47,740 Well, let's keep it simple for now. 114 00:04:47,740 --> 00:04:51,855 I just want to store like a username and a name for a user. 115 00:04:51,855 --> 00:04:53,020 We'll start small. 116 00:04:53,020 --> 00:04:55,370 So, I want two columns total. 117 00:04:55,370 --> 00:04:57,360 And I'm going to go ahead and click Go. 118 00:04:57,360 --> 00:04:59,210 And then, for these columns, what I'm going 119 00:04:59,210 --> 00:05:04,576 to do-- if this internet cooperates-- all right, 120 00:05:04,576 --> 00:05:05,950 so we're going to try that again. 121 00:05:05,950 --> 00:05:09,180 I'm going to create a table called Users with two columns, click Go, OK. 122 00:05:09,180 --> 00:05:10,520 Now we've got it really fast. 123 00:05:10,520 --> 00:05:12,065 Thank you, very well done. 124 00:05:12,065 --> 00:05:14,440 All right, so what do we want these columns to be called? 125 00:05:14,440 --> 00:05:16,080 >> So, one's going to be called Username. 126 00:05:16,080 --> 00:05:19,480 So, all I see here-- and the interface frankly gets a little ugly eventually, 127 00:05:19,480 --> 00:05:21,270 once you start typing in all this data. 128 00:05:21,270 --> 00:05:27,450 But what's nice is that sort of paradoxically, I'm creating columns, 129 00:05:27,450 --> 00:05:29,977 but the tool has foolishly laid them out in rows 130 00:05:29,977 --> 00:05:31,560 so that I can configure these columns. 131 00:05:31,560 --> 00:05:33,550 So, there's two blanks there under Name. 132 00:05:33,550 --> 00:05:36,180 And one of these fields I want to called Username, 133 00:05:36,180 --> 00:05:38,000 and the other field I want to call Name. 134 00:05:38,000 --> 00:05:40,340 >> And now I have to choose data types for these things. 135 00:05:40,340 --> 00:05:42,330 So, whereas in Excel and Google Spreadsheets, 136 00:05:42,330 --> 00:05:45,531 if you want a column, you literally just type Name or Username, hit Enter. 137 00:05:45,531 --> 00:05:48,030 Maybe you make it bold face just for clarity, but that's it. 138 00:05:48,030 --> 00:05:50,140 You don't specify the types of the columns. 139 00:05:50,140 --> 00:05:53,790 Now in Google Spreadsheets or Excel, you might specify how the data is rendered. 140 00:05:53,790 --> 00:05:58,120 You could go to the Format menu, and you can specify show this like dollar sign, 141 00:05:58,120 --> 00:05:59,900 show this like a floating point value. 142 00:05:59,900 --> 00:06:01,990 >> So, it's similar in spirit to that what we're about to do, 143 00:06:01,990 --> 00:06:04,740 but this is actually going to force the data to be a certain type. 144 00:06:04,740 --> 00:06:07,750 Now, even though a moment ago I said there's only a few data types, 145 00:06:07,750 --> 00:06:11,120 there's actually a whole lot, and they're 146 00:06:11,120 --> 00:06:12,910 in varying degrees of specificity. 147 00:06:12,910 --> 00:06:14,970 And as an aside, you can even do fancy things 148 00:06:14,970 --> 00:06:17,520 like storage geometries inside of a database. 149 00:06:17,520 --> 00:06:19,250 You can store things like GPS coordinates 150 00:06:19,250 --> 00:06:22,420 and actually find, mathematically, points that are close to others. 151 00:06:22,420 --> 00:06:24,128 But we're going to keep this super simple 152 00:06:24,128 --> 00:06:26,800 and go up to here, all of the so-called string types. 153 00:06:26,800 --> 00:06:29,240 >> So, here's a list of a whole bunch of options. 154 00:06:29,240 --> 00:06:32,740 CHAR, VARCHAR, TINYTEXT, MEDIUMTEXT, LONGTEXT. 155 00:06:32,740 --> 00:06:34,110 And it's kind of overwhelming. 156 00:06:34,110 --> 00:06:37,610 And unfortunately, somewhat paradoxically to C, 157 00:06:37,610 --> 00:06:40,120 a CHAR is not really a CHAR. 158 00:06:40,120 --> 00:06:44,170 If you specify in a database that your data type is CHAR, 159 00:06:44,170 --> 00:06:47,390 that means that yes, it's a CHAR, but it's one or more CHARs. 160 00:06:47,390 --> 00:06:49,630 And you have to specify how many CHARs you want. 161 00:06:49,630 --> 00:06:51,636 So, what's a typical length for a username? 162 00:06:51,636 --> 00:06:52,760 Is there a limit typically? 163 00:06:52,760 --> 00:06:53,920 >> AUDIENCE: [INAUDIBLE] 164 00:06:53,920 --> 00:06:55,050 >> DAVID MALAN: 16 maybe? 165 00:06:55,050 --> 00:06:55,990 Something like that. 166 00:06:55,990 --> 00:06:57,948 You know, back in the day, it used to be eight. 167 00:06:57,948 --> 00:07:00,289 Sometimes it's 16, sometimes it's even more than that. 168 00:07:00,289 --> 00:07:02,080 And so, this doesn't mean give me one CHAR. 169 00:07:02,080 --> 00:07:04,730 This means I have to specify the length of the field, 170 00:07:04,730 --> 00:07:07,402 and now I might say something like 16. 171 00:07:07,402 --> 00:07:08,610 And there's a trade off here. 172 00:07:08,610 --> 00:07:11,360 So, we'll see in a moment that this means one, 173 00:07:11,360 --> 00:07:14,620 every username has to be 16 characters. 174 00:07:14,620 --> 00:07:18,720 But wait a minute, M-A-L-A-N. If that's my username and I'm only using five, 175 00:07:18,720 --> 00:07:23,070 what would you propose that the database to do for the other 11 characters that 176 00:07:23,070 --> 00:07:24,471 I've reserved space for? 177 00:07:24,471 --> 00:07:25,220 What would you do? 178 00:07:25,220 --> 00:07:26,480 >> AUDIENCE: [INAUDIBLE] 179 00:07:26,480 --> 00:07:27,160 >> DAVID MALAN: Yeah, just make them all null. 180 00:07:27,160 --> 00:07:28,290 Make them spaces. 181 00:07:28,290 --> 00:07:30,816 But probably null, so a lot of backslash zeros. 182 00:07:30,816 --> 00:07:33,190 So, on the one hand, we've now made sure that my username 183 00:07:33,190 --> 00:07:34,780 can be no more than 16 characters. 184 00:07:34,780 --> 00:07:37,590 And the flip side of that is that if I had a really long name 185 00:07:37,590 --> 00:07:39,940 or wanted a really long username like some of you 186 00:07:39,940 --> 00:07:44,840 guys might have in that college or at Yale.edu, you can't have one. 187 00:07:44,840 --> 00:07:47,177 And so in fact, if you've ever registered for a website 188 00:07:47,177 --> 00:07:49,385 and you get yelled at saying your password's too long 189 00:07:49,385 --> 00:07:52,710 or your username's too long, it's simply because a programmer, when 190 00:07:52,710 --> 00:07:55,500 configuring his or her database, decided that this field will 191 00:07:55,500 --> 00:07:57,150 be no longer than this length. 192 00:07:57,150 --> 00:08:00,580 >> All right, so what if we proceed to name? 193 00:08:00,580 --> 00:08:05,240 How long should a typical human's name be? 194 00:08:05,240 --> 00:08:07,492 How many characters, 16? 195 00:08:07,492 --> 00:08:09,450 I'm guessing we could find someone in this room 196 00:08:09,450 --> 00:08:13,210 where by his or her first plus last name is longer than 16 characters. 197 00:08:13,210 --> 00:08:14,850 So, what's better than that, 17? 198 00:08:14,850 --> 00:08:17,040 18? 199 00:08:17,040 --> 00:08:18,830 25? 200 00:08:18,830 --> 00:08:20,350 Bigger? 201 00:08:20,350 --> 00:08:20,980 30? 202 00:08:20,980 --> 00:08:21,855 >> AUDIENCE: [INAUDIBLE] 203 00:08:21,855 --> 00:08:23,700 DAVID MALAN: 5,000, oh my God. 204 00:08:23,700 --> 00:08:26,309 So, that's probably a decent upper bound, shall we say. 205 00:08:26,309 --> 00:08:28,350 And here we kind of have to make a judgment call. 206 00:08:28,350 --> 00:08:30,400 Like, there's no right answer here. 207 00:08:30,400 --> 00:08:32,740 Infinite is not quite possible, because we're eventually 208 00:08:32,740 --> 00:08:34,781 going to have-- we're going to run out of memory. 209 00:08:34,781 --> 00:08:36,909 So, we have to make a judgment call at some point. 210 00:08:36,909 --> 00:08:41,010 >> Very common would be, for instance, to use-- and let me specify CHAR here 211 00:08:41,010 --> 00:08:46,050 as before-- 255 was literally the upper limit on this database software 212 00:08:46,050 --> 00:08:46,700 years ago. 213 00:08:46,700 --> 00:08:48,575 And so, a lot of humans would just say, fine. 214 00:08:48,575 --> 00:08:49,420 255's the limit. 215 00:08:49,420 --> 00:08:50,620 Let's just use the maximum. 216 00:08:50,620 --> 00:08:51,870 And this is pretty ridiculous. 217 00:08:51,870 --> 00:08:55,060 Like, if you're typing someone's name for 200 plus characters, 218 00:08:55,060 --> 00:08:56,140 that a little ridiculous. 219 00:08:56,140 --> 00:08:59,624 >> But, remember that ASCII is not the only system for characters. 220 00:08:59,624 --> 00:09:01,540 And so, especially in a lot of Asian languages 221 00:09:01,540 --> 00:09:04,248 where there's characters we can't express on keyboards like my US 222 00:09:04,248 --> 00:09:08,209 keyboard, some characters actually take up 16 bits instead of eight bits. 223 00:09:08,209 --> 00:09:10,250 And so, this actually isn't all that unreasonable 224 00:09:10,250 --> 00:09:12,250 that we need more space if we want to fit 225 00:09:12,250 --> 00:09:16,252 larger characters than the very US centric ones we've tended to discuss. 226 00:09:16,252 --> 00:09:17,460 So, we need some upper bound. 227 00:09:17,460 --> 00:09:21,470 I don't know what the best one is, but 255 is generally a common one. 228 00:09:21,470 --> 00:09:22,700 25 feels low. 229 00:09:22,700 --> 00:09:23,857 16, 32 feel low. 230 00:09:23,857 --> 00:09:25,690 I would err on the side of something higher. 231 00:09:25,690 --> 00:09:27,330 But there's a trade off, as always. 232 00:09:27,330 --> 00:09:31,902 What is the, perhaps, obvious trade off of reserving 255 CHARs 233 00:09:31,902 --> 00:09:33,360 for everyone's name in my database? 234 00:09:33,360 --> 00:09:34,230 >> AUDIENCE: [INAUDIBLE] 235 00:09:34,230 --> 00:09:34,510 >> DAVID MALAN: What's that? 236 00:09:34,510 --> 00:09:35,430 >> AUDIENCE: [INAUDIBLE] 237 00:09:35,430 --> 00:09:37,138 >> DAVID MALAN: It's a lot of memory, right? 238 00:09:37,138 --> 00:09:42,280 M-A-L-A-N. I have just wasted 250 characters just to store my name 239 00:09:42,280 --> 00:09:46,000 defensively, just in case someone in the class has a really long name. 240 00:09:46,000 --> 00:09:47,940 That seems like an undue tradeoff. 241 00:09:47,940 --> 00:09:52,040 >> So, it turns out that SQL, this database language, 242 00:09:52,040 --> 00:09:55,480 actually supports something called VARCHAR, or Variable CHAR. 243 00:09:55,480 --> 00:09:59,390 And this is kind of nice in that this allows you to specify not a fixed 244 00:09:59,390 --> 00:10:01,900 width, but rather, a variable width. 245 00:10:01,900 --> 00:10:05,060 And more specifically, a maximum width of the field. 246 00:10:05,060 --> 00:10:08,901 So, this means that a name can be no more than 250 characters, 247 00:10:08,901 --> 00:10:10,150 but it can certainly be fewer. 248 00:10:10,150 --> 00:10:11,733 And the database is going to be smart. 249 00:10:11,733 --> 00:10:14,860 If you do put in M-A-L-A-N, it's only going to use five, 250 00:10:14,860 --> 00:10:18,120 maybe six bytes for like a trailing null character, 251 00:10:18,120 --> 00:10:23,330 and not spend an additional 249 or 250 bytes unnecessarily. 252 00:10:23,330 --> 00:10:27,380 >> So, this seems like I should have started with this story. 253 00:10:27,380 --> 00:10:29,140 But there's always a tradeoff. 254 00:10:29,140 --> 00:10:33,024 So, on the one hand, a username I've specified to be hard coded at 16, 255 00:10:33,024 --> 00:10:34,940 and maybe that wasn't the right call, maybe it 256 00:10:34,940 --> 00:10:40,040 is, but why not use VARCHARs for everything? 257 00:10:40,040 --> 00:10:42,020 >> It exists for a reason. 258 00:10:42,020 --> 00:10:46,200 Why not use VARCHARs for every field whose length you don't know in advance 259 00:10:46,200 --> 00:10:48,180 if it seems to be a great thing, right? 260 00:10:48,180 --> 00:10:50,482 Use only as much space as you need up to this limit? 261 00:10:50,482 --> 00:10:51,271 >> AUDIENCE: Slower. 262 00:10:51,271 --> 00:10:52,146 DAVID MALAN: Speller? 263 00:10:52,146 --> 00:10:53,120 AUDIENCE: Makes it slower? 264 00:10:53,120 --> 00:10:53,970 DAVID MALAN: Oh, it's slower. 265 00:10:53,970 --> 00:10:55,720 Good, that's almost always the answer, frankly. 266 00:10:55,720 --> 00:10:56,520 Like, what's the tradeoff? 267 00:10:56,520 --> 00:10:58,570 It either costs more space or it costs more time. 268 00:10:58,570 --> 00:11:00,111 So, in this case, it might be slower. 269 00:11:00,111 --> 00:11:00,920 Why? 270 00:11:00,920 --> 00:11:05,830 >> AUDIENCE: [INAUDIBLE] determining [INAUDIBLE]. 271 00:11:05,830 --> 00:11:06,640 >> DAVID MALAN: Good. 272 00:11:06,640 --> 00:11:09,670 So, you might recall from even PSED5, playing with your approach 273 00:11:09,670 --> 00:11:12,750 to the dictionary, if you have to allocate memory dynamically or keep 274 00:11:12,750 --> 00:11:14,630 growing a buffer, that can actually be slow. 275 00:11:14,630 --> 00:11:16,963 If you have to call malloc underneath the hood and maybe 276 00:11:16,963 --> 00:11:19,610 that's what MySQL is doing, so surely that could be the case. 277 00:11:19,610 --> 00:11:22,430 And if you think way back to PSet-- or even 278 00:11:22,430 --> 00:11:26,340 weeks two, when we did things like binary search or even linear search, 279 00:11:26,340 --> 00:11:30,690 one of the nice things about every word in a database or every word in a column 280 00:11:30,690 --> 00:11:33,690 being exactly the same length, even if a whole bunch of those characters 281 00:11:33,690 --> 00:11:37,390 are blank, is that you can use random access on your data, right? 282 00:11:37,390 --> 00:11:40,310 >> If you know that every word is 16 characters away, 283 00:11:40,310 --> 00:11:46,460 you can use pointer arithmetic, so to speak, and go to us 16, 32, 48, 64, 284 00:11:46,460 --> 00:11:49,589 and you can just jump instantly using arithmetic 285 00:11:49,589 --> 00:11:51,130 to any of the words in your database. 286 00:11:51,130 --> 00:11:54,280 Whereas if it's a VARCHAR, what do you instead have to do? 287 00:11:54,280 --> 00:11:55,960 >> [PHONE RINGING] 288 00:11:55,960 --> 00:11:58,680 >> If it's a VARCHAR, you can't use random access. 289 00:11:58,680 --> 00:12:01,341 What you have to look for or do? 290 00:12:01,341 --> 00:12:01,840 Yeah? 291 00:12:01,840 --> 00:12:03,240 >> AUDIENCE: [INAUDIBLE] 292 00:12:03,240 --> 00:12:04,310 >> DAVID MALAN: Look through the whole-- trace 293 00:12:04,310 --> 00:12:06,518 through the whole list looking for what, most likely? 294 00:12:06,518 --> 00:12:08,356 What kind of special value? 295 00:12:08,356 --> 00:12:09,230 AUDIENCE: [INAUDIBLE] 296 00:12:09,230 --> 00:12:11,105 DAVID MALAN: Looking for the null terminators 297 00:12:11,105 --> 00:12:13,637 that demarcate the separation of words. 298 00:12:13,637 --> 00:12:15,720 So again, a tradeoff, and there's no right answer. 299 00:12:15,720 --> 00:12:18,380 But this is where, especially when your users get to be many 300 00:12:18,380 --> 00:12:21,700 and your load on your servers, the number of people using it gets high, 301 00:12:21,700 --> 00:12:23,650 these are actually nontrivial decisions. 302 00:12:23,650 --> 00:12:26,640 So, we can leave these as this, but let's scroll down over to the right 303 00:12:26,640 --> 00:12:27,332 here. 304 00:12:27,332 --> 00:12:30,290 Now, there's a couple of columns where we have to make a judgment call. 305 00:12:30,290 --> 00:12:35,170 Does it make sense to allow a user's name, a user's username or a user's 306 00:12:35,170 --> 00:12:36,370 name, to be null? 307 00:12:36,370 --> 00:12:37,610 That is, just blank. 308 00:12:37,610 --> 00:12:40,360 Feels a little nonsensical, so I'm not going to check those boxes. 309 00:12:40,360 --> 00:12:42,670 But it turns out in a database, you can say, 310 00:12:42,670 --> 00:12:44,620 someone can optionally have this value. 311 00:12:44,620 --> 00:12:47,180 This column does not have to actually be there. 312 00:12:47,180 --> 00:12:48,570 >> Now, there's this drop down menu. 313 00:12:48,570 --> 00:12:50,810 And notice I'm still in the first row there, 314 00:12:50,810 --> 00:12:52,520 so I'm talking about username now. 315 00:12:52,520 --> 00:12:56,290 And it turns out that a database, unlike a simple mere spreadsheet, 316 00:12:56,290 --> 00:12:58,520 has powerful features called indexes. 317 00:12:58,520 --> 00:13:02,600 And an index is a way of telling the database in advance that I the human 318 00:13:02,600 --> 00:13:03,900 am smarter than you. 319 00:13:03,900 --> 00:13:10,430 >> I know what kinds of queries, select or insert or delete or update, 320 00:13:10,430 --> 00:13:13,182 that my code is going to end up doing on this database. 321 00:13:13,182 --> 00:13:14,390 I want to read a lot of data. 322 00:13:14,390 --> 00:13:15,681 I want to insert a lot of data. 323 00:13:15,681 --> 00:13:17,530 I want to constantly delete a lot of data. 324 00:13:17,530 --> 00:13:21,520 If I know that I'm going to be accessing a field like Username a lot, 325 00:13:21,520 --> 00:13:24,770 I can preemptively tell the database, I know more than you, 326 00:13:24,770 --> 00:13:29,220 and I want to decree that you should index this field. 327 00:13:29,220 --> 00:13:33,200 Where indexing a field or a column means that the database in advance 328 00:13:33,200 --> 00:13:37,040 should borrow some ideas from, like, week four and five and six from CS50 329 00:13:37,040 --> 00:13:39,240 and actually build up something like a binary search 330 00:13:39,240 --> 00:13:41,560 tree or something generally called a B tree 331 00:13:41,560 --> 00:13:43,410 that you would learn in a class like CS124 332 00:13:43,410 --> 00:13:46,710 at Harvard, an algorithms class, or any number of other places. 333 00:13:46,710 --> 00:13:49,570 >> The database and the smart people who implemented it 334 00:13:49,570 --> 00:13:53,880 will figure out how to store that table of information 335 00:13:53,880 --> 00:13:57,061 in memory so that searches and other operations are super fast. 336 00:13:57,061 --> 00:13:58,060 You don't have to do it. 337 00:13:58,060 --> 00:14:00,640 You don't have to implement linear search or binary search 338 00:14:00,640 --> 00:14:03,300 or merge sort or selection sort, any of that. 339 00:14:03,300 --> 00:14:06,590 The database does it for you if you tell it preemptively to index this field. 340 00:14:06,590 --> 00:14:09,100 >> And you can see too, there's some other characteristics 341 00:14:09,100 --> 00:14:11,010 we can tell the database to enforce. 342 00:14:11,010 --> 00:14:16,431 What might it mean if I choose Unique from this menu, just intuitively? 343 00:14:16,431 --> 00:14:16,930 Yeah? 344 00:14:16,930 --> 00:14:17,889 >> AUDIENCE: [INAUDIBLE] 345 00:14:17,889 --> 00:14:19,930 DAVID MALAN: Yeah, the username has to be unique. 346 00:14:19,930 --> 00:14:23,330 Is this a good thing or a bad thing for a database, for a website with users? 347 00:14:23,330 --> 00:14:24,965 Should usernames be unique? 348 00:14:24,965 --> 00:14:25,880 Yeah, probably. 349 00:14:25,880 --> 00:14:27,800 If that's what the field we use to log in, 350 00:14:27,800 --> 00:14:31,867 you don't really want to people having the same feel or the same username. 351 00:14:31,867 --> 00:14:33,700 So, we can have the database enforce that so 352 00:14:33,700 --> 00:14:37,880 that now in my PHP code or any language, I don't have to, for instance, check 353 00:14:37,880 --> 00:14:41,490 necessarily does this username exist before I let someone register? 354 00:14:41,490 --> 00:14:46,690 The database won't let two people named David or Malans register in this case. 355 00:14:46,690 --> 00:14:50,030 >> And as an aside, even though this menu only lets you select one, 356 00:14:50,030 --> 00:14:54,550 a unique index is one that's indexed for super fast performance, 357 00:14:54,550 --> 00:14:56,100 but it also enforces uniqueness. 358 00:14:56,100 --> 00:14:58,850 And we'll come back to what the other two mean in just a moment. 359 00:14:58,850 --> 00:15:00,930 Meanwhile, if I go to my second row, which 360 00:15:00,930 --> 00:15:06,230 is the user's name, should I specify that the name should be unique? 361 00:15:06,230 --> 00:15:09,550 No, because you could certainly have-- there's no two David 362 00:15:09,550 --> 00:15:11,050 Malans in this room, most likely. 363 00:15:11,050 --> 00:15:14,290 But if we choose a different name, we might surely have collisions. 364 00:15:14,290 --> 00:15:16,130 >> Think back to hash tables and the like. 365 00:15:16,130 --> 00:15:18,604 So, we certainly don't want to make the name field unique. 366 00:15:18,604 --> 00:15:21,270 So, we're just going to leave that as dash, dash, dash, nothing. 367 00:15:21,270 --> 00:15:22,660 And I'm going to leave everything else alone. 368 00:15:22,660 --> 00:15:25,035 Indeed, most of these fields we won't have to care about. 369 00:15:25,035 --> 00:15:27,830 And when I'm ready to save this, if the internet cooperates, 370 00:15:27,830 --> 00:15:35,032 I click Save, and very, very, very slowly does the database get saved. 371 00:15:35,032 --> 00:15:37,240 And now I'm back to this interface, which admittedly, 372 00:15:37,240 --> 00:15:38,680 is overwhelming at first glance. 373 00:15:38,680 --> 00:15:42,450 But all I'm going to do is click on the word Users at top left. 374 00:15:42,450 --> 00:15:47,630 I'm going to go up here, click Users, and by default, it 375 00:15:47,630 --> 00:15:50,180 has executed some SQL, but more on that in a moment. 376 00:15:50,180 --> 00:15:52,654 Here's just a summary of what I did. 377 00:15:52,654 --> 00:15:55,320 And not to worry that you see mention of Latin and Swedish here. 378 00:15:55,320 --> 00:16:00,490 Those are just the default settings, because MySQL originally, 379 00:16:00,490 --> 00:16:04,000 or PHP MyAdmin, one of the two happened to be written by some Swedish people. 380 00:16:04,000 --> 00:16:06,100 But it's irrelevant in our case here. 381 00:16:06,100 --> 00:16:08,280 >> All right, so why is this all interesting? 382 00:16:08,280 --> 00:16:13,050 It turns out, I can insert data into a database by writing code. 383 00:16:13,050 --> 00:16:15,940 And I'm go ahead and in my file here, I'm 384 00:16:15,940 --> 00:16:19,000 going to go ahead and pretend like this is wired to that database, which 385 00:16:19,000 --> 00:16:23,040 it isn't at the moment, but it will be when we get to problem set seven. 386 00:16:23,040 --> 00:16:25,640 And I'm going to go ahead and execute a function called query, 387 00:16:25,640 --> 00:16:28,730 which we will give you in problem set seven's distribution code, that 388 00:16:28,730 --> 00:16:31,490 takes at least one argument, which is just a string. 389 00:16:31,490 --> 00:16:33,460 A string of SQL code. 390 00:16:33,460 --> 00:16:36,700 So, you're about to learn how to write Structured Query Language. 391 00:16:36,700 --> 00:16:41,270 >> If I want to insert a new row into my database because someone has submitted 392 00:16:41,270 --> 00:16:47,600 a form to my code, I would literally write INSERT INTO users the following 393 00:16:47,600 --> 00:16:52,800 fields: username, comma, name, the VALUES, 394 00:16:52,800 --> 00:16:57,480 and now I need to insert something like Malan, and quote, 395 00:16:57,480 --> 00:17:01,490 unquote 'David Malan.' And now even for those unfamiliar with SQL, 396 00:17:01,490 --> 00:17:07,830 why am I using single quotes inside of this green string? 397 00:17:07,830 --> 00:17:10,790 398 00:17:10,790 --> 00:17:13,040 What might be the reason here? 399 00:17:13,040 --> 00:17:14,609 >> Notice I'm co-mingling two languages. 400 00:17:14,609 --> 00:17:18,099 Query is a PHP function, but it takes an argument. 401 00:17:18,099 --> 00:17:21,740 And that argument has to itself be written in another language called 402 00:17:21,740 --> 00:17:23,500 SQL, Structured Query Language. 403 00:17:23,500 --> 00:17:27,940 So, everything that I have just highlighted here 404 00:17:27,940 --> 00:17:30,380 is this language called SQL. 405 00:17:30,380 --> 00:17:36,290 So, what's with the single quotes, just as a quick sanity check? 406 00:17:36,290 --> 00:17:37,324 Go ahead. 407 00:17:37,324 --> 00:17:37,990 They're strings. 408 00:17:37,990 --> 00:17:41,590 So, quote, unquote Malan and quote, unquote David Malan are strings. 409 00:17:41,590 --> 00:17:45,210 And just thinking intuitively now, knowing what you know about C and PHP, 410 00:17:45,210 --> 00:17:50,220 why did I not do this, which I usually used double quotes for strings? 411 00:17:50,220 --> 00:17:52,310 Why did I not want to do that? 412 00:17:52,310 --> 00:17:52,810 Yeah? 413 00:17:52,810 --> 00:17:53,685 >> AUDIENCE: [INAUDIBLE] 414 00:17:53,685 --> 00:17:56,695 415 00:17:56,695 --> 00:17:57,570 DAVID MALAN: Exactly. 416 00:17:57,570 --> 00:17:59,653 Because I'm already using double quotes on the way 417 00:17:59,653 --> 00:18:01,929 outside of the argument to the PHP function, 418 00:18:01,929 --> 00:18:03,470 I would just confuse the interpreter. 419 00:18:03,470 --> 00:18:04,860 It won't know, do these go together? 420 00:18:04,860 --> 00:18:05,735 Do these go together? 421 00:18:05,735 --> 00:18:06,810 Do these go together? 422 00:18:06,810 --> 00:18:08,070 So, I alternate instead. 423 00:18:08,070 --> 00:18:11,784 >> Or I could do something like this, backslash quote or backslash quote. 424 00:18:11,784 --> 00:18:14,200 Frankly, that just starts to get very unreadable and ugly. 425 00:18:14,200 --> 00:18:16,790 But that would achieve the same result as well. 426 00:18:16,790 --> 00:18:19,760 >> So, if I were to execute this query now, let's see what happens. 427 00:18:19,760 --> 00:18:22,740 I'm going to go ahead now and rather than execute the PHP code, which 428 00:18:22,740 --> 00:18:24,610 is where you'll play in problem set seven, 429 00:18:24,610 --> 00:18:27,200 I'm going to instead go to PHP MyAdmin. 430 00:18:27,200 --> 00:18:29,770 And I'm manually going to go to the SQL tab, 431 00:18:29,770 --> 00:18:31,580 and let me zoom in on the interface. 432 00:18:31,580 --> 00:18:34,007 And I'm going to paste in the thing I just typed. 433 00:18:34,007 --> 00:18:36,090 And the color coding has changed a little bit now, 434 00:18:36,090 --> 00:18:38,750 just because the program formats things a little differently. 435 00:18:38,750 --> 00:18:41,960 But notice that all I've done is I've said, insert into Users. 436 00:18:41,960 --> 00:18:45,790 I've specified, then, in a comma separated parenthesized list the two 437 00:18:45,790 --> 00:18:48,850 fields that I want to insert, and then I've literally said values 438 00:18:48,850 --> 00:18:51,510 followed by another paren, and then the two values 439 00:18:51,510 --> 00:18:53,520 I want to plug-in, and now for good measure, 440 00:18:53,520 --> 00:18:55,010 I'll put a semicolon at the end. 441 00:18:55,010 --> 00:18:56,570 So, this is not C. This is not PHP. 442 00:18:56,570 --> 00:18:59,970 This is now SQL, and I'm pasting it into this web based interface that's 443 00:18:59,970 --> 00:19:02,710 just going to let me, as soon as I click Go, 444 00:19:02,710 --> 00:19:08,060 execute this query on the database running inside of CS50 IDE. 445 00:19:08,060 --> 00:19:09,470 >> So this is good. 446 00:19:09,470 --> 00:19:12,520 Notice that said one row inserted, went super fast, 447 00:19:12,520 --> 00:19:15,190 0.0054 seconds to insert that data. 448 00:19:15,190 --> 00:19:16,610 So, that sounds pretty healthy. 449 00:19:16,610 --> 00:19:19,350 It reformatted my query for me here just to see it 450 00:19:19,350 --> 00:19:21,730 in sort of color coded version. 451 00:19:21,730 --> 00:19:24,540 But now if I click Browse, notice that, even 452 00:19:24,540 --> 00:19:29,070 though there's a lot of clutter on the screen, my table now has two rows. 453 00:19:29,070 --> 00:19:30,700 >> So, let me go ahead and do another. 454 00:19:30,700 --> 00:19:33,760 Instead of this, let me go to the SQL tab again. 455 00:19:33,760 --> 00:19:40,723 And this time I'll insert something like Rob and his name will be Rob Bowden. 456 00:19:40,723 --> 00:19:42,330 Bowden. 457 00:19:42,330 --> 00:19:44,040 Let's click Save. 458 00:19:44,040 --> 00:19:46,140 Oops, rather Go. 459 00:19:46,140 --> 00:19:48,890 >> Click Browse again, and now notice I have two rows. 460 00:19:48,890 --> 00:19:52,390 So, this is just a way more complex way of opening up Google Spreadsheets 461 00:19:52,390 --> 00:19:54,010 and just typing a row into a column. 462 00:19:54,010 --> 00:19:57,070 But what's key is that we now have the syntax 463 00:19:57,070 --> 00:20:00,220 with which to write code so that ultimately, we could actually 464 00:20:00,220 --> 00:20:01,790 do some and this. 465 00:20:01,790 --> 00:20:05,380 Recall that PHP supports super global variables. 466 00:20:05,380 --> 00:20:08,415 >> What is inside of dollar sign underscore GET in PHP? 467 00:20:08,415 --> 00:20:10,290 We took a look at one or two simple examples. 468 00:20:10,290 --> 00:20:15,640 And in PSet6, recall you have hello dot PHP which uses this variable. 469 00:20:15,640 --> 00:20:17,870 What goes in there? 470 00:20:17,870 --> 00:20:21,015 Or what is it? 471 00:20:21,015 --> 00:20:22,522 A little louder. 472 00:20:22,522 --> 00:20:23,790 >> AUDIENCE: [INAUDIBLE] 473 00:20:23,790 --> 00:20:25,030 >> DAVID MALAN: It's a snow seed of array, which 474 00:20:25,030 --> 00:20:27,714 is just a fancy way of saying an array that has key value pairs. 475 00:20:27,714 --> 00:20:28,880 And the keys aren't numeric. 476 00:20:28,880 --> 00:20:30,420 They're words or strings. 477 00:20:30,420 --> 00:20:32,750 And specifically, what are those key value pairs? 478 00:20:32,750 --> 00:20:35,110 Where do they come from? 479 00:20:35,110 --> 00:20:35,620 Sorry? 480 00:20:35,620 --> 00:20:36,994 >> AUDIENCE: [INAUDIBLE] 481 00:20:36,994 --> 00:20:37,660 DAVID MALAN: No? 482 00:20:37,660 --> 00:20:40,700 Where do those key value pairs come from? 483 00:20:40,700 --> 00:20:42,490 Say again? 484 00:20:42,490 --> 00:20:44,610 Again? 485 00:20:44,610 --> 00:20:46,472 Am I the only one hearing something? 486 00:20:46,472 --> 00:20:47,810 >> [LAUGHTER] 487 00:20:47,810 --> 00:20:49,042 >> That's right, yes? 488 00:20:49,042 --> 00:20:50,435 >> AUDIENCE: [INAUDIBLE] 489 00:20:50,435 --> 00:20:52,560 DAVID MALAN: Yeah, they come from the query string. 490 00:20:52,560 --> 00:20:55,380 So, if you rewind in time to when we've played with Google 491 00:20:55,380 --> 00:20:59,600 and we've gone to Google.com slash search question mark q equals cats, 492 00:20:59,600 --> 00:21:03,550 if I were to hit Enter and if Google were implemented in PHP, 493 00:21:03,550 --> 00:21:07,017 PHP code that Google wrote would have access to dollar sign 494 00:21:07,017 --> 00:21:11,600 underscore GET inside of which is a key called Q and a value 495 00:21:11,600 --> 00:21:17,680 called cats that it can then use used to do an actual search with. 496 00:21:17,680 --> 00:21:20,860 >> So, in fact, what I'm going to do now is go back to my PHP code 497 00:21:20,860 --> 00:21:23,140 that you'll again see more of in PSet7. 498 00:21:23,140 --> 00:21:25,440 And instead of plugging in hard coded values which 499 00:21:25,440 --> 00:21:27,630 doesn't seem like a very dynamic website, 500 00:21:27,630 --> 00:21:30,680 I'm going to give you a teaser of what your actual code would do. 501 00:21:30,680 --> 00:21:32,854 You would put in two question marks like this. 502 00:21:32,854 --> 00:21:34,270 I don't know what the username is. 503 00:21:34,270 --> 00:21:37,390 I don't know what the name is going to be, 504 00:21:37,390 --> 00:21:39,470 but I do know I can get them dynamically. 505 00:21:39,470 --> 00:21:43,420 >> So, if the code we're writing now is the code running on Google's servers, 506 00:21:43,420 --> 00:21:46,940 or if this is hello dot PHP, which comes with PSet6, 507 00:21:46,940 --> 00:21:48,650 I'm going to pass into the query function 508 00:21:48,650 --> 00:21:51,450 just like printf, two other arguments. 509 00:21:51,450 --> 00:21:57,120 GET, quote, unquote username, and GET, quote, unquote name. 510 00:21:57,120 --> 00:22:00,720 And now, notice what the general structure is here. 511 00:22:00,720 --> 00:22:03,320 I've got on the left hand side of the call, 512 00:22:03,320 --> 00:22:05,480 this function called query in PHP. 513 00:22:05,480 --> 00:22:08,160 I still have as a first argument, just a string of text. 514 00:22:08,160 --> 00:22:11,000 >> But that string of text is written in a language called SQL. 515 00:22:11,000 --> 00:22:12,616 And frankly, it's not a big language. 516 00:22:12,616 --> 00:22:14,990 We're only going to talk about it formally today, really. 517 00:22:14,990 --> 00:22:17,031 And then in problem set seven, there's relatively 518 00:22:17,031 --> 00:22:18,800 few features that we're going to leverage. 519 00:22:18,800 --> 00:22:22,530 The question marks, though, mean plug in a value here and plug in another value 520 00:22:22,530 --> 00:22:23,130 here. 521 00:22:23,130 --> 00:22:26,010 And notice, I've omitted what from around the quote-- damn 522 00:22:26,010 --> 00:22:30,470 it-- around the quotation marks this time. 523 00:22:30,470 --> 00:22:34,930 I've omitted the quotation marks around the question mark, 524 00:22:34,930 --> 00:22:36,410 sorry, this time around. 525 00:22:36,410 --> 00:22:38,870 >> So, what's nice about this question mark feature which 526 00:22:38,870 --> 00:22:42,830 PHP tends to support, Ruby and Python and other languages, 527 00:22:42,830 --> 00:22:45,730 this just means plug in some value here and you know what? 528 00:22:45,730 --> 00:22:48,300 You figure out whether to use single quotes or double quotes. 529 00:22:48,300 --> 00:22:50,966 Don't bother me with those intellectually uninteresting details. 530 00:22:50,966 --> 00:22:53,780 But, make sure it's correct so that my code is ultimately 531 00:22:53,780 --> 00:22:57,010 operational and safe, which will have a meaning before long. 532 00:22:57,010 --> 00:23:00,460 >> Now, how many arguments total, just to be clear, is the query function taking? 533 00:23:00,460 --> 00:23:05,240 534 00:23:05,240 --> 00:23:07,581 Anyone want to vote for more than two? 535 00:23:07,581 --> 00:23:08,080 Three? 536 00:23:08,080 --> 00:23:10,001 Sure, why? 537 00:23:10,001 --> 00:23:10,920 Why three? 538 00:23:10,920 --> 00:23:12,305 >> AUDIENCE: [INAUDIBLE] 539 00:23:12,305 --> 00:23:13,180 DAVID MALAN: Exactly. 540 00:23:13,180 --> 00:23:14,610 The first part is the string. 541 00:23:14,610 --> 00:23:18,640 The second argument is dollar sign underscore GET bracket username. 542 00:23:18,640 --> 00:23:21,950 And the third argument is the same thing, but just the name. 543 00:23:21,950 --> 00:23:24,590 So in other words, now if I had a web form 544 00:23:24,590 --> 00:23:27,149 that had to text fields, one for the user's username, 545 00:23:27,149 --> 00:23:29,690 one for his or her name, just like you would see in a website 546 00:23:29,690 --> 00:23:32,120 when you register for some website, this might 547 00:23:32,120 --> 00:23:35,450 be the code on the back end that actually does the insertion now 548 00:23:35,450 --> 00:23:37,220 into the database. 549 00:23:37,220 --> 00:23:40,870 >> Now by contrast, let's fast forward. 550 00:23:40,870 --> 00:23:43,840 Suppose a user is now logging in and you want 551 00:23:43,840 --> 00:23:48,860 to write PHP code that checks whether the person who's just logged in 552 00:23:48,860 --> 00:23:52,250 is actually a user, you can use pretty simple syntax. 553 00:23:52,250 --> 00:23:55,832 You can say SELECT, let's say star, where star means everything. 554 00:23:55,832 --> 00:23:57,540 I don't know what I want, so just give me 555 00:23:57,540 --> 00:24:01,585 all the columns from the table called users where, and this is nice. 556 00:24:01,585 --> 00:24:03,710 Select supports what's called a predicate, which is 557 00:24:03,710 --> 00:24:06,630 like a way of qualifying what you want. 558 00:24:06,630 --> 00:24:10,590 Where username equals quote, unquote Malan. 559 00:24:10,590 --> 00:24:13,680 So here too, I've embedded inside the argument 560 00:24:13,680 --> 00:24:16,110 to a PHP function, a line of SQL code. 561 00:24:16,110 --> 00:24:18,680 And that SQL code this time is literally going 562 00:24:18,680 --> 00:24:21,790 to search for quote, unquote Malan. 563 00:24:21,790 --> 00:24:24,420 >> Now that's not all that useful, so I'm going to skip that 564 00:24:24,420 --> 00:24:28,650 and I'm going to put away this tip from Brady, and go 565 00:24:28,650 --> 00:24:30,990 and plug-in instead a question mark here. 566 00:24:30,990 --> 00:24:33,290 So, just to be clear, what should my second argument 567 00:24:33,290 --> 00:24:37,480 be if someone has just logged in and I want to check if he or she is actually 568 00:24:37,480 --> 00:24:39,265 a user? 569 00:24:39,265 --> 00:24:40,140 AUDIENCE: [INAUDIBLE] 570 00:24:40,140 --> 00:24:40,890 DAVID MALAN: Yeah. 571 00:24:40,890 --> 00:24:44,120 I hear dollar sign underscore GET quote, unquote username. 572 00:24:44,120 --> 00:24:50,040 And that should return to me any of the rows in my database 573 00:24:50,040 --> 00:24:51,986 that have a username of Malan. 574 00:24:51,986 --> 00:24:54,860 Now hopefully, I'm going to get back zero if Malan's never been here, 575 00:24:54,860 --> 00:24:56,290 or one if he has. 576 00:24:56,290 --> 00:24:59,026 I shouldn't get back two or three or four. 577 00:24:59,026 --> 00:24:59,526 Why? 578 00:24:59,526 --> 00:25:00,220 >> AUDIENCE: [INAUDIBLE] 579 00:25:00,220 --> 00:25:01,120 >> DAVID MALAN: I said unique, right? 580 00:25:01,120 --> 00:25:01,750 Simple reason. 581 00:25:01,750 --> 00:25:04,030 Because I said it's got to be unique, just logically, 582 00:25:04,030 --> 00:25:07,940 you can only have zero or one Malans in this particular database table. 583 00:25:07,940 --> 00:25:10,965 Now as an aside, just so you've seen it, even though I keep using GET 584 00:25:10,965 --> 00:25:14,350 and even though PSet6 only used GET, you can certainly have POST. 585 00:25:14,350 --> 00:25:17,212 And recall that Post is another technique for submitting information 586 00:25:17,212 --> 00:25:19,170 from a form, but it doesn't show up in the URL. 587 00:25:19,170 --> 00:25:22,690 It's a little more secure certainly for things like usernames and passwords, 588 00:25:22,690 --> 00:25:25,210 which PSet7 will, in fact, involve. 589 00:25:25,210 --> 00:25:28,130 >> So, let's do this in PHP MyAdmin and see what happens. 590 00:25:28,130 --> 00:25:30,020 I'm going to go to MySQL tab. 591 00:25:30,020 --> 00:25:34,450 And notice that the default value for PHP MyAdmin, just to try to be helpful, 592 00:25:34,450 --> 00:25:37,050 is to select star from users where one. 593 00:25:37,050 --> 00:25:39,430 Well, one is always true, so this has the silly effective 594 00:25:39,430 --> 00:25:40,400 of just select everything. 595 00:25:40,400 --> 00:25:42,691 But I'm going to be a little more pedantic and manually 596 00:25:42,691 --> 00:25:45,920 type out SELECT star FROM users. 597 00:25:45,920 --> 00:25:48,294 >> Now technically, you can quote the name of the tables. 598 00:25:48,294 --> 00:25:50,460 It's rare that you have to, but notice these are not 599 00:25:50,460 --> 00:25:52,240 your normal quotes on the US keyboard. 600 00:25:52,240 --> 00:25:54,760 This is the so-called backtick, which is generally on the top left hand 601 00:25:54,760 --> 00:25:56,000 corner of your keyboard. 602 00:25:56,000 --> 00:25:58,500 But it's rare that you'll actually need to bother with that, 603 00:25:58,500 --> 00:25:59,950 so I'll just omit them anyway. 604 00:25:59,950 --> 00:26:02,280 So now, let me go ahead and hit go. 605 00:26:02,280 --> 00:26:06,616 And how many rows should I get back when I select star from users? 606 00:26:06,616 --> 00:26:08,407 >> AUDIENCE: [INAUDIBLE] 607 00:26:08,407 --> 00:26:09,990 DAVID MALAN: The number of rows, sure. 608 00:26:09,990 --> 00:26:12,390 But how many in this concrete story right now? 609 00:26:12,390 --> 00:26:14,640 Two, because there was me and there was Rob. 610 00:26:14,640 --> 00:26:19,370 So, if I click Go, I see visually that I've gotten back, indeed, two rows. 611 00:26:19,370 --> 00:26:22,060 There's a lot of clutter on the screen, but I only see two rows. 612 00:26:22,060 --> 00:26:28,580 By contrast, if I do this again and do SELECT star FROM users, where username 613 00:26:28,580 --> 00:26:31,840 equals quote, unquote Malan, now if I click Go, 614 00:26:31,840 --> 00:26:33,970 I'm only going to get back one row. 615 00:26:33,970 --> 00:26:36,499 And lastly, if I do something like this, suppose 616 00:26:36,499 --> 00:26:38,290 that I don't care about getting everything, 617 00:26:38,290 --> 00:26:41,020 which is kind of meaningless now, because there's only two columns. 618 00:26:41,020 --> 00:26:43,103 It's not like I'm selecting a huge amount of data. 619 00:26:43,103 --> 00:26:46,720 Suppose I go ahead and do SELECT name FROM 620 00:26:46,720 --> 00:26:51,990 users, where username equals Malan, what's nice about SQL honestly, 621 00:26:51,990 --> 00:26:54,290 is that it really just does what you tell it to do. 622 00:26:54,290 --> 00:26:57,550 It's pretty succinct, but you literally just tell it what you want to do. 623 00:26:57,550 --> 00:27:01,130 Select name from users where the username equals Malan. 624 00:27:01,130 --> 00:27:03,440 And it really is that explicit. 625 00:27:03,440 --> 00:27:08,410 So, now if I hit Go, how many rows am I going to get back? 626 00:27:08,410 --> 00:27:10,770 One, because it's just Malan, hopefully. 627 00:27:10,770 --> 00:27:13,100 Or zero if he's not there, but one maximally. 628 00:27:13,100 --> 00:27:17,610 >> And how many columns will I get back? 629 00:27:17,610 --> 00:27:18,450 How many columns? 630 00:27:18,450 --> 00:27:20,658 This time, I'm just going to get one because I didn't 631 00:27:20,658 --> 00:27:22,380 select star, which is everything. 632 00:27:22,380 --> 00:27:27,900 Now I'm selecting just name, so I just get back one column and one row. 633 00:27:27,900 --> 00:27:31,730 And it looks sort of appropriately ridiculous, just looking super 634 00:27:31,730 --> 00:27:33,060 small like this. 635 00:27:33,060 --> 00:27:34,290 So, what's really happening? 636 00:27:34,290 --> 00:27:36,890 When you execute a SQL query using select, 637 00:27:36,890 --> 00:27:38,700 what you're getting back from the database 638 00:27:38,700 --> 00:27:42,970 is like a temporary table with rows and columns, maybe, 639 00:27:42,970 --> 00:27:46,260 but that omit anything that wasn't actually selected by you. 640 00:27:46,260 --> 00:27:49,010 So, it's like if someone had a big spreadsheet of all the students 641 00:27:49,010 --> 00:27:51,610 registered for some student group, and you say, 642 00:27:51,610 --> 00:27:55,097 give me all of the freshman who've registered for our student group, what 643 00:27:55,097 --> 00:27:56,930 your colleague in the student group might do 644 00:27:56,930 --> 00:27:58,430 is they could just hand you the whole spreadsheet. 645 00:27:58,430 --> 00:27:59,742 That's like saying select star. 646 00:27:59,742 --> 00:28:02,200 And it's a little annoying if you only wanted the freshman. 647 00:28:02,200 --> 00:28:05,640 And so, if you instead said, select star from database table 648 00:28:05,640 --> 00:28:08,470 where year equals quote, unquote freshman, 649 00:28:08,470 --> 00:28:10,810 it's as though your friend in the student group 650 00:28:10,810 --> 00:28:13,770 literally highlighted and copied only the freshman rows, 651 00:28:13,770 --> 00:28:16,780 pasted them into a new Google Spreadsheet or an Excel file, 652 00:28:16,780 --> 00:28:18,860 and handed you back the resulting file only. 653 00:28:18,860 --> 00:28:21,710 That's all that's going on conceptually here. 654 00:28:21,710 --> 00:28:23,920 >> So in the end, we can do some pretty fancy things 655 00:28:23,920 --> 00:28:26,560 by storing things like usernames and passwords and the like. 656 00:28:26,560 --> 00:28:30,310 But, it turns out, we should do a little differently than this. 657 00:28:30,310 --> 00:28:34,750 It's not that smart to only store a username and a password. 658 00:28:34,750 --> 00:28:37,790 Someone earlier, I think down here, suggested an ID. 659 00:28:37,790 --> 00:28:40,787 Now an ID could be like a Harvard ID or Yale's Net ID, 660 00:28:40,787 --> 00:28:42,870 but it could be even simpler in our database case. 661 00:28:42,870 --> 00:28:45,120 And indeed, the common case is to have another column. 662 00:28:45,120 --> 00:28:46,953 And I'm going to go ahead and edit my table. 663 00:28:46,953 --> 00:28:49,521 And if you play around with this interface for PSet7, 664 00:28:49,521 --> 00:28:51,770 you'll see that you can check this button here and add 665 00:28:51,770 --> 00:28:53,750 a field at the beginning of the table. 666 00:28:53,750 --> 00:28:56,720 And now if I click Go, it's going to give me one of those forms 667 00:28:56,720 --> 00:28:57,600 from earlier. 668 00:28:57,600 --> 00:29:00,170 I'm going to add a field called ID. 669 00:29:00,170 --> 00:29:03,070 And I'm going to make it a numeric type. 670 00:29:03,070 --> 00:29:05,362 >> I have a whole bunch of values for numerics. 671 00:29:05,362 --> 00:29:08,677 I'm just going to choose an INT and not worry about the disparate sizes. 672 00:29:08,677 --> 00:29:10,510 I don't have to specify a length or a value, 673 00:29:10,510 --> 00:29:13,710 because it's going to be 32 bits no matter what. 674 00:29:13,710 --> 00:29:16,070 Attributes, we didn't see before. 675 00:29:16,070 --> 00:29:18,410 Any interest in any of these menu options this time? 676 00:29:18,410 --> 00:29:21,890 677 00:29:21,890 --> 00:29:23,745 For an INT? 678 00:29:23,745 --> 00:29:24,620 What did you propose? 679 00:29:24,620 --> 00:29:27,350 680 00:29:27,350 --> 00:29:28,445 No? 681 00:29:28,445 --> 00:29:29,570 Do any of these make sense? 682 00:29:29,570 --> 00:29:30,536 Yeah. 683 00:29:30,536 --> 00:29:31,900 Yeah, unsigned, right? 684 00:29:31,900 --> 00:29:35,930 >> Generally, if we're going to give everyone a unique number, which 685 00:29:35,930 --> 00:29:38,200 is where this story is going, I really only want 686 00:29:38,200 --> 00:29:41,919 a person to have the number like zero and one and two and three and four. 687 00:29:41,919 --> 00:29:43,710 I don't need to deal with negative numbers. 688 00:29:43,710 --> 00:29:45,210 It just seems like undue complexity. 689 00:29:45,210 --> 00:29:48,470 I want four billion possible values, not four billion possible values, 690 00:29:48,470 --> 00:29:50,699 so I just doubled the capacity of my INT. 691 00:29:50,699 --> 00:29:53,490 As an aside, if you want to relate this to something like Facebook, 692 00:29:53,490 --> 00:29:56,190 back in sort of my day when Facebook first came out, 693 00:29:56,190 --> 00:29:59,510 I believe what they were using in their MySQL database 694 00:29:59,510 --> 00:30:02,856 to store a user's identifier, was just an INT. 695 00:30:02,856 --> 00:30:05,230 But of course, there's a lot of real people in the world. 696 00:30:05,230 --> 00:30:07,438 There's a lot of fake Facebook accounts in the world. 697 00:30:07,438 --> 00:30:11,701 And so eventually, Facebook overflowed the size of an INT, a four billion 698 00:30:11,701 --> 00:30:12,200 value. 699 00:30:12,200 --> 00:30:15,032 Which is why, if you look around and there's websites 700 00:30:15,032 --> 00:30:16,740 that can tell you what your unique ID is. 701 00:30:16,740 --> 00:30:19,781 And if you never chose a username in Facebook, you'll see your unique ID. 702 00:30:19,781 --> 00:30:23,080 I think it's profile dot PHP question mark ID equals something. 703 00:30:23,080 --> 00:30:27,210 That is now something like a big INT, or a long long if you will, 704 00:30:27,210 --> 00:30:29,700 which is a 64-bit value or something comparable. 705 00:30:29,700 --> 00:30:33,620 >> So, even in the real world do these issues ultimately sometimes matter. 706 00:30:33,620 --> 00:30:37,600 And it turns out here, if I'm giving all of my users a unique ID, 707 00:30:37,600 --> 00:30:41,750 I want to be super explicit and minimally make this field unique. 708 00:30:41,750 --> 00:30:44,750 But it turns out there's one piece of nomenclature today too 709 00:30:44,750 --> 00:30:46,470 that's a primary key. 710 00:30:46,470 --> 00:30:49,800 If you're designing a database table and you know in advance 711 00:30:49,800 --> 00:30:55,580 that one of the columns in that table should and will uniquely identify rows 712 00:30:55,580 --> 00:30:58,500 in the table, you want to specify it and tell the database, 713 00:30:58,500 --> 00:31:00,250 this is my primary key. 714 00:31:00,250 --> 00:31:02,110 There might be duplicates in other fields, 715 00:31:02,110 --> 00:31:06,330 but I'm telling the database that this is my primary, my most important field, 716 00:31:06,330 --> 00:31:08,420 that's guaranteed to be unique. 717 00:31:08,420 --> 00:31:09,660 >> Now, this seems redundant. 718 00:31:09,660 --> 00:31:13,830 I am now proposing that we add, by clicking Save here, 719 00:31:13,830 --> 00:31:17,210 a field called-- and I'm going to go ahead and click AI, 720 00:31:17,210 --> 00:31:19,720 we'll come back to that in a moment, Save. 721 00:31:19,720 --> 00:31:22,540 I am proposing now that my table look like this. 722 00:31:22,540 --> 00:31:26,305 I have an INT field called ID, a CHAR field called Username, 723 00:31:26,305 --> 00:31:31,100 a VARCHAR field called Name, but ID, if it's primary and therefore unique, 724 00:31:31,100 --> 00:31:33,760 why did I just waste time introducing what 725 00:31:33,760 --> 00:31:39,140 effectively is a second unique field called ID that's an INT? 726 00:31:39,140 --> 00:31:41,980 >> Username, recall, was already unique, we said. 727 00:31:41,980 --> 00:31:45,350 So just logically, you don't need any database experience to reason 728 00:31:45,350 --> 00:31:47,570 through this, why might I have introduced 729 00:31:47,570 --> 00:31:50,065 an INT as my unique identifier as well? 730 00:31:50,065 --> 00:31:52,740 731 00:31:52,740 --> 00:31:53,930 What's this-- say again? 732 00:31:53,930 --> 00:31:55,580 >> AUDIENCE: [INAUDIBLE] 733 00:31:55,580 --> 00:31:59,534 >> DAVID MALAN: Random access is easier, why? 734 00:31:59,534 --> 00:32:00,410 >> AUDIENCE: [INAUDIBLE] 735 00:32:00,410 --> 00:32:02,367 DAVID MALAN: Yeah, it's just accessing numbers. 736 00:32:02,367 --> 00:32:04,750 So, if you think of this truly is a table, like an array, 737 00:32:04,750 --> 00:32:07,690 now I have unique identifiers that I can jump around. 738 00:32:07,690 --> 00:32:11,520 And better than that still is that how big is an INT going to be again? 739 00:32:11,520 --> 00:32:14,450 740 00:32:14,450 --> 00:32:15,800 32 bits or four bytes. 741 00:32:15,800 --> 00:32:17,750 >> How big is my username going to be? 742 00:32:17,750 --> 00:32:20,310 743 00:32:20,310 --> 00:32:21,990 Maximally? 744 00:32:21,990 --> 00:32:22,880 16 bytes. 745 00:32:22,880 --> 00:32:26,080 >> So, if you are really caring about the performance of your code, 746 00:32:26,080 --> 00:32:31,390 think back to PSet5, would you prefer to search for a four byte value or a 16 747 00:32:31,390 --> 00:32:32,240 byte value, right? 748 00:32:32,240 --> 00:32:33,810 It really is as simple as that. 749 00:32:33,810 --> 00:32:38,060 You have to do four times as much work to search for usernames because those 750 00:32:38,060 --> 00:32:38,830 are 16 bytes. 751 00:32:38,830 --> 00:32:41,320 So, you have to literally compare all 16 bytes to be 752 00:32:41,320 --> 00:32:43,140 sure yes, this is username I want. 753 00:32:43,140 --> 00:32:46,610 Whereas for an INT, you can do it with just four bytes. 754 00:32:46,610 --> 00:32:49,212 >> And as an aside for those interested in computer hardware, 755 00:32:49,212 --> 00:32:52,420 it turns out you can fit something like an INT or a 32-bit value in something 756 00:32:52,420 --> 00:32:55,330 called a register in a computer CPU, which means it's super, 757 00:32:55,330 --> 00:32:58,400 super fast, even at the lowest level of the computer's hardware. 758 00:32:58,400 --> 00:33:00,530 So, there's just advantages all around. 759 00:33:00,530 --> 00:33:01,530 So, what does this mean? 760 00:33:01,530 --> 00:33:04,850 In fact, when you're designing a database table, almost all of the time 761 00:33:04,850 --> 00:33:07,170 are you going to have not only the data you care about, 762 00:33:07,170 --> 00:33:09,280 but also something like a unique identifier 763 00:33:09,280 --> 00:33:11,280 because this is going to let us do other things. 764 00:33:11,280 --> 00:33:13,160 And let's trip over one problem here. 765 00:33:13,160 --> 00:33:15,990 >> Suppose that users have not just usernames and names, 766 00:33:15,990 --> 00:33:19,540 but they also have things like cities and states and zip codes, at least 767 00:33:19,540 --> 00:33:20,432 here in the US. 768 00:33:20,432 --> 00:33:22,390 So, I'm going to go ahead and just quickly say, 769 00:33:22,390 --> 00:33:26,180 give me three more columns at the end of the table. 770 00:33:26,180 --> 00:33:28,900 And this is going to be City, this is going to be State, 771 00:33:28,900 --> 00:33:30,400 and this is going to be Zip. 772 00:33:30,400 --> 00:33:32,710 >> Now City, what data types should this be, perhaps? 773 00:33:32,710 --> 00:33:35,460 774 00:33:35,460 --> 00:33:35,989 VARCHAR? 775 00:33:35,989 --> 00:33:37,780 I don't know what the longest name city is. 776 00:33:37,780 --> 00:33:40,571 Somewhere in America, there's probably some ridiculously long word, 777 00:33:40,571 --> 00:33:43,605 so let's just go with 255, somewhat historically or arbitrarily. 778 00:33:43,605 --> 00:33:44,730 State, what you want to do? 779 00:33:44,730 --> 00:33:48,380 780 00:33:48,380 --> 00:33:50,367 Judgment call, right? 781 00:33:50,367 --> 00:33:51,700 What's maybe the most efficient? 782 00:33:51,700 --> 00:33:53,500 How many characters? 783 00:33:53,500 --> 00:33:55,950 Maybe just two, if we can get away with doing just, 784 00:33:55,950 --> 00:33:58,250 like, MA for Massachusetts and so forth. 785 00:33:58,250 --> 00:34:00,520 So, I'm going to go a CHAR value of two. 786 00:34:00,520 --> 00:34:03,080 >> Zip code's an interesting one. 787 00:34:03,080 --> 00:34:06,679 We're here in 02138, so that suggests we should use what? 788 00:34:06,679 --> 00:34:07,470 It's an INT, right? 789 00:34:07,470 --> 00:34:10,030 790 00:34:10,030 --> 00:34:12,800 INT, INT, short? 791 00:34:12,800 --> 00:34:14,521 Short would work. 792 00:34:14,521 --> 00:34:15,020 No? 793 00:34:15,020 --> 00:34:18,920 794 00:34:18,920 --> 00:34:20,870 CHAR or five, but I want an INT. 795 00:34:20,870 --> 00:34:23,710 Why push back on an INT? 796 00:34:23,710 --> 00:34:26,820 Persuade me from this. 797 00:34:26,820 --> 00:34:29,210 What's stupid about an INT, my idea? 798 00:34:29,210 --> 00:34:29,871 Yeah. 799 00:34:29,871 --> 00:34:31,199 >> AUDIENCE: Take up more memory. 800 00:34:31,199 --> 00:34:32,909 >> DAVID MALAN: Take up more memory. 801 00:34:32,909 --> 00:34:35,310 Four bytes, but you're proposing a zip code 802 00:34:35,310 --> 00:34:39,000 as five bytes or someone was as a CHAR, which feels like eh, that's not really 803 00:34:39,000 --> 00:34:39,620 the case. 804 00:34:39,620 --> 00:34:40,489 >> Well, fun story. 805 00:34:40,489 --> 00:34:43,179 Years ago, when I used to use Microsoft Outlook for my email, 806 00:34:43,179 --> 00:34:45,150 I eventually wanted to switch to Gmail. 807 00:34:45,150 --> 00:34:48,949 And so, I exported all of my contacts from Outlook as a CSV file. 808 00:34:48,949 --> 00:34:50,699 Comma separated values, which just meant I 809 00:34:50,699 --> 00:34:54,060 had all my friends names and last names and phone numbers and zip codes 810 00:34:54,060 --> 00:34:54,747 and all of that. 811 00:34:54,747 --> 00:34:56,580 And then I made the mistake of opening it up 812 00:34:56,580 --> 00:34:58,640 in Excel, which is a spreadsheet program that 813 00:34:58,640 --> 00:35:00,289 understands CSV files as we've seen. 814 00:35:00,289 --> 00:35:03,080 But then, I must have hit, like, Command or Control S at one point. 815 00:35:03,080 --> 00:35:06,250 And Excel apparently at the time had a feature whereby any time it 816 00:35:06,250 --> 00:35:08,100 saw a number, it tried to be helpful. 817 00:35:08,100 --> 00:35:11,610 And if that number started with zeros, it would just get rid of them. 818 00:35:11,610 --> 00:35:13,420 Why do you need leading zeros on integers? 819 00:35:13,420 --> 00:35:15,140 They're meaningless, mathematically. 820 00:35:15,140 --> 00:35:17,530 They're not meaningless in the US Postal system. 821 00:35:17,530 --> 00:35:19,954 So, I've had for years, to this day, I still 822 00:35:19,954 --> 00:35:22,370 have friends that when the rare case that I need someone's 823 00:35:22,370 --> 00:35:24,078 address these days, I'll still see that I 824 00:35:24,078 --> 00:35:26,767 have a friend in Cambridge, Massachusetts, 2138. 825 00:35:26,767 --> 00:35:29,350 And it's annoying if you're trying to sort of programmatically 826 00:35:29,350 --> 00:35:30,975 generate envelopes or just jot it down. 827 00:35:30,975 --> 00:35:33,599 And that's because of this reason, I chose the wrong data type. 828 00:35:33,599 --> 00:35:34,490 So, I love your idea. 829 00:35:34,490 --> 00:35:35,650 Let's use a CHAR field. 830 00:35:35,650 --> 00:35:38,340 Five characters, except there is a corner case. 831 00:35:38,340 --> 00:35:42,220 If you still send mail, sometimes zip codes these days, 832 00:35:42,220 --> 00:35:45,360 they're, like, plus four. 833 00:35:45,360 --> 00:35:48,200 So, we need a hyphen and then we need four more numbers. 834 00:35:48,200 --> 00:35:50,330 So to be honest, it could go many different ways. 835 00:35:50,330 --> 00:35:52,371 >> For now, I'm going to keep it simple and I'm just 836 00:35:52,371 --> 00:35:54,780 going to say that it's a five CHAR value and we're 837 00:35:54,780 --> 00:35:56,739 going to skip the whole dash plus four. 838 00:35:56,739 --> 00:35:58,280 But these are the kinds of tradeoffs. 839 00:35:58,280 --> 00:36:00,196 And you can think of the same problems arising 840 00:36:00,196 --> 00:36:01,860 with phone numbers or other fields. 841 00:36:01,860 --> 00:36:04,350 >> And now, this is actually a foolish road to go down. 842 00:36:04,350 --> 00:36:08,000 Suppose both Rob and I and Hannah and Maria and [? Davon ?] and Andy 843 00:36:08,000 --> 00:36:12,820 and others on the staff all live in Cambridge, Massachusetts, 02138. 844 00:36:12,820 --> 00:36:17,970 This actually feel stupid that I'm adding to my users table, city, state, 845 00:36:17,970 --> 00:36:18,630 and zip. 846 00:36:18,630 --> 00:36:20,980 Why? 847 00:36:20,980 --> 00:36:21,960 >> AUDIENCE: [INAUDIBLE] 848 00:36:21,960 --> 00:36:22,918 >> DAVID MALAN: Say again? 849 00:36:22,918 --> 00:36:24,310 AUDIENCE: [INAUDIBLE] 850 00:36:24,310 --> 00:36:25,850 >> DAVID MALAN: They're always going to go together, right? 851 00:36:25,850 --> 00:36:28,660 When it turns out, we used to think this was the case until we exhaustively 852 00:36:28,660 --> 00:36:30,570 searched the whole US, and turns out that there 853 00:36:30,570 --> 00:36:32,653 are some inconsistencies where multiple towns have 854 00:36:32,653 --> 00:36:35,060 the same zip, which is weird. 855 00:36:35,060 --> 00:36:40,580 But, if we stipulate for now that 02138 is always Cambridge, Massachusetts, 856 00:36:40,580 --> 00:36:44,910 why in the world would you store in your database Cambridge and MA and 02138 857 00:36:44,910 --> 00:36:49,357 for me and for Hannah and for Rob and for [? Davon ?] and for others who live 858 00:36:49,357 --> 00:36:51,190 here in Cambridge, it's perfectly redundant. 859 00:36:51,190 --> 00:36:54,480 >> We should get away with just storing what? 860 00:36:54,480 --> 00:36:55,610 Just the zip code. 861 00:36:55,610 --> 00:36:58,660 But then, if we store just the zip code, I do want, probably, 862 00:36:58,660 --> 00:37:02,160 for my website to know where 02138 is. 863 00:37:02,160 --> 00:37:03,910 So, I need another table. 864 00:37:03,910 --> 00:37:04,697 And that's OK. 865 00:37:04,697 --> 00:37:07,530 And in fact, this is one of the design processes of designing tables 866 00:37:07,530 --> 00:37:11,472 that you'll do in PSet7 as well whereby you want to factor out common data. 867 00:37:11,472 --> 00:37:14,430 Just like we've been factoring out common code and factoring out common 868 00:37:14,430 --> 00:37:17,380 styles from CSS, here too in the database, 869 00:37:17,380 --> 00:37:21,180 if I only need 02138 to uniquely identify someone's hometown, 870 00:37:21,180 --> 00:37:25,020 don't store Cambridge, Mass for every darn user in your table. 871 00:37:25,020 --> 00:37:29,770 >> Instead, have a separate table called Zips that should have what columns? 872 00:37:29,770 --> 00:37:33,490 Probably an ID field, just because, for the principles we're talking about now. 873 00:37:33,490 --> 00:37:35,720 Probably a zip field for 02138. 874 00:37:35,720 --> 00:37:38,400 And then probably what other columns? 875 00:37:38,400 --> 00:37:42,950 City and state, but only have one row for 02138, one row for 02139, 876 00:37:42,950 --> 00:37:44,772 one row for 90210. 877 00:37:44,772 --> 00:37:46,730 And that is literally all the zip codes I know. 878 00:37:46,730 --> 00:37:49,012 >> So now, what can you do? 879 00:37:49,012 --> 00:37:51,220 This is problematic, because now I've got two tables. 880 00:37:51,220 --> 00:37:54,660 So, my users are mostly over here, but their city state information's 881 00:37:54,660 --> 00:37:55,390 over here. 882 00:37:55,390 --> 00:37:58,635 So, it turns out with SQL, there's actually a way to join information, 883 00:37:58,635 --> 00:38:00,470 and you'll see this in the PSet. 884 00:38:00,470 --> 00:38:03,000 >> But it turns out you can do something like this. 885 00:38:03,000 --> 00:38:10,501 SELECT star FROM users, JOIN zips ON users dot zip equals zips dot zip. 886 00:38:10,501 --> 00:38:13,360 Which is a little wordy, admittedly, but this just 887 00:38:13,360 --> 00:38:17,590 means select everything from the process of taking my users table 888 00:38:17,590 --> 00:38:19,580 and my zips table. 889 00:38:19,580 --> 00:38:22,120 Join them on the one field they have in column. 890 00:38:22,120 --> 00:38:24,780 So, literally doing something like this, and give me back 891 00:38:24,780 --> 00:38:27,360 a new temporary table that's wider, that's bigger, 892 00:38:27,360 --> 00:38:29,450 that has all of the columns from both of them. 893 00:38:29,450 --> 00:38:33,510 And that, quite simply, would be the syntax for doing something like this. 894 00:38:33,510 --> 00:38:35,540 >> So, there's this ahead, but there's going 895 00:38:35,540 --> 00:38:38,950 to be other design decisions you'll have to make, not only with indexes 896 00:38:38,950 --> 00:38:40,550 but also running into challenges. 897 00:38:40,550 --> 00:38:43,360 In fact, there's a challenge in any database design 898 00:38:43,360 --> 00:38:47,930 whereby sometimes two people might want to access the same rows of the database 899 00:38:47,930 --> 00:38:48,530 table. 900 00:38:48,530 --> 00:38:51,450 So, this is something that we'll encounter in PSet7 as well. 901 00:38:51,450 --> 00:38:54,686 >> But I thought I'd look at one attack that's possible in SQL. 902 00:38:54,686 --> 00:38:56,560 What are some of the problems that can arise? 903 00:38:56,560 --> 00:38:58,170 So, you'll encounter this in PSet7. 904 00:38:58,170 --> 00:39:01,874 And we tell you outright what the coding solution for this problem is. 905 00:39:01,874 --> 00:39:04,790 But if you take a higher level class, especially in operating systems, 906 00:39:04,790 --> 00:39:06,950 you're going to encounter an issue of atomicity, 907 00:39:06,950 --> 00:39:10,080 the problem of trying to do multiple things all at once 908 00:39:10,080 --> 00:39:11,000 without interruption. 909 00:39:11,000 --> 00:39:14,560 >> And I thought I'd introduce this idea for PSet7 with a metaphor 910 00:39:14,560 --> 00:39:18,160 that I learned myself in Margo Seltzer's CS164 operating systems 911 00:39:18,160 --> 00:39:18,990 class years ago. 912 00:39:18,990 --> 00:39:22,230 Suppose that you have one of these dorm fridges in your dorm room or house, 913 00:39:22,230 --> 00:39:24,474 and you have a real penchant for milk. 914 00:39:24,474 --> 00:39:27,140 And so, you come home from classes one day, you open the fridge. 915 00:39:27,140 --> 00:39:27,620 Oh, damn it. 916 00:39:27,620 --> 00:39:28,870 There's no milk in the fridge. 917 00:39:28,870 --> 00:39:32,470 So, you close the fridge, lock the door, lock your dorm, 918 00:39:32,470 --> 00:39:34,770 walk around the corner to CVS, get in line, 919 00:39:34,770 --> 00:39:36,312 and start checking out for some milk. 920 00:39:36,312 --> 00:39:38,978 And it's going to take a while, because those damn self checkout 921 00:39:38,978 --> 00:39:40,570 counters take forever to use anyway. 922 00:39:40,570 --> 00:39:41,950 So meanwhile, your roommate comes home. 923 00:39:41,950 --> 00:39:43,470 He or she really likes milk as well. 924 00:39:43,470 --> 00:39:45,520 They come into the dorm room, open the fridge, oh, darn it. 925 00:39:45,520 --> 00:39:46,490 There's no more milk. 926 00:39:46,490 --> 00:39:49,040 >> So, he or she also goes around the corner. 927 00:39:49,040 --> 00:39:51,670 But now, since there's like two or three or four CVSes nearby, 928 00:39:51,670 --> 00:39:53,800 they happen to go to one of the different ones in the square. 929 00:39:53,800 --> 00:39:55,830 And so now, a few minutes later, both of you 930 00:39:55,830 --> 00:39:58,060 come home and ugh, worst problem ever. 931 00:39:58,060 --> 00:40:00,967 Now you have too much milk because it's going to go sour. 932 00:40:00,967 --> 00:40:03,050 And you like milk, but you don't really like milk. 933 00:40:03,050 --> 00:40:06,730 >> So now, this was an expensive mistake because both of you 934 00:40:06,730 --> 00:40:09,870 made a decision based on the state of some variable that 935 00:40:09,870 --> 00:40:12,660 was in the process of being changed by you, 936 00:40:12,660 --> 00:40:14,560 the initiator of going to get milk. 937 00:40:14,560 --> 00:40:17,785 So, what is perhaps a human solution to that problem? 938 00:40:17,785 --> 00:40:18,660 AUDIENCE: [INAUDIBLE] 939 00:40:18,660 --> 00:40:19,430 DAVID MALAN: Leave a note, right? 940 00:40:19,430 --> 00:40:21,850 Always leave a note, if you're familiar with that show. 941 00:40:21,850 --> 00:40:23,100 Yes, there are two of us. 942 00:40:23,100 --> 00:40:25,940 So, always leave a note, or literally lock the refrigerator 943 00:40:25,940 --> 00:40:28,602 with some kind of padlock or something over the top like that. 944 00:40:28,602 --> 00:40:31,310 But that's actually going to be key problem with database design, 945 00:40:31,310 --> 00:40:34,710 especially when you might have multiple browsers, multiple laptops, 946 00:40:34,710 --> 00:40:37,450 multiple users all trying to update information at once. 947 00:40:37,450 --> 00:40:40,590 Particularly sensitive information like financial information, 948 00:40:40,590 --> 00:40:43,350 whereby with a stock trading website like you'll be building, 949 00:40:43,350 --> 00:40:47,270 what if you want to check how much money you have and then if you have enough, 950 00:40:47,270 --> 00:40:48,490 buy some stock? 951 00:40:48,490 --> 00:40:50,899 >> But what if someone else who has a joint account with you 952 00:40:50,899 --> 00:40:52,690 is simultaneously trying to buy some stock? 953 00:40:52,690 --> 00:40:55,190 So, he or she is checking the account balance, both of you 954 00:40:55,190 --> 00:40:57,540 get back the same answer, there's no milk. 955 00:40:57,540 --> 00:41:00,580 Or both of you get back the answer, you have $100 in the account. 956 00:41:00,580 --> 00:41:04,680 Both of you try to make the decision to buy one share of some company stock. 957 00:41:04,680 --> 00:41:06,130 >> And now, what happens? 958 00:41:06,130 --> 00:41:07,140 You have two shares? 959 00:41:07,140 --> 00:41:08,420 You have no shares? 960 00:41:08,420 --> 00:41:10,320 Problems like that can arise. 961 00:41:10,320 --> 00:41:11,755 So, we'll encounter that. 962 00:41:11,755 --> 00:41:14,630 SQL injection attacks, thankfully, are something we'll help you with, 963 00:41:14,630 --> 00:41:17,430 but these are atrociously common these days still. 964 00:41:17,430 --> 00:41:18,680 So, this is just an example. 965 00:41:18,680 --> 00:41:21,290 I make no claims that Harvard PIN system is 966 00:41:21,290 --> 00:41:23,130 vulnerable to this particular attack. 967 00:41:23,130 --> 00:41:24,160 We've tried. 968 00:41:24,160 --> 00:41:26,120 But, you know that we have a field like this. 969 00:41:26,120 --> 00:41:29,620 And Yale's Net ID has a similar looking screen these days. 970 00:41:29,620 --> 00:41:33,190 And it turns out, that maybe the PIN system is implemented in PHP. 971 00:41:33,190 --> 00:41:37,050 >> And if it were-- it's not-- they might have code that looks like this. 972 00:41:37,050 --> 00:41:38,210 They have two variables. 973 00:41:38,210 --> 00:41:42,495 Give me the username and password from the post super global variable 974 00:41:42,495 --> 00:41:43,970 that we talked about earlier. 975 00:41:43,970 --> 00:41:47,310 Maybe Harvard has a query like SELECT star FROM users 976 00:41:47,310 --> 00:41:50,005 where username equals that and password equals that. 977 00:41:50,005 --> 00:41:51,880 And notice that I'm just plugging it in using 978 00:41:51,880 --> 00:41:55,050 the curly brace notation from the other day, which means just plug in a value 979 00:41:55,050 --> 00:41:55,550 here. 980 00:41:55,550 --> 00:41:57,449 I'm not using the question mark technique. 981 00:41:57,449 --> 00:41:59,240 I don't have any second or third arguments. 982 00:41:59,240 --> 00:42:02,350 I'm just literally constructing the string myself. 983 00:42:02,350 --> 00:42:04,930 >> The problem, though, is that if someone like a scroob, 984 00:42:04,930 --> 00:42:09,020 which is a reference to a film, logs in with something like this, 985 00:42:09,020 --> 00:42:11,250 and I've removed the dots that usually cover up 986 00:42:11,250 --> 00:42:14,370 passwords, what if he is particularly malicious 987 00:42:14,370 --> 00:42:18,860 and his password maybe is 12345, per the movie called "Spaceballs," 988 00:42:18,860 --> 00:42:21,970 but he critically types a single quote after the five, 989 00:42:21,970 --> 00:42:24,790 then literally the word or in the space, and then quote, 990 00:42:24,790 --> 00:42:29,160 unquote one equals quote one, but notice he's omitted what? 991 00:42:29,160 --> 00:42:32,700 He's omitted the quote on the right and he's omitted the quote on the left. 992 00:42:32,700 --> 00:42:35,170 >> Because if this attacker scroob's presumption 993 00:42:35,170 --> 00:42:38,160 is that the people who wrote this PHP code weren't so bright, 994 00:42:38,160 --> 00:42:42,990 maybe they just have some single quotes around the interpolation 995 00:42:42,990 --> 00:42:45,210 of a variable in curly braces? 996 00:42:45,210 --> 00:42:48,620 And so maybe, he could kind of complete their thought 997 00:42:48,620 --> 00:42:53,290 for them, but in a way that's going to let him hacked into the PIN system. 998 00:42:53,290 --> 00:42:55,310 In other words, suppose that this is the code 999 00:42:55,310 --> 00:42:57,140 and we now plug in what scroob typed. 1000 00:42:57,140 --> 00:42:58,770 And it's red, because it's bad. 1001 00:42:58,770 --> 00:43:01,310 >> And the underlying text is what he typed in, 1002 00:43:01,310 --> 00:43:05,510 scroob could trick Harvard's server into constructing a SQL query 1003 00:43:05,510 --> 00:43:07,440 string that looks like this. 1004 00:43:07,440 --> 00:43:11,760 Password equals 12345 or one equals one. 1005 00:43:11,760 --> 00:43:14,820 The result of which, logically, is that this will log scroob 1006 00:43:14,820 --> 00:43:18,360 in if his password is 12345 or if one equals 1007 00:43:18,360 --> 00:43:22,660 one, which is of course always true, which means scroob always gets in. 1008 00:43:22,660 --> 00:43:26,060 >> And so, the way to fix this, as in a lot of cases, 1009 00:43:26,060 --> 00:43:28,140 would be to write more defensively. 1010 00:43:28,140 --> 00:43:30,390 To use something like our actual query function, which 1011 00:43:30,390 --> 00:43:33,980 you'll see in PSet7, where we plug in something like question marks here. 1012 00:43:33,980 --> 00:43:35,980 And the beauty of the query function that we 1013 00:43:35,980 --> 00:43:40,010 give you is it defends against these so-called SQL injection attacks, where 1014 00:43:40,010 --> 00:43:44,260 someone is tricking your code into injecting his or her own SQL code. 1015 00:43:44,260 --> 00:43:47,380 Because what the query function we give you will actually do, 1016 00:43:47,380 --> 00:43:51,270 if you use the question mark syntax and a second and a third argument here, 1017 00:43:51,270 --> 00:43:54,590 is what did it add to the input that the user provided? 1018 00:43:54,590 --> 00:43:56,060 Those backslash quotes. 1019 00:43:56,060 --> 00:43:58,590 >> So, it escapes any potentially dangerous characters. 1020 00:43:58,590 --> 00:44:01,000 This looks weird now, but it's not vulnerable 1021 00:44:01,000 --> 00:44:03,260 because it doesn't change the logic anymore 1022 00:44:03,260 --> 00:44:06,470 because that whole password is now a single quote that's not, 1023 00:44:06,470 --> 00:44:07,596 in fact, scroob's password. 1024 00:44:07,596 --> 00:44:09,845 So, there's been some jokes about this over the years. 1025 00:44:09,845 --> 00:44:12,570 So, this was a photo taken of some geek in a parking lot 1026 00:44:12,570 --> 00:44:16,620 whereby you might know that some cities and states try to scan your license 1027 00:44:16,620 --> 00:44:19,460 plate to bill you or to ticket you if you go through without, like, 1028 00:44:19,460 --> 00:44:20,660 the E-Z Pass thing. 1029 00:44:20,660 --> 00:44:24,490 So, this person presumed that maybe the people writing the E-Z Pass system 1030 00:44:24,490 --> 00:44:28,240 were not so bright, and maybe they just concatenated together a string, 1031 00:44:28,240 --> 00:44:32,190 so that he or she couldn't maliciously not just complete their thought, 1032 00:44:32,190 --> 00:44:35,150 but actually execute a bad command, which we've not mentioned yet, 1033 00:44:35,150 --> 00:44:36,380 but you can probably guess. 1034 00:44:36,380 --> 00:44:39,820 That in addition to delete and insert and update and select, 1035 00:44:39,820 --> 00:44:43,370 there's also a keyword called drop, which literally deletes everything 1036 00:44:43,370 --> 00:44:45,300 in the database, which is particularly bad. 1037 00:44:45,300 --> 00:44:48,760 >> We can zoom in on this if it's a little tough to see. 1038 00:44:48,760 --> 00:44:52,300 This, now, is a famous cartoon that's wonderfully clever now 1039 00:44:52,300 --> 00:44:53,145 and understandable. 1040 00:44:53,145 --> 00:45:00,880 1041 00:45:00,880 --> 00:45:04,750 >> [LAUGHTER] 1042 00:45:04,750 --> 00:45:05,910 >> Yeah, cool. 1043 00:45:05,910 --> 00:45:06,800 Kind of geeking out. 1044 00:45:06,800 --> 00:45:08,800 So these, then, are SQL injection attacks. 1045 00:45:08,800 --> 00:45:13,050 And they're so easy to avoid by using the right code or the right libraries. 1046 00:45:13,050 --> 00:45:15,947 And you'll see in PSet7, that's why we give you the query function. 1047 00:45:15,947 --> 00:45:17,780 So, a couple of teasers that we thought we'd 1048 00:45:17,780 --> 00:45:19,930 give you here in our remaining minutes together. 1049 00:45:19,930 --> 00:45:24,030 So, as you remember from week zero, we introduced these two light bulbs which 1050 00:45:24,030 --> 00:45:26,610 are nice, not just because they're pretty and are colorful, 1051 00:45:26,610 --> 00:45:29,450 but because they support something called an API, an Application 1052 00:45:29,450 --> 00:45:31,980 Programming Interface And in CS50 thus far, we've 1053 00:45:31,980 --> 00:45:34,440 mostly focused on GET and POST, but it turns out 1054 00:45:34,440 --> 00:45:37,390 there's other HTTP verbs like PUT. 1055 00:45:37,390 --> 00:45:39,430 >> And in fact, this was a slide from week zero 1056 00:45:39,430 --> 00:45:44,930 whereby if you write code that sends a la PSet6 an HTTP request that 1057 00:45:44,930 --> 00:45:49,647 looks like this with this chunk of text at the bottom, which is called JSON, 1058 00:45:49,647 --> 00:45:52,230 or JavaScript Object Notation that we'll talk about next week, 1059 00:45:52,230 --> 00:45:57,030 you can turn on or turn off or change the color of lights like those. 1060 00:45:57,030 --> 00:46:00,480 So if CS50 also has in addition to some of those light bulbs here in New Haven 1061 00:46:00,480 --> 00:46:02,480 if you'd like to borrow them for final projects, 1062 00:46:02,480 --> 00:46:04,370 also some Microsoft Bands, which are like 1063 00:46:04,370 --> 00:46:07,619 watches that you wear around your wrist that similarly have an API so that you 1064 00:46:07,619 --> 00:46:10,040 can write your own software for them. 1065 00:46:10,040 --> 00:46:12,490 >> We have an account with Apple's iOS code so 1066 00:46:12,490 --> 00:46:15,510 that if you have an Apple Watch or an iPhone or an iPad or an iPod, 1067 00:46:15,510 --> 00:46:17,707 you can write code that actually runs on those. 1068 00:46:17,707 --> 00:46:19,540 We have a whole bunch of Arduinos, which are 1069 00:46:19,540 --> 00:46:22,010 tiny little computers without cases, essentially, 1070 00:46:22,010 --> 00:46:25,240 that you can connect via USB, typically to your own Mac or PC, 1071 00:46:25,240 --> 00:46:28,810 write code that runs on these physical devices that often have sensors on them 1072 00:46:28,810 --> 00:46:30,790 so you can interact with the real world. 1073 00:46:30,790 --> 00:46:32,860 We have a whole bunch of Leap Motion devices, 1074 00:46:32,860 --> 00:46:36,500 which are USB devices for Macs and PCs, here and again, in New Haven. 1075 00:46:36,500 --> 00:46:40,080 And if you connect it to your Mac, you can actually control your computer 1076 00:46:40,080 --> 00:46:42,550 by writing software that via infrared beams, 1077 00:46:42,550 --> 00:46:46,360 figures out where your human hands are, even without touching your keyboard. 1078 00:46:46,360 --> 00:46:49,135 We thought we'd share a quick glimpse at this, for instance. 1079 00:46:49,135 --> 00:46:51,428 >> [MUSIC PLAYING] 1080 00:46:51,428 --> 00:47:55,840 1081 00:47:55,840 --> 00:47:57,590 So, we have a whole bunch of these things, 1082 00:47:57,590 --> 00:48:01,040 too, called Myo arm bands which you put over your forearm 1083 00:48:01,040 --> 00:48:04,595 and then you can control the real world or the virtual world like this. 1084 00:48:04,595 --> 00:48:06,471 >> [MUSIC PLAYING] 1085 00:48:06,471 --> 00:49:17,580 1086 00:49:17,580 --> 00:49:20,920 Or, we also have some Google Cardboard, which is literally, like, 1087 00:49:20,920 --> 00:49:24,841 a cardboard box you could put on your face, but slide in your phone into it 1088 00:49:24,841 --> 00:49:27,590 so that you put the glass of your phone really close to your eyes. 1089 00:49:27,590 --> 00:49:30,190 And Google Cardboard is pretty cheap at $10 or $20. 1090 00:49:30,190 --> 00:49:32,230 And it has little lenses that slightly off shift 1091 00:49:32,230 --> 00:49:35,900 the image on the screen for your human eyes to give you a sense of depth 1092 00:49:35,900 --> 00:49:39,550 so that you actually have a 3D environment in front of you. 1093 00:49:39,550 --> 00:49:42,927 We also have some Samsung Gear, which is the more expensive version of this, 1094 00:49:42,927 --> 00:49:46,010 but that can similarly slide in an Android phone and give you the illusion 1095 00:49:46,010 --> 00:49:48,309 of-- or give the experience of virtual reality . 1096 00:49:48,309 --> 00:49:50,850 And in our final two minutes, we thought we'd try to do this. 1097 00:49:50,850 --> 00:49:55,250 If I can project what Colton has here just to whet your appetite, 1098 00:49:55,250 --> 00:49:58,442 let me go ahead and throw up on the big screen here. 1099 00:49:58,442 --> 00:49:59,400 Let me kill the lights. 1100 00:49:59,400 --> 00:50:02,290 Colton, do you want to go ahead and put on your cell for a moment 1101 00:50:02,290 --> 00:50:05,171 and come on over to the middle of the stage? 1102 00:50:05,171 --> 00:50:07,420 And do you want to project-- this is what Colton sees. 1103 00:50:07,420 --> 00:50:10,560 >> Now, the Wi-Fi in here is not so strong for this device 1104 00:50:10,560 --> 00:50:13,870 that this is super compelling, but Colton is literally 1105 00:50:13,870 --> 00:50:15,710 in this magical futuristic place. 1106 00:50:15,710 --> 00:50:16,796 He only sees one image. 1107 00:50:16,796 --> 00:50:19,920 You are seeing his left and right eye that his brain are stitching together 1108 00:50:19,920 --> 00:50:22,260 in a three dimensional environment on his face. 1109 00:50:22,260 --> 00:50:24,319 He's just selected a menu option here. 1110 00:50:24,319 --> 00:50:27,360 And so again, he's wearing this headset with a Samsung phone on it that's 1111 00:50:27,360 --> 00:50:29,080 wirelessly projecting to our overhead. 1112 00:50:29,080 --> 00:50:30,349 Now you're on Mars, I think? 1113 00:50:30,349 --> 00:50:31,140 COLTON: I think so. 1114 00:50:31,140 --> 00:50:32,181 I'm not sure [INAUDIBLE]. 1115 00:50:32,181 --> 00:50:34,250 [LAUGHTER] 1116 00:50:34,250 --> 00:50:36,374 >> DAVID MALAN: Turns out Mars has these menus. 1117 00:50:36,374 --> 00:50:41,590 >> COLTON: [INAUDIBLE] some cool places if we want to go to-- 1118 00:50:41,590 --> 00:50:43,330 >> DAVID MALAN: Where do we want to go? 1119 00:50:43,330 --> 00:50:45,837 >> COLTON: [INAUDIBLE] 1120 00:50:45,837 --> 00:50:48,170 DAVID MALAN: And let's see where Colton's taking us now. 1121 00:50:48,170 --> 00:50:48,961 COLTON: [INAUDIBLE] 1122 00:50:48,961 --> 00:50:52,830 1123 00:50:52,830 --> 00:50:56,380 >> DAVID MALAN: So, there's so many different places you can take yourself. 1124 00:50:56,380 --> 00:51:00,590 There's FAPIs via which you can write games or interactions that 1125 00:51:00,590 --> 00:51:01,950 run, ultimately, on the phone. 1126 00:51:01,950 --> 00:51:03,908 So, you really just writing a mobile phone app. 1127 00:51:03,908 --> 00:51:06,380 But thanks to the software and the graphics capabilities, 1128 00:51:06,380 --> 00:51:08,765 now Colton is in this tiny little cottage. 1129 00:51:08,765 --> 00:51:10,515 And at the risk of overwhelming ourselves, 1130 00:51:10,515 --> 00:51:13,330 Colton and I'll stick around for while at the end of class here today 1131 00:51:13,330 --> 00:51:14,300 if you'd like to come up and play. 1132 00:51:14,300 --> 00:51:16,350 And we'll bring them back next week as well. 1133 00:51:16,350 --> 00:51:18,420 Without further, ado that's it for today. 1134 00:51:18,420 --> 00:51:21,990 We'll see you next week. 1135 00:51:21,990 --> 00:51:24,140 >> [MUSIC - RAGGA TWINS, "BAD MAN"] 1136 00:51:24,140 --> 00:55:23,146