1 00:00:00,000 --> 00:00:03,479 [MUSIC PLAYING] 2 00:00:03,479 --> 00:00:15,829 3 00:00:15,829 --> 00:00:16,620 SPEAKER: All right. 4 00:00:16,620 --> 00:00:19,950 This is CS50, and this is week nine. 5 00:00:19,950 --> 00:00:24,232 You'll recall that last time, we took a look back at such things as Scratch. 6 00:00:24,232 --> 00:00:26,190 Because recall that when we introduced Scratch, 7 00:00:26,190 --> 00:00:27,981 we introduced a whole number of programming 8 00:00:27,981 --> 00:00:31,280 constructs-- loops and conditions and functions and variables and more. 9 00:00:31,280 --> 00:00:34,650 And then just one week later did we transition from that world of Scratch 10 00:00:34,650 --> 00:00:38,680 to C, where the syntax was much more cryptic looking, certainly 11 00:00:38,680 --> 00:00:40,960 at first glance, and perhaps a little bit still. 12 00:00:40,960 --> 00:00:43,040 But the ideas were ultimately the same. 13 00:00:43,040 --> 00:00:46,580 And now in week nine as we transition from the world of C to Python, 14 00:00:46,580 --> 00:00:51,670 you'll find that that same finding is the case, whereby 15 00:00:51,670 --> 00:00:53,630 we are using the same ideas. 16 00:00:53,630 --> 00:00:55,110 We're leveraging the same concepts. 17 00:00:55,110 --> 00:00:57,880 But we have to translate it now to a slightly different domain 18 00:00:57,880 --> 00:00:59,694 and a slightly different syntax. 19 00:00:59,694 --> 00:01:01,860 But things really start to get interesting this week 20 00:01:01,860 --> 00:01:04,879 and beyond, especially as we build on our ability 21 00:01:04,879 --> 00:01:07,750 to write Python code, our ability to serve up web applications. 22 00:01:07,750 --> 00:01:11,950 Because now we can begin to leverage so much more functionality 23 00:01:11,950 --> 00:01:14,620 than comes with C, than comes with the CS50 library, 24 00:01:14,620 --> 00:01:17,420 because there's such a large community of software developers 25 00:01:17,420 --> 00:01:20,890 who have created some really amazing things that we can try out. 26 00:01:20,890 --> 00:01:24,030 So let's look further today at Python, and let's look further 27 00:01:24,030 --> 00:01:26,750 at the applications for a language like this. 28 00:01:26,750 --> 00:01:30,610 You'll recall that perhaps the simplest program we could have written last time 29 00:01:30,610 --> 00:01:33,240 was actually just one line-- print Hello World. 30 00:01:33,240 --> 00:01:36,390 But you'll recall as you began to dive into some of this past week's 31 00:01:36,390 --> 00:01:39,110 challenges, you might have needed or wanted 32 00:01:39,110 --> 00:01:42,750 to actually start wrapping code like that in a method or function like main, 33 00:01:42,750 --> 00:01:45,930 and then calling it by default with this magical incantation here. 34 00:01:45,930 --> 00:01:48,620 But this was the building block toward which we were starting 35 00:01:48,620 --> 00:01:51,350 to develop more interesting programs. 36 00:01:51,350 --> 00:01:55,640 But now we're going to really context switch from a command line environment 37 00:01:55,640 --> 00:01:57,770 over to a web-based environment. 38 00:01:57,770 --> 00:02:01,250 And the world has been writing web-based applications for quite some time. 39 00:02:01,250 --> 00:02:05,281 Even I 20 years ago recall made that Frosh IMs-- freshman intramural sports 40 00:02:05,281 --> 00:02:05,780 website. 41 00:02:05,780 --> 00:02:09,090 And even since then have the languages changed and the paradigms changed, 42 00:02:09,090 --> 00:02:14,210 and we humans have learned a lot about programming for web-based applications. 43 00:02:14,210 --> 00:02:20,440 And this, for instance, is one design pattern or one architecture 44 00:02:20,440 --> 00:02:23,410 that has arisen, whereby MVC refers again 45 00:02:23,410 --> 00:02:27,540 to this pattern, whereby you put your intelligence, your so-called business 46 00:02:27,540 --> 00:02:28,810 logic in your controllers. 47 00:02:28,810 --> 00:02:31,061 This is one or more files that has all the conditions, 48 00:02:31,061 --> 00:02:33,601 a lot of the functionality calls, and actually does something 49 00:02:33,601 --> 00:02:34,510 with your program. 50 00:02:34,510 --> 00:02:38,740 Then you have the views, which are often templates or files that 51 00:02:38,740 --> 00:02:41,930 render information that you might have dynamically generated 52 00:02:41,930 --> 00:02:43,330 or input from users. 53 00:02:43,330 --> 00:02:44,390 And then model. 54 00:02:44,390 --> 00:02:48,930 And today, we start to dwell on the M in MVC, model, because last week, 55 00:02:48,930 --> 00:02:51,350 we didn't really leverage much, if any, of a model. 56 00:02:51,350 --> 00:02:54,670 But this week we're finally going to demand of ourselves the ability 57 00:02:54,670 --> 00:02:58,730 to save data, retrieve data, search data, delete data, and more. 58 00:02:58,730 --> 00:03:03,190 And we really haven't had this capability, other than very simple CSV 59 00:03:03,190 --> 00:03:05,170 files, for instance back in the day of C, 60 00:03:05,170 --> 00:03:08,390 and even last week when we dabbled with those in Python. 61 00:03:08,390 --> 00:03:10,880 And you'll recall, last time we introduced this. 62 00:03:10,880 --> 00:03:13,020 Flask is what's called a micro framework. 63 00:03:13,020 --> 00:03:15,990 So a bunch of files, a bunch of code that a community of people 64 00:03:15,990 --> 00:03:19,180 have written that just make it easier to make web-based applications. 65 00:03:19,180 --> 00:03:20,820 It's absolutely not required. 66 00:03:20,820 --> 00:03:24,600 In fact, one of the sample programs among last week's distribution code 67 00:03:24,600 --> 00:03:27,870 if you like to go back and play was a program called serve.py 68 00:03:27,870 --> 00:03:29,080 that doesn't use any of this. 69 00:03:29,080 --> 00:03:32,290 It just uses built-in Python functionality. 70 00:03:32,290 --> 00:03:34,580 But you'll find that it's pretty cryptic. 71 00:03:34,580 --> 00:03:37,680 It's pretty heavyweight in order just to do something simple. 72 00:03:37,680 --> 00:03:40,880 And so things like Flask have come around that just make it easy 73 00:03:40,880 --> 00:03:43,890 and dare say more pleasurable to write web-based applications, 74 00:03:43,890 --> 00:03:47,980 because people have realized in writing web app after web app after web app 75 00:03:47,980 --> 00:03:50,804 that they're just repeating themselves, or borrowing code 76 00:03:50,804 --> 00:03:51,970 they've written in the past. 77 00:03:51,970 --> 00:03:56,240 And so in frameworks, you have solutions typically to very common problems. 78 00:03:56,240 --> 00:03:58,980 So we'll use this set of solutions to the development 79 00:03:58,980 --> 00:04:00,540 of a web-based application. 80 00:04:00,540 --> 00:04:03,110 And for instance, the simplest Flask application 81 00:04:03,110 --> 00:04:05,280 that's also available from last week's source code 82 00:04:05,280 --> 00:04:07,850 might be something like this, whereby the top of your file, 83 00:04:07,850 --> 00:04:11,690 you don't import like the CS50 library, you instead import someone else's 84 00:04:11,690 --> 00:04:15,670 library, specifically from the Flask framework import-- what's 85 00:04:15,670 --> 00:04:18,899 called the class, recall, called Flask-- capital F. 86 00:04:18,899 --> 00:04:22,730 And then also a function or method called Render Template. 87 00:04:22,730 --> 00:04:24,560 And we used both of those as follows. 88 00:04:24,560 --> 00:04:28,030 Last week, we instantiated a Flask-based application 89 00:04:28,030 --> 00:04:30,030 by essentially passing in this special reference 90 00:04:30,030 --> 00:04:32,780 to the current file, the name of the current file, 91 00:04:32,780 --> 00:04:34,780 and then allowing the framework to do its thing 92 00:04:34,780 --> 00:04:38,889 and give us back this very special, very powerful object called app-- 93 00:04:38,889 --> 00:04:41,180 though we could have called it anything-- that gives us 94 00:04:41,180 --> 00:04:42,910 access to some useful functionality. 95 00:04:42,910 --> 00:04:45,290 For instance, the most useful functionality 96 00:04:45,290 --> 00:04:47,510 initially was just this-- @app. 97 00:04:47,510 --> 00:04:50,550 And this is just syntax for what's called a decorator, and more 98 00:04:50,550 --> 00:04:53,610 and that some other time, or more on that in Flask's own documentation. 99 00:04:53,610 --> 00:04:55,960 But essentially, this line here-- @app.route, 100 00:04:55,960 --> 00:05:02,410 says that hey server, any time you see an HTTP request for slash, the default 101 00:05:02,410 --> 00:05:06,994 web page typically of a website, go ahead and call the following function 102 00:05:06,994 --> 00:05:08,160 that's immediately below it. 103 00:05:08,160 --> 00:05:10,532 That function I've called Index, mostly by convention. 104 00:05:10,532 --> 00:05:12,240 But I could've called it anything I want. 105 00:05:12,240 --> 00:05:16,040 And all it did last week in this example was render a template. 106 00:05:16,040 --> 00:05:18,240 In this case, index.html. 107 00:05:18,240 --> 00:05:20,710 Now, that could be raw HTML recall. 108 00:05:20,710 --> 00:05:24,060 But oftentimes, you use something called a templating language. 109 00:05:24,060 --> 00:05:27,110 And indeed, we introduced a little bit of Ginga last time, which 110 00:05:27,110 --> 00:05:31,200 is a Python-based templating language that we'll see just 111 00:05:31,200 --> 00:05:34,440 makes it easier to generate HTML without having 112 00:05:34,440 --> 00:05:38,030 to write HTML inside of our actual Python 113 00:05:38,030 --> 00:05:40,280 code, which tends to be frowned upon. 114 00:05:40,280 --> 00:05:44,010 So let's take a look back at one of those examples, which 115 00:05:44,010 --> 00:05:47,470 I've renamed from Frosh IMs last week to Frosh IMs zero. 116 00:05:47,470 --> 00:05:49,640 And recall that we had the following files. 117 00:05:49,640 --> 00:05:51,960 We had a templates directory, inside of which 118 00:05:51,960 --> 00:05:54,850 was failure, index, layout, and success. 119 00:05:54,850 --> 00:05:57,677 Kind of a lot of complexity for a pretty simple program, 120 00:05:57,677 --> 00:05:59,510 but we'll see what each of those does again. 121 00:05:59,510 --> 00:06:00,590 And then application. 122 00:06:00,590 --> 00:06:04,320 And this was the so-called controller code that actually did something 123 00:06:04,320 --> 00:06:05,610 interesting last week. 124 00:06:05,610 --> 00:06:06,540 Now, what was that? 125 00:06:06,540 --> 00:06:08,050 Well, it's a pretty small program. 126 00:06:08,050 --> 00:06:10,840 As before, I've imported from the Flask framework 127 00:06:10,840 --> 00:06:12,590 a whole bunch of symbols here. 128 00:06:12,590 --> 00:06:14,430 I'm instantiating my application here. 129 00:06:14,430 --> 00:06:16,580 This is copy-paste from our simplest of examples, 130 00:06:16,580 --> 00:06:21,330 whereby if the user just visits Slash, I want to show him or her index.html. 131 00:06:21,330 --> 00:06:23,040 But then it got interesting. 132 00:06:23,040 --> 00:06:25,990 This really was the first time we had the capability 133 00:06:25,990 --> 00:06:30,550 in a web-based environment to respond dynamically to user's inputs 134 00:06:30,550 --> 00:06:33,070 based on whatever they typed into a web form. 135 00:06:33,070 --> 00:06:35,210 In fact, if you think back a few weeks in week six 136 00:06:35,210 --> 00:06:38,610 when we first introduced the web and HTTP and TCP/IP 137 00:06:38,610 --> 00:06:42,510 and making HTML-based web pages, you recall that all we did 138 00:06:42,510 --> 00:06:45,510 was implement Google's front end, and an ugly one at that. 139 00:06:45,510 --> 00:06:48,195 But just the HTML form, that if you click the Submit button, 140 00:06:48,195 --> 00:06:52,030 it submits to Slash search on google.com, 141 00:06:52,030 --> 00:06:54,740 because we pretty much deferred completely to them, 142 00:06:54,740 --> 00:06:57,110 lacking at the time a backend and lacking at the time 143 00:06:57,110 --> 00:07:00,220 even a language in which we could implement our own backend, 144 00:07:00,220 --> 00:07:02,940 a web server that actually responds to those requests. 145 00:07:02,940 --> 00:07:06,190 But in Frosh IMs Zero, we have the ability 146 00:07:06,190 --> 00:07:08,440 to have our own route, in this case called 147 00:07:08,440 --> 00:07:11,190 Slash Register that I've specified isn't even 148 00:07:11,190 --> 00:07:15,670 going to respond to HTTP Get requests, but rather Post requests, which 149 00:07:15,670 --> 00:07:18,570 typically mean a form submission is sending one or more fields 150 00:07:18,570 --> 00:07:22,510 that you don't necessarily want to end up in the browser's URL or history. 151 00:07:22,510 --> 00:07:25,580 I'm calling my function Register, and this would be a good convention. 152 00:07:25,580 --> 00:07:30,000 Just make sure your function here lines up with what the route is there. 153 00:07:30,000 --> 00:07:32,580 And then I'm checking a couple of conditions. 154 00:07:32,580 --> 00:07:37,160 If that request's form-- that is, all of the parameters 155 00:07:37,160 --> 00:07:41,630 that were submitted via HTTP Post-- has a name field-- 156 00:07:41,630 --> 00:07:43,700 so a text field, for instance, called Name-- 157 00:07:43,700 --> 00:07:46,920 and that equals nothing-- quote unquote, the so-called empty string-- 158 00:07:46,920 --> 00:07:52,070 or, that request's form's dorm key has a value of quote unquote, which 159 00:07:52,070 --> 00:07:57,400 is to say that if the user, myself last week, did not give my name or my dorm, 160 00:07:57,400 --> 00:08:01,512 then go ahead and return the template called Failure.html. 161 00:08:01,512 --> 00:08:03,520 And we'll take a look back at that in a moment. 162 00:08:03,520 --> 00:08:08,110 Otherwise, render template Success.html if all in fact goes well. 163 00:08:08,110 --> 00:08:12,430 Now, if we take a look at Failure.html, it didn't do all that much. 164 00:08:12,430 --> 00:08:14,460 It extended Layout.html. 165 00:08:14,460 --> 00:08:16,320 It declared a title of Failure. 166 00:08:16,320 --> 00:08:19,560 And then it declared a body of You must provide your name and dorm 167 00:08:19,560 --> 00:08:21,810 as sort of an admonishment to the user. 168 00:08:21,810 --> 00:08:24,160 If I look at Success, meanwhile, it's pretty similar, 169 00:08:24,160 --> 00:08:25,520 but the text is different. 170 00:08:25,520 --> 00:08:28,020 But this is the problem we address this week. 171 00:08:28,020 --> 00:08:29,520 You are registered, well not really. 172 00:08:29,520 --> 00:08:32,770 Because recall-- and you could have seen a moment ago-- we did nothing 173 00:08:32,770 --> 00:08:35,570 with the user's name or dorm or any other information, 174 00:08:35,570 --> 00:08:38,100 we just pretended to actually register them. 175 00:08:38,100 --> 00:08:40,230 And then what did the form itself look like? 176 00:08:40,230 --> 00:08:42,900 Well, this page here is mostly HTML. 177 00:08:42,900 --> 00:08:45,790 But again, notice that even this page at the top 178 00:08:45,790 --> 00:08:51,180 defines a body block and a title block, because it's extending Layout.html. 179 00:08:51,180 --> 00:08:52,930 So this is the Ginga stuff that I referred 180 00:08:52,930 --> 00:08:55,200 to earlier, the templating language. 181 00:08:55,200 --> 00:08:58,400 And if we finally go into Layout.html, now you 182 00:08:58,400 --> 00:09:02,700 see the basic framework for every page in this web-based application. 183 00:09:02,700 --> 00:09:05,110 It's a pretty small application to be fair, but it does 184 00:09:05,110 --> 00:09:09,210 have at least three distinct pages-- Index, Failure, Success-- all of which 185 00:09:09,210 --> 00:09:13,790 are identical to the file, except that they each have a title and a custom 186 00:09:13,790 --> 00:09:18,100 body dynamically embedded thanks to how we're using templates. 187 00:09:18,100 --> 00:09:21,120 So this is a nice way of not having to copy and paste all of that code 188 00:09:21,120 --> 00:09:24,260 into every file, thereby making it a pain to update anything, 189 00:09:24,260 --> 00:09:27,019 to add CSS or JavaScript files, or generally any 190 00:09:27,019 --> 00:09:28,560 of the overall structure of the page. 191 00:09:28,560 --> 00:09:32,450 We can factor that all out to Layout.html. 192 00:09:32,450 --> 00:09:35,240 Of course we just threw all this information away. 193 00:09:35,240 --> 00:09:38,470 that name, that dorm, every student who's been registering by this app, 194 00:09:38,470 --> 00:09:39,550 we're forgetting about. 195 00:09:39,550 --> 00:09:44,200 And so at the tail end of last week, we introduced this solution recall, 196 00:09:44,200 --> 00:09:48,180 whereby in Frosh IMs One now, I'll call it-- 197 00:09:48,180 --> 00:09:51,810 or rather today let's redress that by borrowing an idea from last week 198 00:09:51,810 --> 00:09:55,190 that we didn't incorporate into Frosh IMs as follows. 199 00:09:55,190 --> 00:10:00,990 In Application.py for this next version called Frosh IMs One, 200 00:10:00,990 --> 00:10:03,760 what do I seem to be doing differently? 201 00:10:03,760 --> 00:10:06,090 File's almost identical to before. 202 00:10:06,090 --> 00:10:09,120 It's a little longer this route, because what I've grabbed 203 00:10:09,120 --> 00:10:13,200 is some of the code from last time, whereby we wrote out to a CSV 204 00:10:13,200 --> 00:10:17,160 file-- Comma Separated Value-- those student structures. 205 00:10:17,160 --> 00:10:20,960 We did this way back when in C. We then ported the student struct to a student 206 00:10:20,960 --> 00:10:24,010 class last week, and now I've borrowed that same code 207 00:10:24,010 --> 00:10:28,180 for saving those students to disk, so to speak, by embedding it into this route. 208 00:10:28,180 --> 00:10:29,170 So what's going on? 209 00:10:29,170 --> 00:10:31,880 So if again the name or the dorm are blank, 210 00:10:31,880 --> 00:10:35,900 I go ahead and just return the template called Failure.html and we bail out. 211 00:10:35,900 --> 00:10:38,500 Otherwise, I declare a variable called File. 212 00:10:38,500 --> 00:10:43,890 I open a la C's fopen function, registrants.csv, 213 00:10:43,890 --> 00:10:46,260 which I'm just calling this just because it's 214 00:10:46,260 --> 00:10:49,960 going to contain my registrants in a comma separated values file. 215 00:10:49,960 --> 00:10:51,510 And then quote unquote a. 216 00:10:51,510 --> 00:10:54,930 And you might not recall this or might not have seen this, but quote unquote a 217 00:10:54,930 --> 00:10:56,110 is for appending. 218 00:10:56,110 --> 00:10:59,420 If I instead and accidentally did w for writing, 219 00:10:59,420 --> 00:11:00,790 you might think that's correct. 220 00:11:00,790 --> 00:11:04,650 But every time you use quote unquote w, you're going to write out a new file. 221 00:11:04,650 --> 00:11:06,860 Which is to say, if 100 students registered 222 00:11:06,860 --> 00:11:10,780 for some intramural sport via this web app, and I was using quote unquote w, 223 00:11:10,780 --> 00:11:13,110 I would actually keep clobbering or overwriting 224 00:11:13,110 --> 00:11:16,417 the file, such that only the most recently registered student 225 00:11:16,417 --> 00:11:17,500 would appear in this file. 226 00:11:17,500 --> 00:11:19,480 So we want quote unquote a for append. 227 00:11:19,480 --> 00:11:24,480 Then I declare a variable called writer, and I'm using this CSV module 228 00:11:24,480 --> 00:11:26,810 that we get for free from Python that's going 229 00:11:26,810 --> 00:11:31,080 to allow me to create what we'll a writer, passing in that file as input. 230 00:11:31,080 --> 00:11:34,380 Then I'm going to call this special method Write Row, whose purpose in life 231 00:11:34,380 --> 00:11:36,310 is to take a tuple, which in this case has 232 00:11:36,310 --> 00:11:40,687 two elements, the name that was passed in via the form as well as the dorm. 233 00:11:40,687 --> 00:11:42,770 And then I'm going to go ahead and close the file. 234 00:11:42,770 --> 00:11:44,980 So Write Row is now the function that takes 235 00:11:44,980 --> 00:11:49,070 care of all the complexity-- it's not all that much complexity-- of printing 236 00:11:49,070 --> 00:11:52,130 out name, comma, dorm, name, comma, dorm. 237 00:11:52,130 --> 00:11:57,150 And just in case anything has a quote in it, like a someone O'Leary, 238 00:11:57,150 --> 00:12:00,970 or a name that has an apostrophe or some other punctuation symbol that 239 00:12:00,970 --> 00:12:07,100 might otherwise confuse a program reading this text file, 240 00:12:07,100 --> 00:12:10,145 which itself might contain quotes, this kind of function Write Row 241 00:12:10,145 --> 00:12:12,270 will take care of those kind of details, as well as 242 00:12:12,270 --> 00:12:16,060 commas that might correctly or incorrectly be in the values like name 243 00:12:16,060 --> 00:12:17,810 and dorm that I'm providing. 244 00:12:17,810 --> 00:12:20,570 And then we print out the success page. 245 00:12:20,570 --> 00:12:24,490 But this isn't all that useful of a file format. 246 00:12:24,490 --> 00:12:26,810 A CSV file is nice in that you can download it, 247 00:12:26,810 --> 00:12:29,940 you can open it in Excel or Apple Numbers or similar programs. 248 00:12:29,940 --> 00:12:31,950 You can import it into Google Spreadsheets. 249 00:12:31,950 --> 00:12:34,200 But it's not really a database. 250 00:12:34,200 --> 00:12:37,489 You can store data in it, but if you want to read any of that data, 251 00:12:37,489 --> 00:12:39,280 or change any of that data, you pretty much 252 00:12:39,280 --> 00:12:42,850 have to do what we've been doing in C, which is open it with Open or fopen 253 00:12:42,850 --> 00:12:47,040 or whatever, iterate over the lines, maybe parse them or read them 254 00:12:47,040 --> 00:12:50,410 into separate variables or into an array or a list or whatever, 255 00:12:50,410 --> 00:12:52,820 and then you can use binary search or linear search 256 00:12:52,820 --> 00:12:54,820 or whatever you want to actually find data, 257 00:12:54,820 --> 00:12:57,630 maybe to then change data, and then save it all back out. 258 00:12:57,630 --> 00:12:59,420 But this is just tedious. 259 00:12:59,420 --> 00:13:02,450 To have to do all that work simply to save data 260 00:13:02,450 --> 00:13:06,920 isn't all that much fun for programming, and it also doesn't scale very well. 261 00:13:06,920 --> 00:13:10,550 As soon as you have some good success with some web-based application or even 262 00:13:10,550 --> 00:13:13,210 some mobile application, it'd be nice if your code were 263 00:13:13,210 --> 00:13:16,070 as efficient and as fast as possible. 264 00:13:16,070 --> 00:13:17,970 And wouldn't it be nice if we could stand 265 00:13:17,970 --> 00:13:21,230 on the shoulders of others who have had similar problems of storing data 266 00:13:21,230 --> 00:13:25,250 efficiently so that we could learn from them as well and leverage 267 00:13:25,250 --> 00:13:26,710 some of their work? 268 00:13:26,710 --> 00:13:31,890 So thus was born SQL-- Structured Query Language-- 269 00:13:31,890 --> 00:13:34,400 which can be used in any number of environments. 270 00:13:34,400 --> 00:13:37,610 So SQL is just a language that we will now introduce. 271 00:13:37,610 --> 00:13:39,710 And it's a programming language, though it's not 272 00:13:39,710 --> 00:13:43,390 going to be as-- we're not going to use it as richly as we have C or Python. 273 00:13:43,390 --> 00:13:46,390 We're going to use it for a number of fairly basic, but nonetheless very 274 00:13:46,390 --> 00:13:47,800 powerful operations. 275 00:13:47,800 --> 00:13:53,610 And you can store data using this world of SQL in any number of ways. 276 00:13:53,610 --> 00:13:56,430 You can use things like MySQL-- which is a very popular database 277 00:13:56,430 --> 00:13:59,914 server that Facebook started with and still uses for some of its purposes-- 278 00:13:59,914 --> 00:14:04,469 PostgreSQL, Microsoft Access, or Oracle, or any number of third party products 279 00:14:04,469 --> 00:14:06,260 that you might have heard of that are super 280 00:14:06,260 --> 00:14:09,180 popular for web-based and business applications. 281 00:14:09,180 --> 00:14:12,380 And then there's also a format that's even simpler 282 00:14:12,380 --> 00:14:15,570 but gives us all of the same capabilities called SQLite. 283 00:14:15,570 --> 00:14:19,660 Whereas MySQL and Postgres and Oracle and Microsoft Access and the like 284 00:14:19,660 --> 00:14:23,800 typically require that you run some special program, a server, that's 285 00:14:23,800 --> 00:14:25,981 listening for requests and responding to requests 286 00:14:25,981 --> 00:14:28,480 and often has usernames and passwords, if you want something 287 00:14:28,480 --> 00:14:31,880 simpler because you're making a fairly small scale website that might only 288 00:14:31,880 --> 00:14:34,350 have hundreds or thousands or tens of thousands of users, 289 00:14:34,350 --> 00:14:38,270 and you're OK with all of your data living on the same server 290 00:14:38,270 --> 00:14:40,500 that your code lives on, maybe in the same folder, 291 00:14:40,500 --> 00:14:44,770 you can use something called SQLite, which allows us to use the language 292 00:14:44,770 --> 00:14:46,400 SQL that we're about to see, 293 00:14:46,400 --> 00:14:48,760 but it doesn't require the complexity of configuring 294 00:14:48,760 --> 00:14:51,440 a whole server or a whole set of tools. 295 00:14:51,440 --> 00:14:57,390 You can just store your data right there in a .db or .sqlite or whatever file 296 00:14:57,390 --> 00:14:58,980 right in your same directory. 297 00:14:58,980 --> 00:15:00,736 But how are we going to store that data? 298 00:15:00,736 --> 00:15:03,610 Well, many of you are probably familiar with tools like this-- Google 299 00:15:03,610 --> 00:15:07,380 Spreadsheets or Excel or Numbers, which allow you to store data 300 00:15:07,380 --> 00:15:09,161 in generally rows and columns. 301 00:15:09,161 --> 00:15:11,160 And if you're looking at a spreadsheet like this 302 00:15:11,160 --> 00:15:14,510 and you're storing people's names and dorms and email addresses and phone 303 00:15:14,510 --> 00:15:17,900 numbers, generally we humans use the top row for headers, 304 00:15:17,900 --> 00:15:20,960 and we'll put quote unquote name, quote unquote dorm, quote unquote email 305 00:15:20,960 --> 00:15:21,920 or phone or whatever. 306 00:15:21,920 --> 00:15:27,550 And then every row below that first row represents in this case a student, 307 00:15:27,550 --> 00:15:29,594 a record in our spreadsheet. 308 00:15:29,594 --> 00:15:32,510 So if I were to do this in reality-- let me go ahead and do just that. 309 00:15:32,510 --> 00:15:37,337 Name and dorm and maybe email and phone, and any number of other fields. 310 00:15:37,337 --> 00:15:39,420 And I'm going to go ahead and really be tidy here. 311 00:15:39,420 --> 00:15:41,503 And you can do these silly aesthetics in something 312 00:15:41,503 --> 00:15:44,710 like Google Spreadsheets and Excel and Numbers. 313 00:15:44,710 --> 00:15:49,830 But here's where I would do something like David and Matthews and Rob 314 00:15:49,830 --> 00:15:54,140 and Thayer, and dot dot dot, and we can fill in the rest of these rows, 315 00:15:54,140 --> 00:15:56,390 and we can just keep growing and growing and growing. 316 00:15:56,390 --> 00:15:58,190 And frankly, most of these programs are smart enough 317 00:15:58,190 --> 00:16:00,189 that even though I think Google Spreadsheets now 318 00:16:00,189 --> 00:16:03,197 goes to a default of 1,000, when you export that file or save it, 319 00:16:03,197 --> 00:16:05,530 they're only going to save rows presumably that actually 320 00:16:05,530 --> 00:16:07,187 have data up until that point. 321 00:16:07,187 --> 00:16:09,020 And then you can have any number of columns. 322 00:16:09,020 --> 00:16:12,830 You get A through I think Z probably by default. So 26 columns 323 00:16:12,830 --> 00:16:15,660 even if you need fewer, but you can create more as well. 324 00:16:15,660 --> 00:16:16,640 So this is nice. 325 00:16:16,640 --> 00:16:19,380 This itself is structured data. 326 00:16:19,380 --> 00:16:23,550 You have meta data, like name, dorm, email, phone, keys, if you will. 327 00:16:23,550 --> 00:16:26,330 And we've used that expression before in various CS contexts. 328 00:16:26,330 --> 00:16:30,200 And then you have values, and those values line up with those keys. 329 00:16:30,200 --> 00:16:31,950 And in fact, now if you think back to some 330 00:16:31,950 --> 00:16:33,990 of the features we've seen already in Python, 331 00:16:33,990 --> 00:16:38,670 a dictionary is a very popular and very useful data structure. 332 00:16:38,670 --> 00:16:41,830 In Python, a list or an array is another such one. 333 00:16:41,830 --> 00:16:45,700 And so really, if you think about what a spreadsheet is, it's kind of really 334 00:16:45,700 --> 00:16:50,790 just a list of rows, each of which is a dictionary. 335 00:16:50,790 --> 00:16:52,230 And what do I mean by that? 336 00:16:52,230 --> 00:16:57,840 Well, within every row, whether it's two or three or 1,000, you have columns. 337 00:16:57,840 --> 00:17:00,710 And those columns have keys or names to them. 338 00:17:00,710 --> 00:17:02,500 Name, dorm, email, phone. 339 00:17:02,500 --> 00:17:05,170 And recall that a dict or dictionary in Python 340 00:17:05,170 --> 00:17:07,450 is just a collection of key value pairs. 341 00:17:07,450 --> 00:17:11,390 So if this is just a list of values, and each of those values 342 00:17:11,390 --> 00:17:15,859 is organized horizontally essentially as dictionaries, key value pairs, 343 00:17:15,859 --> 00:17:22,290 where row three's name is Rob, row three's dorm is Thayer, and so forth. 344 00:17:22,290 --> 00:17:24,560 Well it seems that we could map this kind of format 345 00:17:24,560 --> 00:17:27,359 very cleanly to something like Python. 346 00:17:27,359 --> 00:17:30,590 Or C for that matter, but Python would be a lot more convenient. 347 00:17:30,590 --> 00:17:32,820 And sometimes, one sheet is not enough. 348 00:17:32,820 --> 00:17:35,590 So sometimes you might have gone down here to the bottom corner. 349 00:17:35,590 --> 00:17:37,590 You might have created something like sheet two. 350 00:17:37,590 --> 00:17:41,400 And maybe here, instead of just storing students, you might store professor. 351 00:17:41,400 --> 00:17:44,862 So the professor's name and office and course 352 00:17:44,862 --> 00:17:46,820 and any number of other pieces of data that you 353 00:17:46,820 --> 00:17:48,250 want to store about him or her. 354 00:17:48,250 --> 00:17:50,080 And we could store professors. 355 00:17:50,080 --> 00:17:53,860 And I'll even rename this from sheet two to Professors. 356 00:17:53,860 --> 00:17:57,260 And in sheet one, I'm going to go here and rename this to Students. 357 00:17:57,260 --> 00:18:00,530 And so we can really kind of organize our data cleanly. 358 00:18:00,530 --> 00:18:01,552 But that's about it. 359 00:18:01,552 --> 00:18:03,260 Even though Google Spreadsheets and Excel 360 00:18:03,260 --> 00:18:05,093 do have some programming functionality built 361 00:18:05,093 --> 00:18:07,890 in with macros or similar functions, it's 362 00:18:07,890 --> 00:18:10,750 not all that easy to query the data. 363 00:18:10,750 --> 00:18:13,140 It's not all that easy to integrate this kind of file 364 00:18:13,140 --> 00:18:14,860 into a program you're writing. 365 00:18:14,860 --> 00:18:17,770 And it's definitely not necessarily efficient. 366 00:18:17,770 --> 00:18:21,220 Because you've not really told Google Spreadsheets anything 367 00:18:21,220 --> 00:18:23,780 about how much data you're going to be putting in here, 368 00:18:23,780 --> 00:18:27,110 what the type of that data is beyond maybe the formatting thereof. 369 00:18:27,110 --> 00:18:29,590 And so there are some opportunities for better design. 370 00:18:29,590 --> 00:18:33,190 And thus came along SQL databases that give us 371 00:18:33,190 --> 00:18:36,460 not only the ability to store things in rows and columns, 372 00:18:36,460 --> 00:18:41,430 but also the ability to create data or create structures in memory, 373 00:18:41,430 --> 00:18:46,160 and insert data, select data, and update data, and delete data. 374 00:18:46,160 --> 00:18:49,540 In fact, a SQL database is really a specific instance 375 00:18:49,540 --> 00:18:52,530 of what's generally known as a relational database, a database that 376 00:18:52,530 --> 00:18:55,600 allows you to maintain relations among pieces of data, 377 00:18:55,600 --> 00:18:58,740 generally spanning sheets-- or as we're going to start calling them, 378 00:18:58,740 --> 00:19:02,270 tables-- that adheres to a silly acronym, 379 00:19:02,270 --> 00:19:07,100 CRUD, the ability to Create, Read, Update and Delete data. 380 00:19:07,100 --> 00:19:11,465 And those verbs map to such keywords or commands as these 381 00:19:11,465 --> 00:19:12,990 as we're about to see. 382 00:19:12,990 --> 00:19:15,400 So what does this actually mean? 383 00:19:15,400 --> 00:19:18,890 Well, let me go ahead and do this. 384 00:19:18,890 --> 00:19:22,450 Let me go ahead and within CS50 IDE, I'm going 385 00:19:22,450 --> 00:19:25,250 to open up a special web-based program that's 386 00:19:25,250 --> 00:19:30,850 simply going to give me the ability to create databases and create fields 387 00:19:30,850 --> 00:19:31,620 therein. 388 00:19:31,620 --> 00:19:35,780 I'm going to go ahead and create a database called Lecture.db. 389 00:19:35,780 --> 00:19:38,770 And I'm going to go ahead and open now a web-based tool via which 390 00:19:38,770 --> 00:19:41,482 I can administer this database called Lecture. 391 00:19:41,482 --> 00:19:43,940 And you're going to see that I have a tab called Structure, 392 00:19:43,940 --> 00:19:46,148 which is where I'm going to be able to define sheets, 393 00:19:46,148 --> 00:19:47,370 or tables as we'll call them. 394 00:19:47,370 --> 00:19:50,495 I can execute raw SQL commands, with which we'll now start to get familiar. 395 00:19:50,495 --> 00:19:53,530 And later on, I can even export or import data as well. 396 00:19:53,530 --> 00:19:55,700 But let's focus on these fields down here. 397 00:19:55,700 --> 00:19:57,930 So this tool is one of any number of dozens 398 00:19:57,930 --> 00:20:02,510 of tools that might exist that allow you to create and administer databases 399 00:20:02,510 --> 00:20:03,520 using SQL. 400 00:20:03,520 --> 00:20:06,550 I'm going to go ahead here and call my table Registrants. 401 00:20:06,550 --> 00:20:09,410 So this is like creating a new sheet in Google Spreadsheets. 402 00:20:09,410 --> 00:20:11,600 And the number fields-- for now, I'm just 403 00:20:11,600 --> 00:20:14,430 going to go ahead and have two, Name and Dorm. 404 00:20:14,430 --> 00:20:17,480 And now notice I get a little HTML form here, whose purpose in life 405 00:20:17,480 --> 00:20:19,500 is to make things a little easier for me. 406 00:20:19,500 --> 00:20:23,320 So I'm going to type in Name for one field, Dorm for another. 407 00:20:23,320 --> 00:20:26,350 And you'll notice that SQL databases now allow 408 00:20:26,350 --> 00:20:28,800 me to choose any number of data types. 409 00:20:28,800 --> 00:20:30,790 So we're sort of going back and forth here 410 00:20:30,790 --> 00:20:34,780 in between the world of strongly typed languages, weakly typed or loosely 411 00:20:34,780 --> 00:20:35,790 typed languages. 412 00:20:35,790 --> 00:20:37,830 Here it can actually matter for performance, 413 00:20:37,830 --> 00:20:40,580 and I'm going to indeed tell my database what kind of data 414 00:20:40,580 --> 00:20:44,470 I'm generally trying to store in its columns and each of its rows. 415 00:20:44,470 --> 00:20:48,120 So my options with SQLite, which is one incarnation of the SQL language, 416 00:20:48,120 --> 00:20:50,235 is Integer-- which means what it says-- Real-- 417 00:20:50,235 --> 00:20:52,110 which is like a real or floating point number 418 00:20:52,110 --> 00:20:56,530 as we've seen that generally can store up to 15 or so digits of precision. 419 00:20:56,530 --> 00:21:00,620 But even then, if it can't, it's going to use a text field automatically 420 00:21:00,620 --> 00:21:04,030 instead, because text is really just a string, otherwise known 421 00:21:04,030 --> 00:21:07,770 in some environments as var char or variable length characters. 422 00:21:07,770 --> 00:21:09,747 Blob is just going to be binary data. 423 00:21:09,747 --> 00:21:12,580 And this means we can store things that are just zeros and ones that 424 00:21:12,580 --> 00:21:14,470 aren't ASCII or Unicode text. 425 00:21:14,470 --> 00:21:17,120 Numeric is a little more flexible. 426 00:21:17,120 --> 00:21:20,310 If I want to store a dollar amount with dollars and cents. 427 00:21:20,310 --> 00:21:22,750 If I want to store maybe an integer, maybe a real number, 428 00:21:22,750 --> 00:21:27,270 I can use numeric, and just let the database-- SQLite in this case-- figure 429 00:21:27,270 --> 00:21:31,129 out what actual data type or affinity, so to speak, to use. 430 00:21:31,129 --> 00:21:32,920 Booleans is just going to be a zero or one. 431 00:21:32,920 --> 00:21:34,461 But that's really just a convenience. 432 00:21:34,461 --> 00:21:38,850 It's actually still going to use a full byte or 8 bits just to store a zero 433 00:21:38,850 --> 00:21:41,614 or to store a one, really just an integer. 434 00:21:41,614 --> 00:21:44,780 And then date/time, which underneath the hood can use any number of formats, 435 00:21:44,780 --> 00:21:46,300 whether it's text or integers or reals. 436 00:21:46,300 --> 00:21:47,690 I'll defer to the documentation. 437 00:21:47,690 --> 00:21:50,550 But that's going to allow me to in a standard way 438 00:21:50,550 --> 00:21:53,700 store things like dates and times so that we can actually 439 00:21:53,700 --> 00:21:56,621 record when stuff is happening, when someone 440 00:21:56,621 --> 00:21:58,370 registered for freshman intramural sports, 441 00:21:58,370 --> 00:22:02,320 and even do useful arithmetic operations on dates and times. 442 00:22:02,320 --> 00:22:04,560 So what should Name be? 443 00:22:04,560 --> 00:22:08,550 Given all that, there aren't really too many options. 444 00:22:08,550 --> 00:22:10,950 Indeed, my instincts are to go with text. 445 00:22:10,950 --> 00:22:13,035 So I'm going to do that for Name and for Dorm. 446 00:22:13,035 --> 00:22:16,160 And then I'm being prompted by this web form for a few different questions. 447 00:22:16,160 --> 00:22:17,610 So primary key. 448 00:22:17,610 --> 00:22:20,620 It turns out that among the features of a SQL database 449 00:22:20,620 --> 00:22:25,790 is the ability to specify that this field-- this column 450 00:22:25,790 --> 00:22:31,570 shall be my primary key, the field whose values uniquely identify 451 00:22:31,570 --> 00:22:34,460 all of the rows in my database. 452 00:22:34,460 --> 00:22:37,790 Now, it's not going to be relevant just yet for the following reason. 453 00:22:37,790 --> 00:22:41,280 Right now, it's quite possible that two Davids live in Matthews, 454 00:22:41,280 --> 00:22:43,370 or two Robs live in Thayer. 455 00:22:43,370 --> 00:22:47,000 And so if I said that Name or Dorm were my primary key, 456 00:22:47,000 --> 00:22:50,630 that would mean I can have only one David or one Rob or one Matthews 457 00:22:50,630 --> 00:22:52,090 or one Thayer in my database. 458 00:22:52,090 --> 00:22:53,923 So for now, we're going to leave this alone. 459 00:22:53,923 --> 00:22:55,630 That would seem to be a bad thing to do. 460 00:22:55,630 --> 00:22:58,560 Auto increment isn't going to be applicable, because you can't auto 461 00:22:58,560 --> 00:23:00,320 incremented a string or a text field. 462 00:23:00,320 --> 00:23:02,390 That's going to be germane to integers. 463 00:23:02,390 --> 00:23:06,680 And not null, which is my problem that this field cannot or shall not be null. 464 00:23:06,680 --> 00:23:10,470 So I will go ahead and do this and just promise that every student in Frosh IMs 465 00:23:10,470 --> 00:23:12,990 is going to have a name and a dorm. 466 00:23:12,990 --> 00:23:14,910 Meanwhile, you can specify a default value. 467 00:23:14,910 --> 00:23:17,350 I'm not going to bother doing that here, but you can see from the drop down 468 00:23:17,350 --> 00:23:19,860 that you can specify a certain value, like null, 469 00:23:19,860 --> 00:23:23,310 or you can specify the current date or time, if that's actually germane. 470 00:23:23,310 --> 00:23:25,227 But I don't want to give people generic names. 471 00:23:25,227 --> 00:23:28,143 And I don't want to just assume that people live in some default dorm. 472 00:23:28,143 --> 00:23:29,950 So I'm going to leave that alone and expect 473 00:23:29,950 --> 00:23:33,050 that the user of this database table is always 474 00:23:33,050 --> 00:23:34,600 going to give me a name and a dorm. 475 00:23:34,600 --> 00:23:36,810 So let me go ahead and click Create. 476 00:23:36,810 --> 00:23:38,870 And what you'll see is this. 477 00:23:38,870 --> 00:23:43,000 phpLiteAdmin is really just a handy web-based tool 478 00:23:43,000 --> 00:23:45,490 for executing SQL statements. 479 00:23:45,490 --> 00:23:48,440 So I don't necessarily have to remember all of the syntax up front. 480 00:23:48,440 --> 00:23:50,590 I can kind of learn from this actual tool. 481 00:23:50,590 --> 00:23:54,740 And what phpLiteAdmin has really done for me is this. 482 00:23:54,740 --> 00:23:59,760 It has in that file called Lecture.db executed this statement, 483 00:23:59,760 --> 00:24:05,640 Create Table quote unquote registrants, Name is text not null, 484 00:24:05,640 --> 00:24:08,790 Dorm is text not null, and that's it. 485 00:24:08,790 --> 00:24:12,520 So phpLiteAdmin is just saving me, at least for today's purposes, 486 00:24:12,520 --> 00:24:14,872 from having to remember exactly all that syntax. 487 00:24:14,872 --> 00:24:16,830 But after practice, it becomes pretty familiar. 488 00:24:16,830 --> 00:24:19,390 And certainly with Google, you can fill in any blanks. 489 00:24:19,390 --> 00:24:20,690 But what does this now mean? 490 00:24:20,690 --> 00:24:25,250 If I click on Return here and go back to not just 491 00:24:25,250 --> 00:24:30,440 the lecture where I was, but I click now on Registrants, my actual table, 492 00:24:30,440 --> 00:24:32,200 you'll see a bunch more tabs. 493 00:24:32,200 --> 00:24:35,270 The top one of which is Browse-- but this table 494 00:24:35,270 --> 00:24:38,660 is empty, because I've not inserted me or Rob or anyone else for that matter. 495 00:24:38,660 --> 00:24:42,100 But if I click on Structure, you can now see in a web-based environment 496 00:24:42,100 --> 00:24:45,650 a little reminder as to what this table looks like underneath the hood. 497 00:24:45,650 --> 00:24:48,740 It's more esoteric looking than something like Google Spreadsheets, 498 00:24:48,740 --> 00:24:50,340 where you just have columns and rows. 499 00:24:50,340 --> 00:24:51,710 Here we're being more precise. 500 00:24:51,710 --> 00:24:53,380 I have two fields, Name and Dorm. 501 00:24:53,380 --> 00:24:55,180 Each is of type text. 502 00:24:55,180 --> 00:24:56,260 Neither can be null. 503 00:24:56,260 --> 00:24:59,140 But neither has a default value, and neither is a primary key. 504 00:24:59,140 --> 00:25:03,410 But via the Edit link here and Delete, I can actually change those definitions, 505 00:25:03,410 --> 00:25:06,280 and I can even add more fields if I think of things later. 506 00:25:06,280 --> 00:25:09,420 And so this is allowing me to create really 507 00:25:09,420 --> 00:25:18,250 a table of information whereby these are my columns or fields Name and Dorm. 508 00:25:18,250 --> 00:25:21,130 So let's go ahead and insert for instance me in here. 509 00:25:21,130 --> 00:25:24,690 So I'm going to go ahead and insert David, who is in from Matthews, 510 00:25:24,690 --> 00:25:26,200 and click Insert. 511 00:25:26,200 --> 00:25:28,380 And notice what phpLiteAdmin did. 512 00:25:28,380 --> 00:25:31,630 It simply executed this SQL command. 513 00:25:31,630 --> 00:25:34,700 Insert into Registrants, Name,Dorm. 514 00:25:34,700 --> 00:25:38,840 So it's specifying what the two fields are that I want to insert into. 515 00:25:38,840 --> 00:25:40,470 And what values do I want to put? 516 00:25:40,470 --> 00:25:45,520 David,Matthews, not to be confused with Dave Matthews. 517 00:25:45,520 --> 00:25:49,580 So now, if I click Return and go back to the Browse tab, 518 00:25:49,580 --> 00:25:51,950 notice that we see Davids and Matthews. 519 00:25:51,950 --> 00:25:56,520 Meanwhile, if I click Insert Here, and I type something like Rob and Thayer, 520 00:25:56,520 --> 00:26:00,480 click Insert, a very similar SQL statement was executed. 521 00:26:00,480 --> 00:26:02,730 But this one customized for Rob. 522 00:26:02,730 --> 00:26:06,129 And now Return and see Browse, both of us are in there. 523 00:26:06,129 --> 00:26:08,420 But now let me start to take those training wheels off. 524 00:26:08,420 --> 00:26:11,370 It's not all that interesting just to click around in a web GUI. 525 00:26:11,370 --> 00:26:13,120 Let me actually learn something from this. 526 00:26:13,120 --> 00:26:15,710 And no, let me go ahead and type my own SQL now. 527 00:26:15,710 --> 00:26:18,467 Insert into Registrants. 528 00:26:18,467 --> 00:26:20,550 And if I really want to be proper, I can quote it. 529 00:26:20,550 --> 00:26:22,758 But so long as you don't have any special characters, 530 00:26:22,758 --> 00:26:27,040 quoting symbols is not strictly necessary in this case. 531 00:26:27,040 --> 00:26:28,830 What fields do I want to insert into? 532 00:26:28,830 --> 00:26:30,070 Name and Dorm. 533 00:26:30,070 --> 00:26:31,730 And what values do I want to insert? 534 00:26:31,730 --> 00:26:32,770 Raw values. 535 00:26:32,770 --> 00:26:35,770 Textual values that are not special keywords or field names. 536 00:26:35,770 --> 00:26:38,850 I do need to quote with single quotes here in SQLite. 537 00:26:38,850 --> 00:26:45,010 Let's go ahead and put [INAUDIBLE] and Currier for Dorm or House there. 538 00:26:45,010 --> 00:26:46,300 I can put a semicolon. 539 00:26:46,300 --> 00:26:48,900 It's OK if I omit it, but if I want to execute multiple SQL 540 00:26:48,900 --> 00:26:51,550 statements at once, I will need the semicolon again. 541 00:26:51,550 --> 00:26:53,360 Let me go ahead and click Go. 542 00:26:53,360 --> 00:26:55,240 And nicely enough, notice what happened. 543 00:26:55,240 --> 00:26:56,460 One row was affected. 544 00:26:56,460 --> 00:26:59,670 It took only 0.01 seconds, so it's pretty darn fast. 545 00:26:59,670 --> 00:27:02,430 This is just a reminder of what it is I executed. 546 00:27:02,430 --> 00:27:07,900 And if I go back now to Browse, I can actually see three rows in the table. 547 00:27:07,900 --> 00:27:10,770 All right, so now things are getting a little more interesting. 548 00:27:10,770 --> 00:27:12,400 What if I want to search for things? 549 00:27:12,400 --> 00:27:14,524 I'm not even going to use this training wheel here. 550 00:27:14,524 --> 00:27:18,489 I could use the Search tab and type in some names or some dorms 551 00:27:18,489 --> 00:27:19,530 that I want to search on. 552 00:27:19,530 --> 00:27:20,740 But that's not the goal here. 553 00:27:20,740 --> 00:27:23,489 The goal here is not to teach phpLiteAdmin or this web-based tool. 554 00:27:23,489 --> 00:27:26,570 It's just to use pretty quickly this kind of web-based environment 555 00:27:26,570 --> 00:27:30,580 to create the schema for our database, the design of the database, 556 00:27:30,580 --> 00:27:34,890 our choices of tables and columns and perhaps some of the initial data, 557 00:27:34,890 --> 00:27:37,704 and then use that database in actual code. 558 00:27:37,704 --> 00:27:38,870 So that's where we're going. 559 00:27:38,870 --> 00:27:40,750 We're going to move away from this web-based environment 560 00:27:40,750 --> 00:27:42,870 and use actual Python code, but first let's 561 00:27:42,870 --> 00:27:45,260 see some of those other instructions. 562 00:27:45,260 --> 00:27:49,750 Turns out, per this little summary here, that we have the ability 563 00:27:49,750 --> 00:27:54,390 to create-- which we already did using Create Table-- to instert-- 564 00:27:54,390 --> 00:27:55,370 which I just did. 565 00:27:55,370 --> 00:27:57,230 But what about select, update, and delete? 566 00:27:57,230 --> 00:27:58,870 Let's not use the web GUI per se. 567 00:27:58,870 --> 00:28:01,450 Let's just go ahead and start typing some raw SQL. 568 00:28:01,450 --> 00:28:03,370 I'm going to go ahead and do exactly this. 569 00:28:03,370 --> 00:28:08,900 Select let's say star from Registrants. 570 00:28:08,900 --> 00:28:11,320 And now as an aside, stylistically it's not strictly 571 00:28:11,320 --> 00:28:14,850 necessary to write select or from or all of the SQL key words in all caps. 572 00:28:14,850 --> 00:28:17,360 Tends to make code I think a little more readable. 573 00:28:17,360 --> 00:28:20,580 But I could still just say select star from Registrants. 574 00:28:20,580 --> 00:28:24,590 But I find it nice to distinguish what are built in keywords from like what 575 00:28:24,590 --> 00:28:26,510 are my field names and table names. 576 00:28:26,510 --> 00:28:28,470 So select star-- star meaning everything. 577 00:28:28,470 --> 00:28:29,720 Let's click Go. 578 00:28:29,720 --> 00:28:34,000 And now what I see here is this web-based representation 579 00:28:34,000 --> 00:28:36,800 of two columns with three rows that have come back. 580 00:28:36,800 --> 00:28:39,060 We're going to start calling this a result set. 581 00:28:39,060 --> 00:28:41,874 A result set containing three rows have come back. 582 00:28:41,874 --> 00:28:43,790 And that's going to map to code pretty nicely. 583 00:28:43,790 --> 00:28:45,123 Now, let's go ahead and do this. 584 00:28:45,123 --> 00:28:49,140 Suppose that Zamyla moves to a different dormitory. 585 00:28:49,140 --> 00:28:54,320 And let's go ahead and type this-- update registrants, 586 00:28:54,320 --> 00:29:05,000 set Dorm equal to let's say Grays where Name equals quote unquote Zamyla. 587 00:29:05,000 --> 00:29:07,049 And this is what's particularly nice about SQL. 588 00:29:07,049 --> 00:29:09,590 Even though the syntax is a bit new and some of the key words 589 00:29:09,590 --> 00:29:12,610 are certainly new, you kind of just say what you mean. 590 00:29:12,610 --> 00:29:16,410 So update the Registrants table, setting the Dorm field or column 591 00:29:16,410 --> 00:29:17,400 equal to Grays. 592 00:29:17,400 --> 00:29:22,870 Not for everyone, only where the name field or column equals Zamyla. 593 00:29:22,870 --> 00:29:26,090 So let me go ahead and click Go here. 594 00:29:26,090 --> 00:29:28,430 All right, one row is affected up here. 595 00:29:28,430 --> 00:29:31,100 And notice, a reminder of what I just executed. 596 00:29:31,100 --> 00:29:34,200 Now if I click on Browse just to see what the data looks like, 597 00:29:34,200 --> 00:29:38,694 indeed Zamyla has relocated to a different dorm or building altogether. 598 00:29:38,694 --> 00:29:41,110 If I screwed up, though, notice what could quickly happen. 599 00:29:41,110 --> 00:29:47,550 Suppose I did this, Update, Registrants, set Dorm equal to Grays, 600 00:29:47,550 --> 00:29:49,710 and I left off the so-called predicate, the 601 00:29:49,710 --> 00:29:53,090 Where clause that I did a moment ago and click Go. 602 00:29:53,090 --> 00:29:56,780 Now three rows were affected as indicated here. 603 00:29:56,780 --> 00:30:00,720 And if I click Browse now, notice that oops, all three of us 604 00:30:00,720 --> 00:30:02,090 now live in Grays. 605 00:30:02,090 --> 00:30:04,300 So I can fix this manually if I really wanted to 606 00:30:04,300 --> 00:30:05,860 without executing another SQL query. 607 00:30:05,860 --> 00:30:07,900 But this is just meant to be a user-friendly way 608 00:30:07,900 --> 00:30:11,250 of quickly and dirtily editing or creating some data. 609 00:30:11,250 --> 00:30:13,930 Really, we're going to start using those SQL commands only. 610 00:30:13,930 --> 00:30:14,730 What about Delete? 611 00:30:14,730 --> 00:30:17,930 Suppose that Rob has graduated, and he's moving out. 612 00:30:17,930 --> 00:30:27,800 Let me go ahead then and do Delete from registrants, where Name equals Rob. 613 00:30:27,800 --> 00:30:29,180 Go ahead and click Go. 614 00:30:29,180 --> 00:30:30,320 One row effected. 615 00:30:30,320 --> 00:30:34,850 And if I go and click Browse now, only Zamyla and I remain. 616 00:30:34,850 --> 00:30:38,230 All right, so those are just a few such keywords. 617 00:30:38,230 --> 00:30:40,720 But what's nice is that again, it's pretty expressive. 618 00:30:40,720 --> 00:30:42,990 So to recap here, how did I create the table? 619 00:30:42,990 --> 00:30:44,870 I literally said Create Table Registrants. 620 00:30:44,870 --> 00:30:47,370 Though I didn't type this at the time, but this was just the 621 00:30:47,370 --> 00:30:50,649 SQL that that web tool generated for me. 622 00:30:50,649 --> 00:30:52,440 Spoiler alert, don't look at that just yet. 623 00:30:52,440 --> 00:30:53,670 We're going to come back to that in a moment. 624 00:30:53,670 --> 00:30:56,295 And I specified that the Name field is going to be of type text 625 00:30:56,295 --> 00:30:58,310 and Dorm is going to be of type text as well. 626 00:30:58,310 --> 00:31:01,160 Down here I then inserted something like David from Matthews, 627 00:31:01,160 --> 00:31:02,340 specifying name and dorm. 628 00:31:02,340 --> 00:31:04,340 And notice, these lines wrapped onto two lines, 629 00:31:04,340 --> 00:31:06,965 but in general, this could just be a really long query as well. 630 00:31:06,965 --> 00:31:09,270 Select Star from Registrants, Update registrants, 631 00:31:09,270 --> 00:31:12,617 Delete from registrants where-- I get another little spoiler. 632 00:31:12,617 --> 00:31:14,450 And it's a spoiler because there is actually 633 00:31:14,450 --> 00:31:17,810 a flaw in my database table's design. 634 00:31:17,810 --> 00:31:20,400 My Registrants table was not so well thought through. 635 00:31:20,400 --> 00:31:24,980 I indeed only care about storing names and dorms, at least right now. 636 00:31:24,980 --> 00:31:27,510 Maybe later, emails and phone numbers. 637 00:31:27,510 --> 00:31:29,570 But what could go wrong? 638 00:31:29,570 --> 00:31:32,700 When I deleted Rob, something could have gone wrong a moment ago. 639 00:31:32,700 --> 00:31:37,520 When I updated Zamyla, something could have gone wrong. 640 00:31:37,520 --> 00:31:39,990 What if there are two Zamylas at Harvard? 641 00:31:39,990 --> 00:31:42,160 What if there are two Robs at Harvard? 642 00:31:42,160 --> 00:31:47,090 Well, each of those queries didn't seem to specify which Rob or which Zamyla. 643 00:31:47,090 --> 00:31:50,900 It just said where the name is Rob or the name is Zamyla. 644 00:31:50,900 --> 00:31:52,480 So how do we distinguish the two? 645 00:31:52,480 --> 00:31:55,360 Well, maybe I should have forced undergrads or students 646 00:31:55,360 --> 00:31:58,140 to register with their full names, Zamyla Chan. 647 00:31:58,140 --> 00:32:00,130 OK, so that would decrease the probability 648 00:32:00,130 --> 00:32:02,840 that we're going to have multiples Zamyla Chans at Harvard. 649 00:32:02,840 --> 00:32:04,870 But still there could be that probability. 650 00:32:04,870 --> 00:32:09,270 Same with Rob Bowden, or really any name, whether common or uncommon. 651 00:32:09,270 --> 00:32:11,210 So that doesn't feel very robust. 652 00:32:11,210 --> 00:32:14,150 We could look for a Zamyla that specifically lived in Korea, 653 00:32:14,150 --> 00:32:16,290 but maybe on the off chance there's two of them 654 00:32:16,290 --> 00:32:18,020 there, that's going to create problems. 655 00:32:18,020 --> 00:32:21,480 And we might update or delete more rows than we intend. 656 00:32:21,480 --> 00:32:24,010 So this is why we humans in the US for instance 657 00:32:24,010 --> 00:32:26,570 have unique identifiers, for better or for worse. 658 00:32:26,570 --> 00:32:28,480 Things like Social Security numbers. 659 00:32:28,480 --> 00:32:32,780 Or here at Harvard we have Harvard University IDs-- HUID numbers-- or at 660 00:32:32,780 --> 00:32:34,037 Yale MIT IDs. 661 00:32:34,037 --> 00:32:37,120 Or where you are in the world, odds are you have in your wallet or at home 662 00:32:37,120 --> 00:32:39,619 or somewhere in your life, a unique identifier. 663 00:32:39,619 --> 00:32:41,910 And you might not even know what those identifiers are. 664 00:32:41,910 --> 00:32:44,100 In fact, anytime you've registered for an account 665 00:32:44,100 --> 00:32:47,856 on some website like Facebook or Google or any number of others, 666 00:32:47,856 --> 00:32:50,480 underneath the hood, even if those companies aren't telling you 667 00:32:50,480 --> 00:32:56,360 what your unique ID is, they know you by more than just your name and dorm 668 00:32:56,360 --> 00:32:58,640 or email address or phone number. 669 00:32:58,640 --> 00:33:02,850 Odds are, they have assigned you some unique but very simple 670 00:33:02,850 --> 00:33:08,480 piece of data, like an integer, so that Zamyla might be user number three. 671 00:33:08,480 --> 00:33:09,520 Rob might be two. 672 00:33:09,520 --> 00:33:10,460 David might be one. 673 00:33:10,460 --> 00:33:14,240 Simply because that's the order in reverse in which I inserted them 674 00:33:14,240 --> 00:33:15,320 into the table. 675 00:33:15,320 --> 00:33:17,430 So let me go ahead actually and fix this. 676 00:33:17,430 --> 00:33:19,630 Let me go back into my database. 677 00:33:19,630 --> 00:33:20,400 And you know what? 678 00:33:20,400 --> 00:33:21,774 I'm going to be pretty bold here. 679 00:33:21,774 --> 00:33:25,320 I'm going to go ahead and in my table drop the whole thing. 680 00:33:25,320 --> 00:33:27,370 So beware ever executing the drop command, 681 00:33:27,370 --> 00:33:30,280 because it will literally drop all the data in your table. 682 00:33:30,280 --> 00:33:33,950 But I'm going to confirm that, I'm going to start over as follows. 683 00:33:33,950 --> 00:33:36,240 I'm going to create a new table called Registrants, 684 00:33:36,240 --> 00:33:38,370 this time with three fields. 685 00:33:38,370 --> 00:33:40,450 And in addition this time to Name which will 686 00:33:40,450 --> 00:33:45,080 be text and Dorm which will be text, this time I'm going to think ahead. 687 00:33:45,080 --> 00:33:48,370 And I'm going to give everyone a unique ID. 688 00:33:48,370 --> 00:33:51,190 Convention is to call it literally id in all lower case. 689 00:33:51,190 --> 00:33:54,150 And indeed, field name should generally not have spaces, not 690 00:33:54,150 --> 00:33:56,050 have funky punctuation. 691 00:33:56,050 --> 00:33:59,000 Whether you use so-called camel case with alternating capital letters 692 00:33:59,000 --> 00:34:01,600 or snake case with underscores, it's generally up to you. 693 00:34:01,600 --> 00:34:03,910 But using underscores tends to be the convention here, 694 00:34:03,910 --> 00:34:05,610 although it's not yet to remain. 695 00:34:05,610 --> 00:34:08,590 But I'm going to make this id field an integer, 696 00:34:08,590 --> 00:34:11,199 and I'm also going to make it my primary key. 697 00:34:11,199 --> 00:34:16,900 So that there shall be no two registrants in this Frosh IMs program 698 00:34:16,900 --> 00:34:20,937 that have the same primary key, the same number uniquely identifying them. 699 00:34:20,937 --> 00:34:22,770 And you know, a better yet-- and this is one 700 00:34:22,770 --> 00:34:25,940 of the features you get for free with many relational databases-- 701 00:34:25,940 --> 00:34:26,730 auto increment. 702 00:34:26,730 --> 00:34:29,679 I the programmer don't care what number Zamyla is. 703 00:34:29,679 --> 00:34:31,710 I don't care what number David or Rob is. 704 00:34:31,710 --> 00:34:35,020 I just want them to have a number, whatever it is. 705 00:34:35,020 --> 00:34:37,520 But I don't want them to be the same. 706 00:34:37,520 --> 00:34:40,299 So let me just let my database auto increment this ID. 707 00:34:40,299 --> 00:34:43,090 Give the first user one, the second user two, the third user three, 708 00:34:43,090 --> 00:34:44,230 and so forth. 709 00:34:44,230 --> 00:34:46,480 And by nature, this cannot be null. 710 00:34:46,480 --> 00:34:48,179 They need to be uniquely identified. 711 00:34:48,179 --> 00:34:50,790 And default value here doesn't matter, because the database 712 00:34:50,790 --> 00:34:52,270 is going to take care of that. 713 00:34:52,270 --> 00:34:54,280 Let me go ahead now and click Create. 714 00:34:54,280 --> 00:34:58,080 And now you'll see the create table that I just 715 00:34:58,080 --> 00:34:59,890 spoiled a moment ago on the slide. 716 00:34:59,890 --> 00:35:02,230 This now creates a table again called Registrants, 717 00:35:02,230 --> 00:35:04,170 because I dropped or deleted the old one. 718 00:35:04,170 --> 00:35:08,250 But notice how much detail we have specified what id is. 719 00:35:08,250 --> 00:35:12,660 It is not only an integer, it is also a primary key that supports auto 720 00:35:12,660 --> 00:35:15,250 increment and it cannot be null. 721 00:35:15,250 --> 00:35:18,590 Meanwhile, Name, Dorm, I don't particularly care this time. 722 00:35:18,590 --> 00:35:22,260 Though I could specify like before, I don't want those to be null either. 723 00:35:22,260 --> 00:35:25,280 So again, that's just going to be a design decision. 724 00:35:25,280 --> 00:35:27,230 But let me see how this affects things. 725 00:35:27,230 --> 00:35:31,706 Now let me go into MySQL tab, where I can just execute some raw SQL. 726 00:35:31,706 --> 00:35:34,930 And soon we're going to be executing SQL from within Python code. 727 00:35:34,930 --> 00:35:40,800 Let me go ahead and insert into Registrants a name and a dorm. 728 00:35:40,800 --> 00:35:43,290 And notice, I don't care about specifying the ID. 729 00:35:43,290 --> 00:35:45,010 That's not interesting to me. 730 00:35:45,010 --> 00:35:51,110 All I care about is the values like David and Matthews. 731 00:35:51,110 --> 00:35:55,530 And if I now click Go, notice that's all that the database executes. 732 00:35:55,530 --> 00:35:58,590 It doesn't go in and secretly change my query, 733 00:35:58,590 --> 00:36:02,950 but it does notice that I've omitted explicit mention of that id field. 734 00:36:02,950 --> 00:36:07,470 And if I click Browse, now notice, David for Matthews has an id of one. 735 00:36:07,470 --> 00:36:11,640 Let's go ahead and do two other inserts, again via raw SQL. 736 00:36:11,640 --> 00:36:15,810 Let me go ahead here and do insert into Registrants. 737 00:36:15,810 --> 00:36:18,782 And you'll notice there's a nice little history in this web tool of all 738 00:36:18,782 --> 00:36:20,240 the previous queries I've executed. 739 00:36:20,240 --> 00:36:23,580 So in case you want to click and save yourself some typing, you can do that. 740 00:36:23,580 --> 00:36:28,140 But for now I'm just going to go ahead and insert Rob as before from Thayer. 741 00:36:28,140 --> 00:36:31,930 And for good measure at the same time, Registrants, 742 00:36:31,930 --> 00:36:37,230 Name Dorm values Zamyla from Currier. 743 00:36:37,230 --> 00:36:39,690 And notice the semicolon, so that hopefully both of these 744 00:36:39,690 --> 00:36:41,450 should go through as separate queries. 745 00:36:41,450 --> 00:36:42,820 Indeed they did. 746 00:36:42,820 --> 00:36:43,760 One row affected. 747 00:36:43,760 --> 00:36:44,630 One row affected. 748 00:36:44,630 --> 00:36:48,190 And if I click Browse, now Rob is number two, Zamyla is number three. 749 00:36:48,190 --> 00:36:52,550 And this is just going to continue on until we maybe run out of integers. 750 00:36:52,550 --> 00:36:58,790 But SQLite will support as many as 8 byte integers, or 64-bit values. 751 00:36:58,790 --> 00:37:01,610 So that's not going to happen for quite some time. 752 00:37:01,610 --> 00:37:04,660 So now notice what I'm protected against. 753 00:37:04,660 --> 00:37:07,980 There might be a whole bunch of Robs at Harvard it's fair to say. 754 00:37:07,980 --> 00:37:15,250 But now if I do a Delete from Registrants 755 00:37:15,250 --> 00:37:20,630 where name equals Rob and id equals two-- 756 00:37:20,630 --> 00:37:23,380 and I don't have to quote the two, because it is indeed a number-- 757 00:37:23,380 --> 00:37:25,380 this is actually more information than I need. 758 00:37:25,380 --> 00:37:29,610 Indeed, because id is a primary key, I don't need to mention Rob's name. 759 00:37:29,610 --> 00:37:33,210 If I already know his id, I can just say Delete from Registrants 760 00:37:33,210 --> 00:37:36,290 where id equals two, click Go. 761 00:37:36,290 --> 00:37:38,060 One row affected. 762 00:37:38,060 --> 00:37:40,050 And goodbye to Rob. 763 00:37:40,050 --> 00:37:44,780 And what's nice now about SQLite is suppose I do one other insert. 764 00:37:44,780 --> 00:37:52,120 Let me go ahead and Insert into Registrants a name and a dorm 765 00:37:52,120 --> 00:37:56,960 with values of Jason and Kirkland. 766 00:37:56,960 --> 00:37:59,200 And click Go. 767 00:37:59,200 --> 00:38:05,080 Is Jason going to now be the new number two, or is he going to be number four? 768 00:38:05,080 --> 00:38:06,000 Let's take a look. 769 00:38:06,000 --> 00:38:08,325 If we click Browse here, he's indeed four. 770 00:38:08,325 --> 00:38:12,070 So one of the features you get by using the auto increment field 771 00:38:12,070 --> 00:38:15,420 is that the database is going to ensure that just in case 772 00:38:15,420 --> 00:38:17,670 you have remnants of Rob somewhere else, or really 773 00:38:17,670 --> 00:38:21,660 his id number maybe in some other table or in your logs or some piece of data 774 00:38:21,660 --> 00:38:25,540 you care about, we're going to make sure that Jason has not only a unique id 775 00:38:25,540 --> 00:38:29,010 now, but a unique id in perpetuity. 776 00:38:29,010 --> 00:38:32,230 If you don't use unique id-- or rather auto increments, 777 00:38:32,230 --> 00:38:34,640 SQLite will actually try to reuse numbers just 778 00:38:34,640 --> 00:38:39,730 for efficiency, which can actually break things down the road later. 779 00:38:39,730 --> 00:38:41,940 All right, so things are getting interesting. 780 00:38:41,940 --> 00:38:46,180 We have the ability now to insert data, delete data, update data, 781 00:38:46,180 --> 00:38:47,510 and then select it. 782 00:38:47,510 --> 00:38:49,320 What more can we actually do? 783 00:38:49,320 --> 00:38:55,150 Well, it turns out that SQL tends to come with not only various data types, 784 00:38:55,150 --> 00:38:56,340 but also various functions. 785 00:38:56,340 --> 00:38:59,810 It turns out to save ourselves trouble, there are functions like date functions 786 00:38:59,810 --> 00:39:01,460 and times and date find times. 787 00:39:01,460 --> 00:39:04,610 So for instance, I want to execute a query like select star 788 00:39:04,610 --> 00:39:05,520 from registrants. 789 00:39:05,520 --> 00:39:07,645 But no, I don't want to select all the registrants. 790 00:39:07,645 --> 00:39:10,700 Maybe I want to select all of the registrants this year. 791 00:39:10,700 --> 00:39:13,670 I could say something like, select star from Registrants 792 00:39:13,670 --> 00:39:19,790 where date of a particular field is greater than or equal to January 1st, 793 00:39:19,790 --> 00:39:24,450 2016 or 2017, or whatever the year happens to be. 794 00:39:24,450 --> 00:39:27,230 So these are functions that can do that kind of thinking for me. 795 00:39:27,230 --> 00:39:31,950 But SQL also supports a number of even more compelling features. 796 00:39:31,950 --> 00:39:35,210 We saw a moment ago primary key, which again specifies 797 00:39:35,210 --> 00:39:39,770 that this one and only field shall uniquely identify all of the rows 798 00:39:39,770 --> 00:39:42,660 in my table, I guarantee it. 799 00:39:42,660 --> 00:39:45,530 There's other types of indexes you can use. 800 00:39:45,530 --> 00:39:48,851 So this is the special type of index. 801 00:39:48,851 --> 00:39:51,850 And essentially what you're getting is underneath the hood the database, 802 00:39:51,850 --> 00:39:54,920 whether it's SQLite or MySQL or Postgres or Oracle or whatever, 803 00:39:54,920 --> 00:40:00,280 is generally leveraging some of that week five, week four, week six material 804 00:40:00,280 --> 00:40:03,490 from CS50 in the fancier data structures, especially the tree data 805 00:40:03,490 --> 00:40:04,110 structures. 806 00:40:04,110 --> 00:40:06,920 And if we tell the database in advance, hey, this field's 807 00:40:06,920 --> 00:40:08,880 going to be my primary key, or this field 808 00:40:08,880 --> 00:40:13,940 is going to be an index more generally, whoever implemented the database 809 00:40:13,940 --> 00:40:19,340 software itself probably built into it the ability to store that field using 810 00:40:19,340 --> 00:40:21,990 a fancy tree or some other data structure that ultimately 811 00:40:21,990 --> 00:40:24,790 is going to expedite selecting and updating 812 00:40:24,790 --> 00:40:26,300 and inserting and deleting data. 813 00:40:26,300 --> 00:40:28,560 And indeed, we can say exactly that. 814 00:40:28,560 --> 00:40:32,280 If you know in advance that you're going to be searching on a certain field all 815 00:40:32,280 --> 00:40:34,460 the time-- not just the id field. 816 00:40:34,460 --> 00:40:39,260 Maybe your table has an email field or maybe a zip code field or some other 817 00:40:39,260 --> 00:40:43,680 field that users might want to search on, rather than just store it 818 00:40:43,680 --> 00:40:47,170 in a column by defining a field in your database, 819 00:40:47,170 --> 00:40:51,070 and then really just letting the database use linear search to find 820 00:40:51,070 --> 00:40:54,890 everyone in 02138 or 90210 or whatever zip code, 821 00:40:54,890 --> 00:40:59,040 you can actually tell your database in advance, index this zip code field, 822 00:40:59,040 --> 00:41:00,050 index this email field. 823 00:41:00,050 --> 00:41:03,170 Because I know I'm going to be searching on that specifically using 824 00:41:03,170 --> 00:41:05,970 that special Where keyword in SQL. 825 00:41:05,970 --> 00:41:10,330 And let the database give you better performance than something 826 00:41:10,330 --> 00:41:11,520 like linear search alone. 827 00:41:11,520 --> 00:41:15,272 Meanwhile, you can also specify, I'm not going to just search on this field. 828 00:41:15,272 --> 00:41:16,980 I don't care just about it's performance. 829 00:41:16,980 --> 00:41:17,790 You know what? 830 00:41:17,790 --> 00:41:21,760 I can promise you right now this field shall be unique. 831 00:41:21,760 --> 00:41:23,617 It might be an integer like a primary key. 832 00:41:23,617 --> 00:41:25,700 But if there's some other field in a database that 833 00:41:25,700 --> 00:41:29,130 in advance is going to be unique by design, 834 00:41:29,130 --> 00:41:30,810 you can tell the database as much. 835 00:41:30,810 --> 00:41:35,940 And the database too will optimize queries involving that field as well. 836 00:41:35,940 --> 00:41:40,660 So among the fields we discussed today-- name, dorm, phone, email, zip code, 837 00:41:40,660 --> 00:41:42,770 you can probably think of bunches of others-- 838 00:41:42,770 --> 00:41:45,280 which of those might be candidate to be unique 839 00:41:45,280 --> 00:41:48,030 and to be indexed as unique by your database? 840 00:41:48,030 --> 00:41:49,870 Name already feels like a bad idea. 841 00:41:49,870 --> 00:41:52,980 We want to be able to have multiple Zamylas, multiple Robs, multiple 842 00:41:52,980 --> 00:41:54,780 of any name in the world. 843 00:41:54,780 --> 00:41:57,630 Probably want multiple people to live in the same dorm, 844 00:41:57,630 --> 00:42:02,790 so that there can't be just one person from Currier, Matthews, or Thayer, 845 00:42:02,790 --> 00:42:04,600 wherever. 846 00:42:04,600 --> 00:42:06,925 What about email? 847 00:42:06,925 --> 00:42:10,000 Email's kind of an interesting candidate, right? 848 00:42:10,000 --> 00:42:15,180 Assuming you're OK with banning users-- maybe a couple of people in your life 849 00:42:15,180 --> 00:42:18,550 who happen to use the same email address for whatever reason. 850 00:42:18,550 --> 00:42:21,980 So long as you're comfortable assuming that every human in the world 851 00:42:21,980 --> 00:42:25,940 is going to have or must have his or her own email address, 852 00:42:25,940 --> 00:42:28,440 you could proactively say to your database, 853 00:42:28,440 --> 00:42:31,760 OK, the email field or the email column shall be unique. 854 00:42:31,760 --> 00:42:36,830 I don't want a user with the same email address to accidentally register twice, 855 00:42:36,830 --> 00:42:38,160 let alone three times or more. 856 00:42:38,160 --> 00:42:41,284 And even if two people think they have the same email address maybe because 857 00:42:41,284 --> 00:42:43,730 of a typo, I don't want him or her to be able to register 858 00:42:43,730 --> 00:42:46,670 if someone with that address has already been registered. 859 00:42:46,670 --> 00:42:49,112 Phone number maybe works OK in the world of mobiles, 860 00:42:49,112 --> 00:42:50,820 but if people still have landline phones, 861 00:42:50,820 --> 00:42:53,525 you probably don't want to enforce uniqueness there. 862 00:42:53,525 --> 00:42:55,150 But what else might you want to impose? 863 00:42:55,150 --> 00:42:55,950 Not null. 864 00:42:55,950 --> 00:42:59,990 If you want to ensure that everyone in your database has a name, 865 00:42:59,990 --> 00:43:04,620 there's nothing stopping them inherently from typing like ASDF or whatever 866 00:43:04,620 --> 00:43:05,650 into your database. 867 00:43:05,650 --> 00:43:07,840 You might have to protect against bogus data. 868 00:43:07,840 --> 00:43:09,560 But you can at least ensure that the user 869 00:43:09,560 --> 00:43:12,880 has to give you some kind of value. 870 00:43:12,880 --> 00:43:14,060 And then foreign key. 871 00:43:14,060 --> 00:43:15,976 And we'll come back to this, but there's a way 872 00:43:15,976 --> 00:43:19,410 of specifying that the data in one spreadsheet or in our world now, 873 00:43:19,410 --> 00:43:23,700 one table, is somehow related to or identical to the data 874 00:43:23,700 --> 00:43:27,710 in another spreadsheet or again a table. 875 00:43:27,710 --> 00:43:30,560 But let's come back to some of these features in a moment. 876 00:43:30,560 --> 00:43:32,330 What else do we get from SQL? 877 00:43:32,330 --> 00:43:36,440 We also get the ability to join tables together. 878 00:43:36,440 --> 00:43:37,780 Now what does this mean? 879 00:43:37,780 --> 00:43:41,240 Well, let's go ahead and let's go back to the spreadsheet for just a moment. 880 00:43:41,240 --> 00:43:45,110 And let's not worry about students and professors anymore, but let's go ahead 881 00:43:45,110 --> 00:43:49,060 and think of this as, for instance, users. 882 00:43:49,060 --> 00:43:53,560 I'm going to rename my quick and dirty spreadsheet here Users. 883 00:43:53,560 --> 00:43:56,550 And what do I want every user in my world to have? 884 00:43:56,550 --> 00:43:58,610 Well, I want everyone to have some unique ID, 885 00:43:58,610 --> 00:44:01,070 and I'll let my database ultimately assign that. 886 00:44:01,070 --> 00:44:03,190 I want everyone to have a name. 887 00:44:03,190 --> 00:44:10,290 Maybe let's say a mailing address, a phone number, an email address, 888 00:44:10,290 --> 00:44:12,940 and there could be bunches of other fields as well. 889 00:44:12,940 --> 00:44:14,900 What else might someone have? 890 00:44:14,900 --> 00:44:17,450 That's probably enough for a customer database. 891 00:44:17,450 --> 00:44:20,740 And let's go ahead now and consider what these values might look like. 892 00:44:20,740 --> 00:44:24,010 So my very first customer, who buys my first widget or whatever, 893 00:44:24,010 --> 00:44:26,010 might be someone like Alice. 894 00:44:26,010 --> 00:44:32,950 And she lives at 1 Oxford Street, Cambridge, Mass, 02138, USA. 895 00:44:32,950 --> 00:44:35,830 Let's go ahead and make this column wider just so we can see it. 896 00:44:35,830 --> 00:44:42,110 Her phone number shall be 617-495-5000, and her email address 897 00:44:42,110 --> 00:44:45,120 shall be alice@example.com. 898 00:44:45,120 --> 00:44:47,330 Let me make this field a little wider as well. 899 00:44:47,330 --> 00:44:49,780 Then Bob comes along, and he buys something too. 900 00:44:49,780 --> 00:44:54,260 He happens to live down the street at 33 Oxford Street, Cambridge, Mass 901 00:44:54,260 --> 00:44:55,860 82138, USA. 902 00:44:55,860 --> 00:44:59,540 And his number is 617-495-9000. 903 00:44:59,540 --> 00:45:01,625 And he is just bob@example.com. 904 00:45:01,625 --> 00:45:04,860 Then there's someone new altogether, Charlie. 905 00:45:04,860 --> 00:45:12,210 He lives at 51 Prospect Street in New Haven, Connecticut, 906 00:45:12,210 --> 00:45:17,430 and his zip code is 06511, USA. 907 00:45:17,430 --> 00:45:20,770 And his number will be-- he doesn't want to cooperate, 908 00:45:20,770 --> 00:45:23,240 so he's just going to give us some bogus number there. 909 00:45:23,240 --> 00:45:25,420 But he'll be also charlie@example.com. 910 00:45:25,420 --> 00:45:28,550 And this table now-- if you think of the spreadsheet 911 00:45:28,550 --> 00:45:31,670 really as just a database table of users, 912 00:45:31,670 --> 00:45:35,120 could be dozens or hundreds or even thousands of rows long. 913 00:45:35,120 --> 00:45:38,020 So let's now consider what data types I should 914 00:45:38,020 --> 00:45:43,220 be using if I want to migrate this spreadsheet into an actual SQL 915 00:45:43,220 --> 00:45:44,200 database. 916 00:45:44,200 --> 00:45:48,042 So the id field, which frankly you usually get sort of for free 917 00:45:48,042 --> 00:45:51,000 in a spreadsheet program, because it just numbers all the rows for you. 918 00:45:51,000 --> 00:45:53,300 But if you resort your data, those numbers 919 00:45:53,300 --> 00:45:55,610 do not followed the original rows. 920 00:45:55,610 --> 00:45:59,301 And so giving our data, ultimately we'll see their own numbers as good. 921 00:45:59,301 --> 00:46:01,050 Because those should remain with the data. 922 00:46:01,050 --> 00:46:04,490 Alice should always now be one, Bob two, and Charlie three. 923 00:46:04,490 --> 00:46:05,500 So name. 924 00:46:05,500 --> 00:46:08,290 Let me go ahead and consider what should the fields here be? 925 00:46:08,290 --> 00:46:10,540 Well, just for the sake of discussion, let me go ahead 926 00:46:10,540 --> 00:46:13,810 and annotate right on the screen here. 927 00:46:13,810 --> 00:46:16,200 It stands to reason that this should be an integer. 928 00:46:16,200 --> 00:46:19,290 Name should probably be text. 929 00:46:19,290 --> 00:46:21,860 Address should probably be text. 930 00:46:21,860 --> 00:46:26,460 Phone, I mean it could be an integer. 931 00:46:26,460 --> 00:46:28,260 It's definitely not a real. 932 00:46:28,260 --> 00:46:33,140 Numeric is not right, because it could then become real somehow. 933 00:46:33,140 --> 00:46:34,370 But is integer right? 934 00:46:34,370 --> 00:46:36,060 I'm a little worried here. 935 00:46:36,060 --> 00:46:38,420 Like especially if it's an international customer, 936 00:46:38,420 --> 00:46:41,440 I'm kind of biased at the moment toward US users. 937 00:46:41,440 --> 00:46:45,650 But suppose someone typed in their actual calling code or country code 938 00:46:45,650 --> 00:46:51,830 and they did something like 011 or 001 for the US and then 617. 939 00:46:51,830 --> 00:46:54,460 If I called this field an integer, and then let 940 00:46:54,460 --> 00:46:58,750 the user type in their actual number like this, if it's an integer, 941 00:46:58,750 --> 00:47:01,352 those leading zeros mean nothing mathematically. 942 00:47:01,352 --> 00:47:04,030 So my database is probably going to throw them away. 943 00:47:04,030 --> 00:47:05,560 And I might not want that to happen. 944 00:47:05,560 --> 00:47:08,414 We get lucky in that this still could work as a phone number, 945 00:47:08,414 --> 00:47:10,580 but I really shouldn't be throwing away user's data. 946 00:47:10,580 --> 00:47:13,320 So you know what, I'm going to go ahead and just call this text. 947 00:47:13,320 --> 00:47:16,580 Whatever the user gives me, I shall store from him or her. 948 00:47:16,580 --> 00:47:19,750 And then finally email can be text. 949 00:47:19,750 --> 00:47:22,640 And now let's consider how we might index these columns. 950 00:47:22,640 --> 00:47:25,150 We'll come back to id in a moment, because SQL's 951 00:47:25,150 --> 00:47:27,940 going to give us that automatically in our database 952 00:47:27,940 --> 00:47:29,780 we'll see by using auto increment again. 953 00:47:29,780 --> 00:47:31,300 But it will be an integer. 954 00:47:31,300 --> 00:47:34,820 Name, I could-- I don't want to make it unique, 955 00:47:34,820 --> 00:47:37,490 because I want to have multiple Alices and multiple Bobs. 956 00:47:37,490 --> 00:47:38,820 I could make it an index. 957 00:47:38,820 --> 00:47:42,010 If for whatever reason the website or the application I'm making 958 00:47:42,010 --> 00:47:45,020 wants to make it easy for users to search on names, 959 00:47:45,020 --> 00:47:48,959 I could proactively say, OK, database index the name column, 960 00:47:48,959 --> 00:47:52,250 because it will speed up my searches to be something better than linear search. 961 00:47:52,250 --> 00:47:53,200 Address, same thing. 962 00:47:53,200 --> 00:47:56,082 If I want to use auto complete or some kind of search feature. 963 00:47:56,082 --> 00:47:57,290 So I can search over address. 964 00:47:57,290 --> 00:47:58,540 Maybe I want to do that. 965 00:47:58,540 --> 00:48:02,920 Phone number, maybe that could be unique, if I'm expecting only mobiles. 966 00:48:02,920 --> 00:48:04,020 But that could be risky. 967 00:48:04,020 --> 00:48:08,480 Email is the only one I might claim a unique constraint is pretty reasonable, 968 00:48:08,480 --> 00:48:11,070 especially if your website-- as is commonly 969 00:48:11,070 --> 00:48:13,960 the case these days-- is going to expect users to log in, 970 00:48:13,960 --> 00:48:15,680 certainly not with their id. 971 00:48:15,680 --> 00:48:17,140 And I see no mention of username. 972 00:48:17,140 --> 00:48:19,598 If they're going to use their email address as their login, 973 00:48:19,598 --> 00:48:22,210 then you better only allow any email address 974 00:48:22,210 --> 00:48:25,400 to be registered once and only once. 975 00:48:25,400 --> 00:48:26,810 But you know what I'm not liking? 976 00:48:26,810 --> 00:48:30,290 And let me make this column even wider. 977 00:48:30,290 --> 00:48:33,452 What's kind of dumb about this design at the moment? 978 00:48:33,452 --> 00:48:35,410 You know, before we even get ahead of ourselves 979 00:48:35,410 --> 00:48:41,555 and start moving this into our SQL database, what's dumb here? 980 00:48:41,555 --> 00:48:44,670 981 00:48:44,670 --> 00:48:46,640 I feel like some redundancy, right? 982 00:48:46,640 --> 00:48:49,340 Like 1 Oxford Street, that's special and unique. 983 00:48:49,340 --> 00:48:51,280 33 Oxford Street, that's special and unique, 984 00:48:51,280 --> 00:48:54,071 even though there could be multiple people living or working there. 985 00:48:54,071 --> 00:48:55,820 51 Prospect Street, same thing there. 986 00:48:55,820 --> 00:48:59,020 But like Cambridge, Mass 02138 USA. 987 00:48:59,020 --> 00:49:01,740 Cambridge, Mass 02138 USA. 988 00:49:01,740 --> 00:49:05,050 Like why for every resident of Cambridge, 989 00:49:05,050 --> 00:49:09,070 Massachusetts am I storing Cambridge, Mass 08138 USA. 990 00:49:09,070 --> 00:49:12,060 Cambridge, Mass 02138 USA. 991 00:49:12,060 --> 00:49:14,110 This would seem redundant. 992 00:49:14,110 --> 00:49:17,010 I mean it's not likely that Cambridge is going to change its name. 993 00:49:17,010 --> 00:49:19,509 I mean, once upon a time it was Newtown, now it's Cambridge. 994 00:49:19,509 --> 00:49:21,900 But that's not likely to happen again anytime soon. 995 00:49:21,900 --> 00:49:27,450 So it's not so much anticipating change, but just look at all these darn bytes 996 00:49:27,450 --> 00:49:29,120 that I'm storing redundantly. 997 00:49:29,120 --> 00:49:31,100 Which of those bytes though should hopefully 998 00:49:31,100 --> 00:49:36,210 be sufficient to identify where someone lives or works? 999 00:49:36,210 --> 00:49:39,422 At least in the US, we tend to use our postal codes pretty readily. 1000 00:49:39,422 --> 00:49:40,130 So you know what? 1001 00:49:40,130 --> 00:49:43,620 Let me go ahead and create a field called zip code. 1002 00:49:43,620 --> 00:49:45,880 Why don't I just store 02138 here. 1003 00:49:45,880 --> 00:49:48,880 And oh, stupid Google Spreadsheets, notice what it did. 1004 00:49:48,880 --> 00:49:52,560 It is assuming, because something looks like an integer, that it is an integer. 1005 00:49:52,560 --> 00:49:56,790 So this is a stupid artifact of using a spreadsheet program. 1006 00:49:56,790 --> 00:49:59,490 Let me change that to plaintext and retype it. 1007 00:49:59,490 --> 00:50:01,610 Now Google will respect my input. 1008 00:50:01,610 --> 00:50:05,660 But the point is all the more clear, we had better in our SQL database 1009 00:50:05,660 --> 00:50:07,890 call that text and not an integer. 1010 00:50:07,890 --> 00:50:09,830 And now you know what? 1011 00:50:09,830 --> 00:50:13,450 I can make my address really just a street address. 1012 00:50:13,450 --> 00:50:17,880 And I can get rid of this, and oh, I shouldn't get rid of this. 1013 00:50:17,880 --> 00:50:25,120 But I should in this column do 06511, and then here, get rid of all of that. 1014 00:50:25,120 --> 00:50:26,970 Damn Google again. 1015 00:50:26,970 --> 00:50:30,660 Let's go ahead and change that to plaintext. 1016 00:50:30,660 --> 00:50:33,530 05611, and change this to 02138. 1017 00:50:33,530 --> 00:50:35,310 So it's a little cleaner now. 1018 00:50:35,310 --> 00:50:37,050 There's a little bit of duplication. 1019 00:50:37,050 --> 00:50:41,840 I now have 02138 twice in my database-- or in my spreadsheet in this case. 1020 00:50:41,840 --> 00:50:44,330 But at least it's a lot, lot, lot less redundant. 1021 00:50:44,330 --> 00:50:46,410 But I need to recover that information. 1022 00:50:46,410 --> 00:50:49,787 This is just my Users spreadsheet-- my Users table. 1023 00:50:49,787 --> 00:50:50,370 You know what? 1024 00:50:50,370 --> 00:50:53,730 I'm going to go ahead and create another sheet with Google Spreadsheets here. 1025 00:50:53,730 --> 00:50:55,760 I'm going to go ahead and call this Zip Code. 1026 00:50:55,760 --> 00:50:58,927 I'm going to call this City, State, Country. 1027 00:50:58,927 --> 00:51:01,260 Although Country makes things a little more complicated, 1028 00:51:01,260 --> 00:51:04,140 because zip codes aren't going to be the same format everywhere. 1029 00:51:04,140 --> 00:51:08,480 But for now, 02138 is going to be-- dammit. 1030 00:51:08,480 --> 00:51:12,810 Let's change this whole column this time to plain text. 1031 00:51:12,810 --> 00:51:17,540 02137, Cambridge, Massachusetts, USA. 1032 00:51:17,540 --> 00:51:21,670 And now for New Haven, let's do the same thing. 1033 00:51:21,670 --> 00:51:26,080 Paste that, New Haven, Connecticut, USA. 1034 00:51:26,080 --> 00:51:29,060 And now rather than just call the Sheet Three, 1035 00:51:29,060 --> 00:51:32,220 let me more intelligently call this like Zip Codes. 1036 00:51:32,220 --> 00:51:33,920 And you know what? 1037 00:51:33,920 --> 00:51:36,520 I'm not loving how many bytes I'm using to store Zip Code. 1038 00:51:36,520 --> 00:51:41,320 Why don't I learn from lessons past and give this spreadsheet 1039 00:51:41,320 --> 00:51:46,210 or this column or rather this table its own ID, 1040 00:51:46,210 --> 00:51:49,310 and just arbitrarily assume that someone will number those for me. 1041 00:51:49,310 --> 00:51:54,650 So now I can whittle this down to just numbers where 1042 00:51:54,650 --> 00:51:56,680 this should be an integer now. 1043 00:51:56,680 --> 00:52:01,330 And in my Zip Codes table, let me make room for just these little annotations. 1044 00:52:01,330 --> 00:52:02,627 This shall be an integer. 1045 00:52:02,627 --> 00:52:04,710 Zip code we've already learned had better be text. 1046 00:52:04,710 --> 00:52:05,640 This should be text. 1047 00:52:05,640 --> 00:52:07,020 This should be text. 1048 00:52:07,020 --> 00:52:09,480 This should be text as well. 1049 00:52:09,480 --> 00:52:13,000 All right, so I have taken a very simple idea, spreadsheet 1050 00:52:13,000 --> 00:52:17,480 with all of my customers, and I seem to have really over-engineered it. 1051 00:52:17,480 --> 00:52:19,470 I've made something simple more complex. 1052 00:52:19,470 --> 00:52:20,259 But why? 1053 00:52:20,259 --> 00:52:22,800 I mean, it's not compelling when I have just three customers. 1054 00:52:22,800 --> 00:52:24,591 It took me more time to describe what I was 1055 00:52:24,591 --> 00:52:27,760 doing than just do it the original way, albeit with the redundancy. 1056 00:52:27,760 --> 00:52:31,215 But if you do start to have tens of rows or hundreds of rows or thousands 1057 00:52:31,215 --> 00:52:33,340 or tens of thousands, having all of that duplicated 1058 00:52:33,340 --> 00:52:34,450 data just doesn't make sense. 1059 00:52:34,450 --> 00:52:35,790 You're throwing away disk space. 1060 00:52:35,790 --> 00:52:38,290 You're throwing away potential performance and running time. 1061 00:52:38,290 --> 00:52:42,120 You're throwing away money, if you have to buy more disk space for your server. 1062 00:52:42,120 --> 00:52:45,700 And so what I've done here is what we would call normalizing my database. 1063 00:52:45,700 --> 00:52:51,670 Factoring out the commonalities that can be uniquely identified more simply, 1064 00:52:51,670 --> 00:52:54,880 with this case a zip code, or even more simply, 1065 00:52:54,880 --> 00:52:59,330 via some arbitrary but some consistent number that I impose. 1066 00:52:59,330 --> 00:53:02,652 So let's go ahead now and port this over to my database. 1067 00:53:02,652 --> 00:53:05,110 I'm going to go ahead and use phpLiteAdmin, just because it 1068 00:53:05,110 --> 00:53:06,645 makes things nice and easy here. 1069 00:53:06,645 --> 00:53:08,520 I'm going to go ahead and drop my Registrants 1070 00:53:08,520 --> 00:53:10,260 table, because that story is over. 1071 00:53:10,260 --> 00:53:12,940 And now let me go ahead and create a couple of tables. 1072 00:53:12,940 --> 00:53:16,650 The first of which I'm going to go ahead and call my-- what did I call it? 1073 00:53:16,650 --> 00:53:18,280 Users. 1074 00:53:18,280 --> 00:53:22,120 And number of fields, I'm going to need one, two, three, four, five, 1075 00:53:22,120 --> 00:53:23,880 six fields. 1076 00:53:23,880 --> 00:53:25,860 So let's go ahead and do that. 1077 00:53:25,860 --> 00:53:33,660 So six fields, and those are going to be id, name, address, and what did we say? 1078 00:53:33,660 --> 00:53:40,600 Zip code, and phone, and email. 1079 00:53:40,600 --> 00:53:42,110 And this shall be an integer. 1080 00:53:42,110 --> 00:53:43,060 This shall be text. 1081 00:53:43,060 --> 00:53:44,050 This shall be text. 1082 00:53:44,050 --> 00:53:44,970 This shall be text. 1083 00:53:44,970 --> 00:53:46,870 This shall be text. 1084 00:53:46,870 --> 00:53:48,470 This shall be text. 1085 00:53:48,470 --> 00:53:50,270 But id shall be a primary key. 1086 00:53:50,270 --> 00:53:51,701 I want it to auto increment. 1087 00:53:51,701 --> 00:53:52,450 And you know what? 1088 00:53:52,450 --> 00:53:55,190 I don't want names to be null. 1089 00:53:55,190 --> 00:53:57,630 I'm OK with addresses being null if they don't 1090 00:53:57,630 --> 00:54:01,060 want to cooperate or zip code being null if they don't want to cooperate. 1091 00:54:01,060 --> 00:54:03,240 Phone, email should not be null. 1092 00:54:03,240 --> 00:54:06,170 So I'm going to go ahead now and click Create. 1093 00:54:06,170 --> 00:54:09,780 This is the query-- somewhat longer now-- that was executed. 1094 00:54:09,780 --> 00:54:11,870 And now let me go ahead and create another table 1095 00:54:11,870 --> 00:54:15,530 by clicking on the name of my database again, creating a new table called 1096 00:54:15,530 --> 00:54:16,580 Zip Codes. 1097 00:54:16,580 --> 00:54:21,820 Number of fields here is going to be one, two, three, four, five. 1098 00:54:21,820 --> 00:54:28,670 So five fields, one of which is id, one of which is the zip code. 1099 00:54:28,670 --> 00:54:32,710 Then city, then state, then country. 1100 00:54:32,710 --> 00:54:38,140 Text, text, text, text, integer. 1101 00:54:38,140 --> 00:54:39,790 This shall be primary key. 1102 00:54:39,790 --> 00:54:42,100 This also shall be auto incremented. 1103 00:54:42,100 --> 00:54:44,670 Zip code cannot be null. 1104 00:54:44,670 --> 00:54:46,730 And everything else I'm OK with being null. 1105 00:54:46,730 --> 00:54:49,120 So I can also not spell country. 1106 00:54:49,120 --> 00:54:50,590 Create. 1107 00:54:50,590 --> 00:54:53,450 And so now, let's toss in some data. 1108 00:54:53,450 --> 00:54:57,970 Let me go ahead and insert really quickly, I'm going to use the web form, 1109 00:54:57,970 --> 00:55:04,130 zip code of 02138, Cambridge, Massachusetts, USA. 1110 00:55:04,130 --> 00:55:05,220 Insert. 1111 00:55:05,220 --> 00:55:07,300 Let me go ahead and insert New Haven as well. 1112 00:55:07,300 --> 00:55:10,050 So 06511. 1113 00:55:10,050 --> 00:55:12,180 05611, yep. 1114 00:55:12,180 --> 00:55:13,000 05611. 1115 00:55:13,000 --> 00:55:17,640 New Haven, Connecticut, USA. 1116 00:55:17,640 --> 00:55:19,800 And now let me just give myself a few customers. 1117 00:55:19,800 --> 00:55:24,727 Let me go in to my Users table, insert David-- or no, who did we have? 1118 00:55:24,727 --> 00:55:26,310 We'll copy and paste our actual users. 1119 00:55:26,310 --> 00:55:32,280 Alice, so we had Alice from 1 Oxford Street in zip code now. 1120 00:55:32,280 --> 00:55:34,280 But this zip code-- oh, interesting. 1121 00:55:34,280 --> 00:55:35,440 I goofed. 1122 00:55:35,440 --> 00:55:36,490 I went a little too fast. 1123 00:55:36,490 --> 00:55:37,570 So let me abort. 1124 00:55:37,570 --> 00:55:39,050 Let me go to Structure. 1125 00:55:39,050 --> 00:55:42,390 Zip code I said was text, but you know what? 1126 00:55:42,390 --> 00:55:45,530 I'm going to change that to integer. 1127 00:55:45,530 --> 00:55:49,160 All right, and now let me go over to the Insert tab again. 1128 00:55:49,160 --> 00:55:55,030 Let's put Alice from 1 Oxford Street at location 1, 1129 00:55:55,030 --> 00:55:57,710 with her phone number, which was for 495-5000. 1130 00:55:57,710 --> 00:55:58,940 And you can do this. 1131 00:55:58,940 --> 00:56:01,320 Even though this looks even more like an integer, 1132 00:56:01,320 --> 00:56:03,320 a common mistake I think in a database is 1133 00:56:03,320 --> 00:56:06,180 to just blindly throw in whatever the user typed in, 1134 00:56:06,180 --> 00:56:09,780 even if he or she used parentheses or pluses or dashes 1135 00:56:09,780 --> 00:56:11,834 or any number of other punctuation symbols. 1136 00:56:11,834 --> 00:56:13,750 There's no reason for us to store all of that. 1137 00:56:13,750 --> 00:56:16,510 In fact, if we want to pretty things up and throw away 1138 00:56:16,510 --> 00:56:19,050 some of the inconsistencies, I could just 1139 00:56:19,050 --> 00:56:22,830 use code-- Python, ultimately-- throw away all punctuation symbols, 1140 00:56:22,830 --> 00:56:26,300 and then just store 10 characters in the case of a US number, 1141 00:56:26,300 --> 00:56:27,800 or whatever pattern I care about. 1142 00:56:27,800 --> 00:56:30,950 Or I could proactively with Python even pretty this 1143 00:56:30,950 --> 00:56:33,230 up by just putting the dashes where I want them to be, 1144 00:56:33,230 --> 00:56:35,240 not where the user necessarily put them. 1145 00:56:35,240 --> 00:56:37,130 And then Alice was alice@example.com. 1146 00:56:37,130 --> 00:56:43,870 And now let me really quickly go ahead and insert Bob, who was from 33 1147 00:56:43,870 --> 00:56:45,350 Oxford Street. 1148 00:56:45,350 --> 00:56:47,920 He's also in zip code 1 now. 1149 00:56:47,920 --> 00:56:50,870 His phone number was 617-495-9000. 1150 00:56:50,870 --> 00:56:54,370 And I'll go ahead and pretend like I'm formatting this in code. 1151 00:56:54,370 --> 00:56:56,420 And then he was bob@example.com. 1152 00:56:56,420 --> 00:57:01,650 And then lastly, let's go ahead and insert Charlie 1153 00:57:01,650 --> 00:57:07,070 from 51 Prospect Street in New Haven, whose id I now know to be 2. 1154 00:57:07,070 --> 00:57:09,880 617-000. 1155 00:57:09,880 --> 00:57:11,540 He was the one that didn't cooperate. 1156 00:57:11,540 --> 00:57:12,540 And charlie@example.com. 1157 00:57:12,540 --> 00:57:15,590 1158 00:57:15,590 --> 00:57:17,940 All right, so what does this now mean? 1159 00:57:17,940 --> 00:57:21,410 In my database called Lecture, I now have two tables-- 1160 00:57:21,410 --> 00:57:26,540 Users and Zip Codes, each of which has some number of fields inside of them. 1161 00:57:26,540 --> 00:57:29,010 But this is not all that useful to me anymore, 1162 00:57:29,010 --> 00:57:33,230 because now if I go ahead and select my users with a query like this, 1163 00:57:33,230 --> 00:57:39,440 select star from users, go, I get back this. 1164 00:57:39,440 --> 00:57:42,890 But what the heck is zip code 1 or zip code 2? 1165 00:57:42,890 --> 00:57:44,940 Now OK, I can figure this out. 1166 00:57:44,940 --> 00:57:48,740 If I know that I'm looking for Alice, and I see that she's in zip code 1, 1167 00:57:48,740 --> 00:57:50,600 I could do another query. 1168 00:57:50,600 --> 00:57:53,110 All right, let me go back to that SQL tab. 1169 00:57:53,110 --> 00:57:59,970 Let me go ahead and do select star from Zip Codes where id equals 1. 1170 00:57:59,970 --> 00:58:03,960 Because I saw that Alice's zip code was uniquely identified by 1. 1171 00:58:03,960 --> 00:58:04,760 Oh, all right. 1172 00:58:04,760 --> 00:58:07,000 So Alice lives in zip code 1. 1173 00:58:07,000 --> 00:58:08,960 Yeah, that's 02138 Cambridge, Mass, USA. 1174 00:58:08,960 --> 00:58:10,090 OK, got that. 1175 00:58:10,090 --> 00:58:12,111 But now I've used two SQL queries. 1176 00:58:12,111 --> 00:58:15,110 The data is still like-- some of it's over here, some of it's over here. 1177 00:58:15,110 --> 00:58:17,160 Now I've got to somehow combine it in code. 1178 00:58:17,160 --> 00:58:21,120 And you can do that, but SQL is much more powerful than that. 1179 00:58:21,120 --> 00:58:23,780 It has other keywords like this, where I can have 1180 00:58:23,780 --> 00:58:25,520 the database do the thinking for me. 1181 00:58:25,520 --> 00:58:26,290 Let me do this. 1182 00:58:26,290 --> 00:58:35,980 Select star from users, join Zip Codes on users.zipCode equals zipCodes.id. 1183 00:58:35,980 --> 00:58:40,020 1184 00:58:40,020 --> 00:58:42,570 Take a moment to think, and ignore the red squigglies. 1185 00:58:42,570 --> 00:58:47,430 That just my browser thinking I'm being grammatically bad, verbally too. 1186 00:58:47,430 --> 00:58:50,490 Select star from users, join zip code. 1187 00:58:50,490 --> 00:58:53,700 So I'm telling the database, go ahead and join these two tables. 1188 00:58:53,700 --> 00:58:54,546 How? 1189 00:58:54,546 --> 00:58:56,920 Well if you think of one table as this hand and one table 1190 00:58:56,920 --> 00:58:59,520 as this hand, what's nice is that each of them 1191 00:58:59,520 --> 00:59:01,820 has a field that's inside of the other. 1192 00:59:01,820 --> 00:59:03,670 This table might have the id field. 1193 00:59:03,670 --> 00:59:06,110 This table has the zip code field. 1194 00:59:06,110 --> 00:59:09,040 Wouldn't it be nice if I could somehow stitch those together, 1195 00:59:09,040 --> 00:59:12,920 lining up those zip codes, whereby in this table, Users-- 1196 00:59:12,920 --> 00:59:18,800 I called it zip code-- and then in my Zip Codes table I called it id. 1197 00:59:18,800 --> 00:59:19,790 Let's see what happens. 1198 00:59:19,790 --> 00:59:23,300 Let me go ahead and click Go. 1199 00:59:23,300 --> 00:59:26,030 And amazingly, look at what I get back. 1200 00:59:26,030 --> 00:59:30,020 My result set this time contains everything. 1201 00:59:30,020 --> 00:59:34,830 I get Alice, Bob, and Charlie, each of whom lives at these addresses. 1202 00:59:34,830 --> 00:59:36,460 But look at their zip codes now. 1203 00:59:36,460 --> 00:59:38,780 They've been filled in with the actual values, 1204 00:59:38,780 --> 00:59:42,800 plus the cities and the states and the country. 1205 00:59:42,800 --> 00:59:45,080 So this is where you really now start to scratch 1206 00:59:45,080 --> 00:59:46,990 the surface of the capabilities of something 1207 00:59:46,990 --> 00:59:50,350 like SQL, because it can actually combine data in this way. 1208 00:59:50,350 --> 00:59:52,120 And this was a pretty simple query. 1209 00:59:52,120 --> 00:59:55,460 But now I can sort of exercise good design. 1210 00:59:55,460 --> 00:59:59,340 I can keep my data very cleanly structured and normalized, whereby 1211 00:59:59,340 --> 01:00:01,210 I factored out all of the redundancies. 1212 01:00:01,210 --> 01:00:04,840 And yet I can programmatically reconstruct that data 1213 01:00:04,840 --> 01:00:08,420 and ensure that I can get back everything I care about. 1214 01:00:08,420 --> 01:00:10,960 Moreover, I can further optimize things. 1215 01:00:10,960 --> 01:00:14,850 If I go into Users for instance and I go under Structure, 1216 01:00:14,850 --> 01:00:16,920 notice down here there's another field. 1217 01:00:16,920 --> 01:00:19,700 Create index on one column or more. 1218 01:00:19,700 --> 01:00:22,800 Let me go ahead and create that index on my Users table. 1219 01:00:22,800 --> 01:00:25,640 I'm going to call it Arbitrarily Email, and I'm 1220 01:00:25,640 --> 01:00:28,349 going to say duplicate values are not allowed. 1221 01:00:28,349 --> 01:00:30,390 I'm not going to bother with a where clause here, 1222 01:00:30,390 --> 01:00:35,130 but I'm going to put this constraint-- this unique index on my email field, 1223 01:00:35,130 --> 01:00:36,747 and now click Create. 1224 01:00:36,747 --> 01:00:37,955 And notice what got executed. 1225 01:00:37,955 --> 01:00:40,900 1226 01:00:40,900 --> 01:00:43,130 Earlier we used it to create a table. 1227 01:00:43,130 --> 01:00:46,450 Here I'm using it to create a unique index called Emait-- 1228 01:00:46,450 --> 01:00:49,140 though I could've called it whatever-- on the Users table 1229 01:00:49,140 --> 01:00:50,900 using specifically this field. 1230 01:00:50,900 --> 01:00:55,910 So this now is my way of ensuring that the following can't happen. 1231 01:00:55,910 --> 01:01:00,690 Notice that in my Users table I've got Alice with alice@example.com. 1232 01:01:00,690 --> 01:01:06,420 suppose that someone else named Alice, also with that same email address-- 1233 01:01:06,420 --> 01:01:08,540 or who thinks her email address is the same, 1234 01:01:08,540 --> 01:01:11,880 because of whatever typographical error-- comes along, 1235 01:01:11,880 --> 01:01:15,640 and I try to register them for my site as follows. 1236 01:01:15,640 --> 01:01:19,500 Insert into Users. 1237 01:01:19,500 --> 01:01:22,620 Let's see, we have name, what are the other fields now? 1238 01:01:22,620 --> 01:01:30,960 Name, address, zip code, phone, and email. 1239 01:01:30,960 --> 01:01:36,610 The values as follows Alice number two, so her name's not distinct. 1240 01:01:36,610 --> 01:01:41,500 Her address is going to be 1 Main Street, so her address is not the same. 1241 01:01:41,500 --> 01:01:47,250 Zip code, she'll go ahead and live in New Haven, for instance, so 2. 1242 01:01:47,250 --> 01:01:50,670 Phone we don't need to worry about, so she'll be another fake phone 1243 01:01:50,670 --> 01:01:52,070 number for today's purposes. 1244 01:01:52,070 --> 01:01:56,480 But suppose that she also thinks her address is alice@example.com 1245 01:01:56,480 --> 01:01:59,110 Or heck, maybe this is the same Alice who 1246 01:01:59,110 --> 01:02:02,000 forgot she has an account on our website-- maybe more likely-- 1247 01:02:02,000 --> 01:02:05,140 and is trying to reregister with the same address. 1248 01:02:05,140 --> 01:02:06,940 What's going to happen? 1249 01:02:06,940 --> 01:02:10,750 If I didn't make any typos here, let's click Go. 1250 01:02:10,750 --> 01:02:11,930 Interesting. 1251 01:02:11,930 --> 01:02:14,110 The database stopped me from doing this. 1252 01:02:14,110 --> 01:02:18,340 Error, unique constraint failed, users.email. 1253 01:02:18,340 --> 01:02:20,450 And this is common syntax in the SQL world. 1254 01:02:20,450 --> 01:02:22,260 tableName.fieldName. 1255 01:02:22,260 --> 01:02:24,530 So insert into Users, whatever I just typed 1256 01:02:24,530 --> 01:02:26,260 failed, because the unique constraint. 1257 01:02:26,260 --> 01:02:27,510 And indeed, we can see this. 1258 01:02:27,510 --> 01:02:30,220 Let me go to Browse and look. 1259 01:02:30,220 --> 01:02:31,880 There is no second Alice. 1260 01:02:31,880 --> 01:02:34,500 So we could do this in code, as we'll soon 1261 01:02:34,500 --> 01:02:36,470 see even more in the world of Python. 1262 01:02:36,470 --> 01:02:39,670 You could check by writing your Python code to see, wait a minute, 1263 01:02:39,670 --> 01:02:41,800 someone's trying to register as alice@example.com. 1264 01:02:41,800 --> 01:02:45,780 Let me quick select all my users from my database, look for Alice, 1265 01:02:45,780 --> 01:02:47,660 and if I already have Alice at example.com, 1266 01:02:47,660 --> 01:02:49,540 I'll just say, no, you can't register. 1267 01:02:49,540 --> 01:02:51,750 Much like I said no you can't register for Frash IMs 1268 01:02:51,750 --> 01:02:54,390 if you don't have a name or a dorm inputed. 1269 01:02:54,390 --> 01:02:56,790 But my database can do that for me. 1270 01:02:56,790 --> 01:03:00,600 And this is a nice wall between me and my database, 1271 01:03:00,600 --> 01:03:02,490 a nice wall between the software developers 1272 01:03:02,490 --> 01:03:04,990 and even the database administrators so that you 1273 01:03:04,990 --> 01:03:08,230 don't have to worry about data accidentally 1274 01:03:08,230 --> 01:03:12,330 getting into your website or your database that shouldn't actually 1275 01:03:12,330 --> 01:03:14,540 be there. 1276 01:03:14,540 --> 01:03:16,840 All right, we have all of these building blocks now. 1277 01:03:16,840 --> 01:03:20,710 We have Select and Insert and Delete and Update and now Join. 1278 01:03:20,710 --> 01:03:22,020 What more is there? 1279 01:03:22,020 --> 01:03:23,910 Well, there are some more keywords. 1280 01:03:23,910 --> 01:03:25,680 But let's first use the ones we have. 1281 01:03:25,680 --> 01:03:29,042 But this time, not just to play around via the web-based tool. 1282 01:03:29,042 --> 01:03:30,750 Let me actually show you one other thing. 1283 01:03:30,750 --> 01:03:34,820 So let me go in to CS50 IDE and take a look at my source 9 directory, 1284 01:03:34,820 --> 01:03:36,861 and you'll see these four folders, two of which 1285 01:03:36,861 --> 01:03:38,360 you're about to look at in a moment. 1286 01:03:38,360 --> 01:03:42,180 And you'll also see lecture.db, a file that I've 1287 01:03:42,180 --> 01:03:46,520 been using to store all of my SQLite data, including those tables. 1288 01:03:46,520 --> 01:03:51,660 Well, it turns out that we can actually use a command line client in order 1289 01:03:51,660 --> 01:03:52,670 to see the same data. 1290 01:03:52,670 --> 01:03:57,060 Let me go ahead and do that by typing SQLite 3 for version 3. 1291 01:03:57,060 --> 01:04:02,100 And then let me go ahead and type in the name of this file, lecture.db Enter. 1292 01:04:02,100 --> 01:04:05,490 And now you'll see a very simple and perhaps a little cryptic command line 1293 01:04:05,490 --> 01:04:06,070 interface. 1294 01:04:06,070 --> 01:04:08,086 But thankfully if you type as it says .help, 1295 01:04:08,086 --> 01:04:10,960 you'll see everything that you can possibly do with this command line 1296 01:04:10,960 --> 01:04:11,380 program. 1297 01:04:11,380 --> 01:04:13,338 But we're going to go ahead and keep it simple. 1298 01:04:13,338 --> 01:04:15,730 I'm going to go ahead and just say .tables, 1299 01:04:15,730 --> 01:04:21,090 which is SQLite's way of saying show me what tables are inside of this file 1300 01:04:21,090 --> 01:04:23,130 called lecture.db. 1301 01:04:23,130 --> 01:04:25,510 Now here I have users. 1302 01:04:25,510 --> 01:04:29,240 Let me go ahead and see what is the schema of these tables. 1303 01:04:29,240 --> 01:04:33,550 And you'll see the SQL commands with which the Users table and the Zip Code 1304 01:04:33,550 --> 01:04:35,070 tables were created. 1305 01:04:35,070 --> 01:04:37,640 But more interesting than that is this. 1306 01:04:37,640 --> 01:04:39,820 At the command line, this is just a SQL client. 1307 01:04:39,820 --> 01:04:43,771 I've been using the web-based GUI, phpLiteAdmin for the past few examples. 1308 01:04:43,771 --> 01:04:45,770 But we can just do this at the command line too. 1309 01:04:45,770 --> 01:04:50,610 Let me go ahead and select star from users semicolon, 1310 01:04:50,610 --> 01:04:53,860 and now you see in purely textual form the exact same data. 1311 01:04:53,860 --> 01:04:57,420 Here's the ids, my users name addresses, zip codes, phone, emails. 1312 01:04:57,420 --> 01:04:59,510 I can see the same thing from my zip code fields. 1313 01:04:59,510 --> 01:05:06,099 Let me go ahead and do select star from Zip Codes. 1314 01:05:06,099 --> 01:05:07,390 And there we see the same data. 1315 01:05:07,390 --> 01:05:10,220 So using the command line too, you don't have to even use phpLiteAdmin. 1316 01:05:10,220 --> 01:05:12,220 It just tends to be a little more user friendly. 1317 01:05:12,220 --> 01:05:15,360 You can also see the same data in the same file. 1318 01:05:15,360 --> 01:05:18,660 And now that I've done this purely manually, 1319 01:05:18,660 --> 01:05:21,770 let's now transition to doing this in Python code. 1320 01:05:21,770 --> 01:05:24,490 But first, let me make mention of one detail in the tables 1321 01:05:24,490 --> 01:05:25,760 that we had created. 1322 01:05:25,760 --> 01:05:27,620 Back here in users, you'll recall that we 1323 01:05:27,620 --> 01:05:30,500 had the structure of having an id and a name 1324 01:05:30,500 --> 01:05:33,945 and the address, zip code, phone, and email. 1325 01:05:33,945 --> 01:05:36,320 What you'll actually find is that the world has generally 1326 01:05:36,320 --> 01:05:41,250 standardized how you define what are called foreign keys in these tables. 1327 01:05:41,250 --> 01:05:45,270 So specifically, which of the fields in this table 1328 01:05:45,270 --> 01:05:47,340 are kind of foreign to this table? 1329 01:05:47,340 --> 01:05:48,390 That one of zip code. 1330 01:05:48,390 --> 01:05:51,270 Originally, that zip code was a text field, 02138, 1331 01:05:51,270 --> 01:05:54,900 and any number of other zip codes as well. 1332 01:05:54,900 --> 01:05:58,950 But then I changed it to an integer, because it really in this table 1333 01:05:58,950 --> 01:06:00,420 is a foreign key. 1334 01:06:00,420 --> 01:06:03,280 Because those same numbers 1, 2, and onward 1335 01:06:03,280 --> 01:06:09,760 are primary keys in that other table called Zip Codes. 1336 01:06:09,760 --> 01:06:12,690 And so what would actually be more typical here 1337 01:06:12,690 --> 01:06:19,480 is that we would often name this not Zip Code, which is a little ambiguous. 1338 01:06:19,480 --> 01:06:21,590 But just as a human convention, let me propose 1339 01:06:21,590 --> 01:06:23,298 that we clean up this design a little bit 1340 01:06:23,298 --> 01:06:27,600 and call this Zip Code ID to make super clear that this is indeed an ID. 1341 01:06:27,600 --> 01:06:31,790 It's not this table's ID, it's an ID from the Zip Codes table. 1342 01:06:31,790 --> 01:06:35,220 Plural would be the convention, and singular would be the convention here. 1343 01:06:35,220 --> 01:06:37,870 And indeed, we can now use the same kind of statements in code. 1344 01:06:37,870 --> 01:06:40,390 But it turns out with one of our concluding examples today, 1345 01:06:40,390 --> 01:06:43,620 we'll see while it's advantageous to adhere to certain conventions, 1346 01:06:43,620 --> 01:06:47,350 and frankly even this might even be typically higher in the table as an id 1347 01:06:47,350 --> 01:06:49,830 and not just buried there in the middle. 1348 01:06:49,830 --> 01:06:51,160 But that's a more minor detail. 1349 01:06:51,160 --> 01:06:56,770 All right, now let's return to CS50 IDE and build upon those previous Frosh IMs 1350 01:06:56,770 --> 01:06:57,530 example. 1351 01:06:57,530 --> 01:07:00,490 But now weave in this new feature that we have, 1352 01:07:00,490 --> 01:07:05,030 which is that ability to store data and retrieve data from a database. 1353 01:07:05,030 --> 01:07:08,480 So recall that this was Frosh IMs 1, which 1354 01:07:08,480 --> 01:07:12,000 was an improvement upon Frosh IM 0 in that at least Frosh IMs 1 actually 1355 01:07:12,000 --> 01:07:13,610 kind of sort of registered students. 1356 01:07:13,610 --> 01:07:17,460 It used registrants.csv, and it wrote out the data to a simple text file. 1357 01:07:17,460 --> 01:07:20,010 But with text files, CSV files, even though you could double 1358 01:07:20,010 --> 01:07:22,370 click them and open them again in Numbers, Excel, and port 1359 01:07:22,370 --> 01:07:25,578 them into Google Spreadsheets, we don't have the same expressive capabilities 1360 01:07:25,578 --> 01:07:29,530 as we do with a language like SQL to select, insert, update, delete. 1361 01:07:29,530 --> 01:07:30,530 They're just text files. 1362 01:07:30,530 --> 01:07:34,940 With SQLite though, a binary file, and the SQLite 3 program, 1363 01:07:34,940 --> 01:07:37,280 and we'll soon see Python code, we can actually 1364 01:07:37,280 --> 01:07:42,920 execute more sophisticated queries than the CSV format allowed. 1365 01:07:42,920 --> 01:07:44,340 So how are we going to do this? 1366 01:07:44,340 --> 01:07:50,640 Let me propose now in Frosh IMs 2, we create a new application.py 1367 01:07:50,640 --> 01:07:53,220 that's going to behave as follows. 1368 01:07:53,220 --> 01:07:56,280 Let's go ahead and propose that I'm going 1369 01:07:56,280 --> 01:07:59,990 to import as before some of the Flask functionality. 1370 01:07:59,990 --> 01:08:04,120 So from Flask import capital Flask, which is the class that we're using 1371 01:08:04,120 --> 01:08:06,120 is the application itself. 1372 01:08:06,120 --> 01:08:08,440 Render template, so I can render HTML. 1373 01:08:08,440 --> 01:08:12,250 Redirect, so that if I want to redirect the user, 1374 01:08:12,250 --> 01:08:16,069 I can do this by an HTTP location redirect. 1375 01:08:16,069 --> 01:08:18,930 Request, so that I can actually get at form data. 1376 01:08:18,930 --> 01:08:22,100 And we'll see URL4, which you might recall from past problems. 1377 01:08:22,100 --> 01:08:22,850 But you know what? 1378 01:08:22,850 --> 01:08:28,290 I'm also going to import from CS50 our SQL wrapper. 1379 01:08:28,290 --> 01:08:31,029 So inside of the CS50 library recall for Python 1380 01:08:31,029 --> 01:08:34,680 is getInt and getString and getChar and getFloat. 1381 01:08:34,680 --> 01:08:38,720 But inside there also is a whole library called SQL. 1382 01:08:38,720 --> 01:08:41,830 So the CS50 SQL library that doesn't do all that much, 1383 01:08:41,830 --> 01:08:44,407 but it does offer us an execute function that's 1384 01:08:44,407 --> 01:08:46,490 going to allow us more simply than might otherwise 1385 01:08:46,490 --> 01:08:50,569 be possible to execute SQL statements inside 1386 01:08:50,569 --> 01:08:55,520 of our Python code against Lecture.db on FroshIMs.db 1387 01:08:55,520 --> 01:08:56,741 or whatever the file may be. 1388 01:08:56,741 --> 01:08:58,240 And I'm going to do that as follows. 1389 01:08:58,240 --> 01:09:02,050 First I'm going to instantiate my Flask application, as always. 1390 01:09:02,050 --> 01:09:05,310 And now I'm going to instantiate a database connection essentially 1391 01:09:05,310 --> 01:09:06,229 as follows. 1392 01:09:06,229 --> 01:09:11,029 SQL quote unquote SQLite colon, slash, slash, slash. 1393 01:09:11,029 --> 01:09:13,120 So notice the third slash in this case. 1394 01:09:13,120 --> 01:09:15,450 FroshIMs2.db. 1395 01:09:15,450 --> 01:09:17,979 So this will be Frosh IMs version 2. 1396 01:09:17,979 --> 01:09:18,939 And that's it. 1397 01:09:18,939 --> 01:09:23,660 So I'm telling the CS50 library to use SQLite and to open up ultimately 1398 01:09:23,660 --> 01:09:25,410 the Frosh IMs 2 database. 1399 01:09:25,410 --> 01:09:28,920 Now I'm going to go ahead and have a simple route as before for just slash. 1400 01:09:28,920 --> 01:09:31,170 I'm going to define a function called index as before, 1401 01:09:31,170 --> 01:09:32,580 and this one's super simple. 1402 01:09:32,580 --> 01:09:37,910 Just go ahead and return render template of index.html. 1403 01:09:37,910 --> 01:09:41,590 And now let's reimplement register as follows. 1404 01:09:41,590 --> 01:09:45,696 So app, route, slash register. 1405 01:09:45,696 --> 01:09:47,029 But in this case, you know what? 1406 01:09:47,029 --> 01:09:48,019 I want to support post. 1407 01:09:48,019 --> 01:09:50,310 I don't want students' information ending up in the URL 1408 01:09:50,310 --> 01:09:54,260 and like the computer lab's history or the roommate's browser or whatever. 1409 01:09:54,260 --> 01:09:57,600 So let's go ahead and specify to Flasks that you know what? 1410 01:09:57,600 --> 01:10:01,780 Only support this list of methods, specifically just post. 1411 01:10:01,780 --> 01:10:06,100 Let me go ahead now and define a method called Register. 1412 01:10:06,100 --> 01:10:08,010 That's going to map to this route. 1413 01:10:08,010 --> 01:10:13,270 And as before, if request.form quote unquote name equals equals nothing, 1414 01:10:13,270 --> 01:10:19,240 or request.form quote unquote dorm equals equals nothing, 1415 01:10:19,240 --> 01:10:26,090 then go ahead and return failure, and return render template of quote unquote 1416 01:10:26,090 --> 01:10:27,590 failure.html. 1417 01:10:27,590 --> 01:10:30,400 So the website is almost exactly the same 1418 01:10:30,400 --> 01:10:37,050 now, except we have preemptively added this database capability to our code 1419 01:10:37,050 --> 01:10:38,300 that we're now going to use. 1420 01:10:38,300 --> 01:10:41,110 Otherwise, if we're not in failure, let me 1421 01:10:41,110 --> 01:10:45,050 go ahead and execute the following SQL. 1422 01:10:45,050 --> 01:10:54,890 Inserts into registrants a name and a dorm with the values of-- we'll 1423 01:10:54,890 --> 01:10:56,240 come back to this in a moment. 1424 01:10:56,240 --> 01:10:57,640 Colon name colon dorm. 1425 01:10:57,640 --> 01:10:59,800 I could call these anything, but a convention 1426 01:10:59,800 --> 01:11:02,600 in a lot of SQL libraries-- a lot of code, 1427 01:11:02,600 --> 01:11:05,950 whether it's Python or PHP or Ruby or other languages-- 1428 01:11:05,950 --> 01:11:09,680 is to have these placeholder values with a colon and then a word that's probably 1429 01:11:09,680 --> 01:11:12,266 identical to the field name, but much like in C, 1430 01:11:12,266 --> 01:11:15,750 percent s has been placeholders, much like in Python open curly 1431 01:11:15,750 --> 01:11:17,760 brace close curly brace has been placeholders. 1432 01:11:17,760 --> 01:11:21,100 We've got one more placeholder convention, which is colon, and then 1433 01:11:21,100 --> 01:11:21,990 the name of a symbol. 1434 01:11:21,990 --> 01:11:24,200 But for the most part, they can almost always 1435 01:11:24,200 --> 01:11:26,640 be identical to what the field names are. 1436 01:11:26,640 --> 01:11:29,880 And now, I need to plug these values in. 1437 01:11:29,880 --> 01:11:34,090 So I'm going to go ahead and plug in for the first name equals request.form 1438 01:11:34,090 --> 01:11:35,250 name. 1439 01:11:35,250 --> 01:11:41,160 And dorm is going to equal request.form dorm. 1440 01:11:41,160 --> 01:11:43,850 So again, this is a Python thing meets SQL thing. 1441 01:11:43,850 --> 01:11:48,430 In our SQL library, as implemented by the CS50 execute function, 1442 01:11:48,430 --> 01:11:51,110 we have this SQL statement. 1443 01:11:51,110 --> 01:11:55,720 Insert into Registrants name, dorm, values, colon name colon dorm. 1444 01:11:55,720 --> 01:11:59,430 CS50's execute method, like many libraries out there, 1445 01:11:59,430 --> 01:12:02,990 will recognize any words that you have written in a SQL statement that 1446 01:12:02,990 --> 01:12:05,170 begin with a colon, and will then proceed 1447 01:12:05,170 --> 01:12:09,930 to substitute in any of the named parameters you provide thereafter. 1448 01:12:09,930 --> 01:12:15,780 So I'm saying, hey, library, give plugin a name of request.form name. 1449 01:12:15,780 --> 01:12:17,840 So whatever came from the user's HTTP request. 1450 01:12:17,840 --> 01:12:21,770 And for dorm, plug in whatever came in the dorm field as well. 1451 01:12:21,770 --> 01:12:25,060 So those were from the HTML forms that the user submitted in order 1452 01:12:25,060 --> 01:12:27,060 to register for this sport. 1453 01:12:27,060 --> 01:12:30,600 After they've done that, let's go ahead and do this. 1454 01:12:30,600 --> 01:12:35,470 Return, render, template, success. 1455 01:12:35,470 --> 01:12:38,970 And this time I'm going to change the success message to actually be, 1456 01:12:38,970 --> 01:12:42,350 yes, you really did indeed register, or the spirit thereof. 1457 01:12:42,350 --> 01:12:43,390 So let me do this. 1458 01:12:43,390 --> 01:12:46,120 Let me go into this Frosh IMs 2 directory. 1459 01:12:46,120 --> 01:12:48,450 And as you might have done for recent problems, 1460 01:12:48,450 --> 01:12:52,100 let me go ahead and run Flask run. 1461 01:12:52,100 --> 01:12:55,170 And we have since tweaked the configuration of the IDE 1462 01:12:55,170 --> 01:12:59,880 so that you no longer have to specify a host of 0.0.0.0 or port of 8080. 1463 01:12:59,880 --> 01:13:03,260 Those will just be assumed by default. You can override them if need be. 1464 01:13:03,260 --> 01:13:06,400 Now, I believe my application is running. 1465 01:13:06,400 --> 01:13:09,190 So let me go over to my web server. 1466 01:13:09,190 --> 01:13:11,440 And indeed, this is the form we saw last week. 1467 01:13:11,440 --> 01:13:14,140 Super simple, super ugly, super HTML. 1468 01:13:14,140 --> 01:13:17,370 Looks like HTML1, actually HTML5. 1469 01:13:17,370 --> 01:13:21,910 But let me go ahead now and prepare a database for this. 1470 01:13:21,910 --> 01:13:25,540 In another terminal window, I'm going to go ahead. 1471 01:13:25,540 --> 01:13:35,710 And in my source 9 directory as well, open up FroshIMs2.db. 1472 01:13:35,710 --> 01:13:42,050 Which if I open it up in phpLiteAdmin already looks like this, Frosh IMs 2. 1473 01:13:42,050 --> 01:13:43,990 But there's no table in there yet. 1474 01:13:43,990 --> 01:13:45,896 So I care only for now about names and dorms. 1475 01:13:45,896 --> 01:13:47,020 But I've learned my lesson. 1476 01:13:47,020 --> 01:13:48,480 I'm going to care about IDs too. 1477 01:13:48,480 --> 01:13:52,140 I'm going to go ahead and create registrants with three fields, 1478 01:13:52,140 --> 01:13:53,940 much like we conjectured earlier. 1479 01:13:53,940 --> 01:13:57,090 An ID, a name, and a dorm, first of which 1480 01:13:57,090 --> 01:13:59,980 shall be an integer, primary key, auto increment. 1481 01:13:59,980 --> 01:14:02,760 Next of which will be text, dorm of which will be text. 1482 01:14:02,760 --> 01:14:04,689 I don't want any of those to be null. 1483 01:14:04,689 --> 01:14:06,980 They don't need to be auto incremented or primary keys. 1484 01:14:06,980 --> 01:14:08,970 So let me go ahead and create. 1485 01:14:08,970 --> 01:14:12,660 If I return now to my Browse Structure, the table is empty. 1486 01:14:12,660 --> 01:14:15,480 Now let's go to the web-based application. 1487 01:14:15,480 --> 01:14:16,510 Here we go. 1488 01:14:16,510 --> 01:14:19,590 Let me go ahead and have a new name altogether. 1489 01:14:19,590 --> 01:14:21,890 Maria wants to register. 1490 01:14:21,890 --> 01:14:28,170 And Maria is from Stoughton. 1491 01:14:28,170 --> 01:14:29,469 Register. 1492 01:14:29,469 --> 01:14:30,260 You are registered. 1493 01:14:30,260 --> 01:14:30,789 Really? 1494 01:14:30,789 --> 01:14:33,830 So we've not invested heavily in the aesthetics of this website just yet, 1495 01:14:33,830 --> 01:14:38,420 but really I have been registered. 1496 01:14:38,420 --> 01:14:48,440 If I go back to phpLiteAdmin, click on Browse now, notice Maria from Stoughton 1497 01:14:48,440 --> 01:14:50,020 is actually now in my database. 1498 01:14:50,020 --> 01:14:51,250 Wait a minute, maybe I'm kind of cheating. 1499 01:14:51,250 --> 01:14:52,750 Maybe she was already there somehow. 1500 01:14:52,750 --> 01:14:53,790 Let's do this again. 1501 01:14:53,790 --> 01:14:55,250 Let me go ahead and hit back. 1502 01:14:55,250 --> 01:14:57,265 Let me go ahead here now and say Andy. 1503 01:14:57,265 --> 01:15:00,280 And let's pretend Andy is visiting someone in Weld. 1504 01:15:00,280 --> 01:15:01,280 And click Register. 1505 01:15:01,280 --> 01:15:02,340 You're registered really. 1506 01:15:02,340 --> 01:15:04,210 Let's go back to phpLiteAdmin. 1507 01:15:04,210 --> 01:15:05,380 Click Browse. 1508 01:15:05,380 --> 01:15:06,870 And now Andy is in there as well. 1509 01:15:06,870 --> 01:15:10,460 So now we're not just storing things on disk so to speak with CSV files. 1510 01:15:10,460 --> 01:15:12,870 Now we're actually interacting with the SQL database. 1511 01:15:12,870 --> 01:15:18,390 And we did so by using CS50's execute method inside of its SQL database 1512 01:15:18,390 --> 01:15:19,160 library. 1513 01:15:19,160 --> 01:15:21,230 But notice how relatively simple it was. 1514 01:15:21,230 --> 01:15:23,240 We simply used the same syntax with which 1515 01:15:23,240 --> 01:15:27,020 we've been playing around, either the command line or phpLiteAdmin, 1516 01:15:27,020 --> 01:15:29,600 but now doing it an actual Python code. 1517 01:15:29,600 --> 01:15:34,270 So now we literally have the ability programmatically to create data, 1518 01:15:34,270 --> 01:15:40,450 to update data, delete data, or select data, ultimately, from a database. 1519 01:15:40,450 --> 01:15:41,880 So let's do exactly that. 1520 01:15:41,880 --> 01:15:44,900 Let's take things up a notch and do even more than we did last time. 1521 01:15:44,900 --> 01:15:45,780 You know what? 1522 01:15:45,780 --> 01:15:50,770 Let me go ahead now and create another route for slash registrants. 1523 01:15:50,770 --> 01:15:53,370 Suppose I want to make available a web page that shows 1524 01:15:53,370 --> 01:15:55,190 me everyone who has in fact registered. 1525 01:15:55,190 --> 01:15:58,170 So this is a feature we haven't even had yet, though we could have. 1526 01:15:58,170 --> 01:16:01,960 At least with CSVs, we would have had to open up the CSV and iterate over it. 1527 01:16:01,960 --> 01:16:07,810 Now we'll do that even more simply with the registrants method here 1528 01:16:07,810 --> 01:16:13,590 that has rows gets db.execute, select star from registrants. 1529 01:16:13,590 --> 01:16:15,600 So I know from our earlier experimentation, 1530 01:16:15,600 --> 01:16:17,080 select star means get everything. 1531 01:16:17,080 --> 01:16:19,270 From registrants means from that table. 1532 01:16:19,270 --> 01:16:22,440 db.execute is just the CS50 method that's going to execute the SQL 1533 01:16:22,440 --> 01:16:25,190 and return to you-- what? 1534 01:16:25,190 --> 01:16:28,910 I didn't care about a return value earlier, although technically, I 1535 01:16:28,910 --> 01:16:33,440 could have gotten back a value, as you'll see in the documentation. 1536 01:16:33,440 --> 01:16:37,090 But selecting star from registrants. 1537 01:16:37,090 --> 01:16:38,510 What do I want back? 1538 01:16:38,510 --> 01:16:41,930 Well earlier, I proposed that a database table, like a spreadsheet, 1539 01:16:41,930 --> 01:16:44,790 really is just a list of dictionaries. 1540 01:16:44,790 --> 01:16:48,170 And indeed, that's exactly what the CS50 execute method gives you back. 1541 01:16:48,170 --> 01:16:53,830 It will return to you if you've used a select a Python list each 1542 01:16:53,830 --> 01:16:57,630 of whose elements is a Python dict object, which 1543 01:16:57,630 --> 01:16:59,780 means you have access, which each of those 1544 01:16:59,780 --> 01:17:04,150 rows to the field name or column name, the so-called key in a dictionary, 1545 01:17:04,150 --> 01:17:07,250 and the value, the cell in that table. 1546 01:17:07,250 --> 01:17:10,660 So now that I have these rows, what can I actually do with them? 1547 01:17:10,660 --> 01:17:16,360 Well, I'm going to go ahead and render a new template, registrants.html, 1548 01:17:16,360 --> 01:17:18,290 and I'm going to pass in as you might have 1549 01:17:18,290 --> 01:17:23,870 for a past problem all of my registrants by passing in all of these rows. 1550 01:17:23,870 --> 01:17:28,980 So it turns out, templates can be parametrized such 1551 01:17:28,980 --> 01:17:32,400 that I don't just have to spit out some hard-coded registrants.html file. 1552 01:17:32,400 --> 01:17:35,640 I can pass in a key of registrants or call whatever I want, 1553 01:17:35,640 --> 01:17:37,950 and the value that I just got back from the database. 1554 01:17:37,950 --> 01:17:44,560 So that I'm now handing to my templating language, Ginga all of these rows. 1555 01:17:44,560 --> 01:17:50,310 So that suggests that in my template, my so-called view more generally in MVC, 1556 01:17:50,310 --> 01:17:55,870 my view can iterate over those rows and spit out every one of my registrants. 1557 01:17:55,870 --> 01:17:57,060 Let's go ahead and do this. 1558 01:17:57,060 --> 01:18:02,670 Let me go ahead into registrants.html, which will be a new file. 1559 01:18:02,670 --> 01:18:07,640 So let me create a new file here called registrants.html. 1560 01:18:07,640 --> 01:18:10,820 And let me go ahead and make sure that as are others 1561 01:18:10,820 --> 01:18:16,750 extends layout.html so that it looks just like everything else. 1562 01:18:16,750 --> 01:18:20,250 And then let me go ahead and just give it a block title up here, 1563 01:18:20,250 --> 01:18:21,920 so that its 2 is consistent. 1564 01:18:21,920 --> 01:18:23,830 This will be called registrants. 1565 01:18:23,830 --> 01:18:25,430 And now I do nblock. 1566 01:18:25,430 --> 01:18:27,430 So again, this is the Ginga templating language. 1567 01:18:27,430 --> 01:18:30,690 This is not a Python per se, not SQL per se. 1568 01:18:30,690 --> 01:18:34,760 It's just really for rendering a viewer or the aesthetics of my site's. 1569 01:18:34,760 --> 01:18:36,520 Block body. 1570 01:18:36,520 --> 01:18:40,120 And now in here is going to go nblock. 1571 01:18:40,120 --> 01:18:44,160 So the question is, how in this new template file 1572 01:18:44,160 --> 01:18:48,140 do I spit out like a list of registered students? 1573 01:18:48,140 --> 01:18:49,130 Well, list. 1574 01:18:49,130 --> 01:18:52,920 I know from some HTML back from week six, 1575 01:18:52,920 --> 01:18:55,060 I can give myself an unordered list. 1576 01:18:55,060 --> 01:18:57,700 So just a bulleted list like that. 1577 01:18:57,700 --> 01:19:03,900 And now I know I can spit out list items to put values 1578 01:19:03,900 --> 01:19:05,150 next to those bulleted lists. 1579 01:19:05,150 --> 01:19:08,980 But I don't know how many list items to output yet. 1580 01:19:08,980 --> 01:19:11,740 But wait a minute, passed into this template 1581 01:19:11,740 --> 01:19:15,030 was a key called registrants whose value is rows, 1582 01:19:15,030 --> 01:19:16,790 the list I got back from my database. 1583 01:19:16,790 --> 01:19:17,780 So you know what? 1584 01:19:17,780 --> 01:19:21,580 It turns out that in Ginga, you can execute 1585 01:19:21,580 --> 01:19:28,160 essentially Python code that looks like this for registrant and registrants. 1586 01:19:28,160 --> 01:19:30,590 And then down here, let me do N 4. 1587 01:19:30,590 --> 01:19:32,760 So slightly new syntax here. 1588 01:19:32,760 --> 01:19:36,640 So no colon here we're doing 4, and the opposite of it I N4. 1589 01:19:36,640 --> 01:19:39,400 I can now do list item. 1590 01:19:39,400 --> 01:19:41,290 And next to that list item, you know what? 1591 01:19:41,290 --> 01:19:43,498 I'm going to do it and we've seen this syntax before. 1592 01:19:43,498 --> 01:19:47,260 Registrant.name from registrant.dorm. 1593 01:19:47,260 --> 01:19:50,600 1594 01:19:50,600 --> 01:19:54,170 OK, this looks very strange so what's going on? 1595 01:19:54,170 --> 01:19:56,974 First line just means we're inheriting from layout.html. 1596 01:19:56,974 --> 01:20:00,140 So the whole page is going to just plug in values into that template, namely 1597 01:20:00,140 --> 01:20:01,684 a title and a body. 1598 01:20:01,684 --> 01:20:02,600 Title's uninteresting. 1599 01:20:02,600 --> 01:20:04,016 It's just going to be registrants. 1600 01:20:04,016 --> 01:20:08,180 Body gets interesting, but notice it's not hard-coded HTML anymore. 1601 01:20:08,180 --> 01:20:09,890 We've got an open UL tag and a close UL. 1602 01:20:09,890 --> 01:20:13,000 So this means hey browser, here comes the start of unordered, 1603 01:20:13,000 --> 01:20:14,590 the end of an unordered list. 1604 01:20:14,590 --> 01:20:19,120 But notice that my template or more generally the view in my software 1605 01:20:19,120 --> 01:20:22,350 is now going to use a Python-like loop here. 1606 01:20:22,350 --> 01:20:23,500 But again, no colon. 1607 01:20:23,500 --> 01:20:27,720 You have these special tags with the curly braces and the percent signs. 1608 01:20:27,720 --> 01:20:30,520 And four is the opposite and now notice what 1609 01:20:30,520 --> 01:20:34,720 I want to do inside of this loop I want to output literally 1610 01:20:34,720 --> 01:20:36,610 open bracket LI close bracket. 1611 01:20:36,610 --> 01:20:40,210 And then eventually, open bracket slash LI closed bracket. 1612 01:20:40,210 --> 01:20:43,700 And then dynamically, on each iteration of this loop, 1613 01:20:43,700 --> 01:20:47,470 I want to output a registrant's name and a registrant's dorm 1614 01:20:47,470 --> 01:20:51,100 and just grammatically say from in-between, so it's David from Matthews 1615 01:20:51,100 --> 01:20:55,690 and some Zamyla from Currior and Rob from Thayer and so. 1616 01:20:55,690 --> 01:21:00,770 And the only thing that's passed in is this thing here. 1617 01:21:00,770 --> 01:21:03,450 Remember the named arguments, the named parameters that I 1618 01:21:03,450 --> 01:21:05,090 passed into my render template method. 1619 01:21:05,090 --> 01:21:06,870 It was registrants equals rows. 1620 01:21:06,870 --> 01:21:10,600 This is really equivalent to iterating over the rows that 1621 01:21:10,600 --> 01:21:12,260 came back from my database. 1622 01:21:12,260 --> 01:21:14,350 In fact, if you want to be even more l I don't 1623 01:21:14,350 --> 01:21:17,730 have to call registrants I could just say rows equals rows. 1624 01:21:17,730 --> 01:21:22,340 So I'm literally passing in my database rows, and then in my template here, 1625 01:21:22,340 --> 01:21:30,740 I could do for row in rows row.name, registrant, row.dorm. 1626 01:21:30,740 --> 01:21:34,120 Just semantically, I thought it would be a little more intuitive if I actually 1627 01:21:34,120 --> 01:21:37,480 say what these things are and not just generically refer to them 1628 01:21:37,480 --> 01:21:39,450 as rows from my table. 1629 01:21:39,450 --> 01:21:41,890 All l so let's see this now. 1630 01:21:41,890 --> 01:21:44,010 Here that web form. 1631 01:21:44,010 --> 01:21:48,340 Let me now go to slash registrants, enter. 1632 01:21:48,340 --> 01:21:51,020 And oh my god, a bulleted list. 1633 01:21:51,020 --> 01:21:53,900 This may be-- I want to prove that this is working. 1634 01:21:53,900 --> 01:21:55,050 Let me go back here. 1635 01:21:55,050 --> 01:21:57,450 Let me go ahead and register Stelios now, 1636 01:21:57,450 --> 01:22:00,260 who is currently living in Canaday, register. 1637 01:22:00,260 --> 01:22:03,780 OK, Let me go to slash registrants again. 1638 01:22:03,780 --> 01:22:05,160 Reload. 1639 01:22:05,160 --> 01:22:07,200 Stelios from Canaday. 1640 01:22:07,200 --> 01:22:08,310 Very interesting. 1641 01:22:08,310 --> 01:22:10,660 But this isn't where we need to stop. 1642 01:22:10,660 --> 01:22:12,500 Let me add one other feature. 1643 01:22:12,500 --> 01:22:14,650 Lets get that Stelios out of there. 1644 01:22:14,650 --> 01:22:17,860 I suppose that it wasn't quite making it on the team 1645 01:22:17,860 --> 01:22:20,630 here so you know what, we want to have one other route. 1646 01:22:20,630 --> 01:22:24,420 Let's have an unregisterred function if Stelios wants to bow out after all 1647 01:22:24,420 --> 01:22:25,870 and focus on something else. 1648 01:22:25,870 --> 01:22:29,707 So apt out route slash unregister. 1649 01:22:29,707 --> 01:22:30,290 You know what? 1650 01:22:30,290 --> 01:22:32,040 This is going to support multiple methods. 1651 01:22:32,040 --> 01:22:33,420 So by default, it's just get. 1652 01:22:33,420 --> 01:22:38,390 But I want it to be not just get but get and post. 1653 01:22:38,390 --> 01:22:42,554 And then in here I'm going to have def unregister as the name of my method. 1654 01:22:42,554 --> 01:22:43,720 Again, it could be anything. 1655 01:22:43,720 --> 01:22:45,990 But you should be consistent, I would say. 1656 01:22:45,990 --> 01:22:47,920 And now I'm going to have two conditions. 1657 01:22:47,920 --> 01:22:52,970 So this slash unregister page is visited via gets. 1658 01:22:52,970 --> 01:22:54,420 You just go to this URL l. 1659 01:22:54,420 --> 01:22:56,850 Then I just want to see a same list of registrants 1660 01:22:56,850 --> 01:23:00,120 but with some kind of form by which I can delete them from my database, 1661 01:23:00,120 --> 01:23:03,180 by which they or I can unregister them. 1662 01:23:03,180 --> 01:23:06,844 Meanwhile, I want to go ahead and if post, 1663 01:23:06,844 --> 01:23:08,260 I want to acually do the deletion. 1664 01:23:08,260 --> 01:23:09,190 So let me try this. 1665 01:23:09,190 --> 01:23:16,630 If request.method equals equals gets, then Rose get's d.b. 1666 01:23:16,630 --> 01:23:20,980 Execute, select star from registrants. 1667 01:23:20,980 --> 01:23:28,720 And then I'm going to go ahead and return the templates, render template, 1668 01:23:28,720 --> 01:23:31,520 unregister.hteml. 1669 01:23:31,520 --> 01:23:36,680 And pass in those registrants as those rows. 1670 01:23:36,680 --> 01:23:37,810 Actually, c with. 1671 01:23:37,810 --> 01:23:42,030 Lith requests that method equals equals post. 1672 01:23:42,030 --> 01:23:50,940 What I want to do now, I want to do if request.form, quote unquote "ID." 1673 01:23:50,940 --> 01:23:55,060 So if there is, in fact, an ID passed in-- more on that in a moment-- 1674 01:23:55,060 --> 01:23:57,510 I want to go ahead and execute-- hmm, where is this going? 1675 01:23:57,510 --> 01:23:59,500 Let me go ahead and see the template first. 1676 01:23:59,500 --> 01:24:00,710 I think we need to see this. 1677 01:24:00,710 --> 01:24:01,870 Let me go into froshims2. 1678 01:24:01,870 --> 01:24:03,680 And I whipped this one up in advance. 1679 01:24:03,680 --> 01:24:07,140 unregister.html-- let's go ahead and do this. 1680 01:24:07,140 --> 01:24:10,810 So on registrants.html, I just spit out an unordered list 1681 01:24:10,810 --> 01:24:12,100 of all the registrants. 1682 01:24:12,100 --> 01:24:16,260 In unregister.html, I'm going to add a little bit more. 1683 01:24:16,260 --> 01:24:18,680 Inside of each of those list items, I'm going 1684 01:24:18,680 --> 01:24:24,780 to have a forms input field-- input element-- whose name is going to be ID. 1685 01:24:24,780 --> 01:24:26,530 I've just hard-coded that, because this is 1686 01:24:26,530 --> 01:24:30,200 going to represent the ID of the student I want to unregister. 1687 01:24:30,200 --> 01:24:32,190 The type of this input is going to be radio. 1688 01:24:32,190 --> 01:24:37,250 So these are mutually exclusive circles that you can essentially toggle. 1689 01:24:37,250 --> 01:24:40,830 Value of this is going to be whatever this current registrant's ID is. 1690 01:24:40,830 --> 01:24:43,640 And then I'm going to go ahead and put this registrant's name 1691 01:24:43,640 --> 01:24:45,314 and dorm just as before. 1692 01:24:45,314 --> 01:24:47,230 So we'll see what this looks like in a moment. 1693 01:24:47,230 --> 01:24:50,950 But notice I've also added two other lines up here. 1694 01:24:50,950 --> 01:24:56,270 I've specified I'm inside of a form, the URL for which is unregister. 1695 01:24:56,270 --> 01:24:59,670 So we've seen this before. url_for() is just a function that comes with Flask 1696 01:24:59,670 --> 01:25:03,550 that helps you dynamically figure out what the actual URL should be 1697 01:25:03,550 --> 01:25:05,930 for the method called unregister(). 1698 01:25:05,930 --> 01:25:08,360 So that's why, if you adopt some nice naming conventions, 1699 01:25:08,360 --> 01:25:12,590 you can use tricks like this, and Flask will just figure out to what URL 1700 01:25:12,590 --> 01:25:13,840 this form should be submitted. 1701 01:25:13,840 --> 01:25:15,860 And the method I'm going to use is going to be POST. 1702 01:25:15,860 --> 01:25:18,049 And then down here, notice I have a Submit button. 1703 01:25:18,049 --> 01:25:18,840 Type equals submit. 1704 01:25:18,840 --> 01:25:20,140 Value equals unregister. 1705 01:25:20,140 --> 01:25:21,800 So what does this look like? 1706 01:25:21,800 --> 01:25:22,950 Well, let's see that. 1707 01:25:22,950 --> 01:25:32,500 If I go ahead and visit /unregister, I see an ordered list, just like before. 1708 01:25:32,500 --> 01:25:37,730 And if I played with CSS, I could even get rid of these bullets 1709 01:25:37,730 --> 01:25:42,060 altogether, each of which has a radio button next to it, one of which 1710 01:25:42,060 --> 01:25:45,912 I can click in order to delete this user from my database. 1711 01:25:45,912 --> 01:25:46,870 But how does this work? 1712 01:25:46,870 --> 01:25:50,580 Let me go ahead in Chrome now, and inspect, and look 1713 01:25:50,580 --> 01:25:54,730 under elements at one such row. 1714 01:25:54,730 --> 01:26:00,440 And notice the HTML that Python, and in turn, my Jinja template 1715 01:26:00,440 --> 01:26:02,380 has dynamically output it for me. 1716 01:26:02,380 --> 01:26:04,630 Here's my page's HTML. 1717 01:26:04,630 --> 01:26:08,509 If I expand this list item, I see input name equals ID, and I hard-coded that. 1718 01:26:08,509 --> 01:26:10,050 Type equals radio, I hard-coded that. 1719 01:26:10,050 --> 01:26:13,652 Value equals-- this was registrant.id. 1720 01:26:13,652 --> 01:26:18,530 So in this Jinja for loop where I'm spitting out one registrant at a time, 1721 01:26:18,530 --> 01:26:20,610 this is where I was dynamically spinning out 1722 01:26:20,610 --> 01:26:23,540 Maria, and Andi, and now Stelios' ID. 1723 01:26:23,540 --> 01:26:30,360 So if I expand, the LIs above will see that, ahh, Andi has a value of 2, 1724 01:26:30,360 --> 01:26:31,640 because her ID is 2. 1725 01:26:31,640 --> 01:26:35,650 And Maria has an ID of 1, because her ID was 1. 1726 01:26:35,650 --> 01:26:38,160 Meanwhile, Stelios from Canada is just text. 1727 01:26:38,160 --> 01:26:41,530 So if I select him and then click this button, 1728 01:26:41,530 --> 01:26:43,200 we need to be able to handle that. 1729 01:26:43,200 --> 01:26:44,630 So what do I want to do? 1730 01:26:44,630 --> 01:26:49,710 Let me go back into CS50 IDE, into application.py. 1731 01:26:49,710 --> 01:26:50,880 And here we go. 1732 01:26:50,880 --> 01:26:55,350 If the form was submitted via POST, as by clicking that Submit button, 1733 01:26:55,350 --> 01:26:58,810 and if there is indeed an ID in the form-- 1734 01:26:58,810 --> 01:27:01,930 so the user actually did select one of those radio buttons, 1735 01:27:01,930 --> 01:27:06,840 and therefore, there's some actual work to do-- DELETE FROM registrants 1736 01:27:06,840 --> 01:27:13,040 WHERE ID equals colon ID, where, again, colon ID is just my placeholder. 1737 01:27:13,040 --> 01:27:18,559 The value I want to plug in for ID is request.form quote unquote "ID." 1738 01:27:18,559 --> 01:27:19,600 Where did that come from? 1739 01:27:19,600 --> 01:27:26,740 Again, if we go back to my HTML, notice that these radio buttons are all 1740 01:27:26,740 --> 01:27:28,530 called "ID." 1741 01:27:28,530 --> 01:27:32,280 And because they are mutually exclusive by definition of a radio button, 1742 01:27:32,280 --> 01:27:35,127 only one ID will be submitted if one of these boxes is checked, 1743 01:27:35,127 --> 01:27:36,460 one of these circles is checked. 1744 01:27:36,460 --> 01:27:40,070 And that value submitted will be 1, 2, or 3. 1745 01:27:40,070 --> 01:27:42,940 So to confirm, here's our page with unregister, 1746 01:27:42,940 --> 01:27:45,120 and all three students are still in there. 1747 01:27:45,120 --> 01:27:47,010 Here is /registrants. 1748 01:27:47,010 --> 01:27:48,300 All three students are there. 1749 01:27:48,300 --> 01:27:50,830 Although this is uneditable, let's go ahead 1750 01:27:50,830 --> 01:27:54,935 and unregister Stelios by clicking on Register. 1751 01:27:54,935 --> 01:27:57,560 And it would seem that he's indeed gone from the bulleted list. 1752 01:27:57,560 --> 01:27:59,210 Let's go ahead and check phpLiteAdmin. 1753 01:27:59,210 --> 01:28:01,350 Let's go ahead and click Browse. 1754 01:28:01,350 --> 01:28:03,622 And Stelios is now gone. 1755 01:28:03,622 --> 01:28:06,580 So this is where everything's finally starting to come together, right? 1756 01:28:06,580 --> 01:28:09,670 In week 6, we talked about HTTP, and parameters, 1757 01:28:09,670 --> 01:28:11,590 and how HTML and CSS worked. 1758 01:28:11,590 --> 01:28:13,331 But it was largely static and hard-coded, 1759 01:28:13,331 --> 01:28:15,330 and we were just playing around with fake Google 1760 01:28:15,330 --> 01:28:16,890 and implementing our own front end. 1761 01:28:16,890 --> 01:28:20,060 But now I'm using forms again, and using them not only 1762 01:28:20,060 --> 01:28:23,860 to create an interactive UI-- user interface-- for users, 1763 01:28:23,860 --> 01:28:26,770 I'm also now implementing the back end, the server, 1764 01:28:26,770 --> 01:28:32,530 the routes that are capable of getting those HTTP parameters' 1765 01:28:32,530 --> 01:28:37,070 values as with request.form-- or other mechanisms if they come in via GET, 1766 01:28:37,070 --> 01:28:38,990 a different syntax altogether. 1767 01:28:38,990 --> 01:28:41,470 I can get those values and then do something with them 1768 01:28:41,470 --> 01:28:44,340 by combining those values with SQL code. 1769 01:28:44,340 --> 01:28:46,070 And so the last line here that I actually 1770 01:28:46,070 --> 01:28:48,070 should have included for good measure-- I simply 1771 01:28:48,070 --> 01:28:52,630 ran the code that I had pre-written in advance so that I didn't mess up. 1772 01:28:52,630 --> 01:28:54,362 Let me go ahead and do this. 1773 01:28:54,362 --> 01:28:56,320 This is the one line that was technically there 1774 01:28:56,320 --> 01:29:00,480 when I just ran that code even though you didn't see it until just now. 1775 01:29:00,480 --> 01:29:04,460 After all this, I have decided, just to keep the UI pretty simple, 1776 01:29:04,460 --> 01:29:07,036 after trying to delete someone, just go ahead and redirect, 1777 01:29:07,036 --> 01:29:09,410 not to the unregister page again, but to the registrants. 1778 01:29:09,410 --> 01:29:12,060 And this is why, after deleting Stelios, I immediately 1779 01:29:12,060 --> 01:29:15,060 saw a new bulleted list with just Marie and Andi, 1780 01:29:15,060 --> 01:29:20,570 because I redirected the user to the route for registrants, a.k.a. 1781 01:29:20,570 --> 01:29:22,860 /registrants. 1782 01:29:22,860 --> 01:29:25,240 So again, this is where everything's coming together. 1783 01:29:25,240 --> 01:29:26,800 And it's a lot to absorb all at once. 1784 01:29:26,800 --> 01:29:28,767 Because, my god, we had HTML and CSS. 1785 01:29:28,767 --> 01:29:29,850 Then we introduced Python. 1786 01:29:29,850 --> 01:29:30,640 Now we introduced SQL. 1787 01:29:30,640 --> 01:29:32,110 Then, we have Jinja and the templating language. 1788 01:29:32,110 --> 01:29:33,780 And now all of this comes together. 1789 01:29:33,780 --> 01:29:36,540 But again, if you go back in diving into all 1790 01:29:36,540 --> 01:29:40,250 of this, first principles and the definitions of each of these, HTML, 1791 01:29:40,250 --> 01:29:43,790 it's just the markup language that lets us lay out and format a web page. 1792 01:29:43,790 --> 01:29:46,970 HTTP is just the language-- or the protocol, 1793 01:29:46,970 --> 01:29:51,180 technically-- via which web browsers and servers intercommunicate. 1794 01:29:51,180 --> 01:29:53,220 Python, of course, is a higher-level language. 1795 01:29:53,220 --> 01:29:54,280 It's an alternative to C. 1796 01:29:54,280 --> 01:29:56,990 And it seems to come with a whole bunch of useful functionality 1797 01:29:56,990 --> 01:30:01,860 that, thanks to frameworks, or micro-frameworks like Flask, 1798 01:30:01,860 --> 01:30:06,610 make it relatively easy to get real work done with relatively few lines of code. 1799 01:30:06,610 --> 01:30:08,410 There's a learning curve, to be sure. 1800 01:30:08,410 --> 01:30:11,290 But you know, this is kind of impressive, that with just a dozen 1801 01:30:11,290 --> 01:30:14,240 or two lines of code, I've implemented the beginnings 1802 01:30:14,240 --> 01:30:16,260 of a web-based application by which students 1803 01:30:16,260 --> 01:30:18,560 can register for sports, unregister for sports, 1804 01:30:18,560 --> 01:30:20,599 see who's registered for sports. 1805 01:30:20,599 --> 01:30:23,640 You know, I might just need to add some logins, and a few other features, 1806 01:30:23,640 --> 01:30:25,740 and definitely, some prettier aesthetics. 1807 01:30:25,740 --> 01:30:29,330 But that's a lot of functionality packed into just a few lines. 1808 01:30:29,330 --> 01:30:33,790 And now that we have SQL today and we have a function like db.execute() that 1809 01:30:33,790 --> 01:30:37,180 allows me to execute SQL inside of my Python code, 1810 01:30:37,180 --> 01:30:41,160 now we have the ability to store data long-term, to access it, search it. 1811 01:30:41,160 --> 01:30:42,740 And we're just using small data sets. 1812 01:30:42,740 --> 01:30:45,430 I could store thousands, tens of thousands of rows 1813 01:30:45,430 --> 01:30:47,370 and use these same principles, especially 1814 01:30:47,370 --> 01:30:51,160 for data science applications, analytics, analyzing corpuses of text. 1815 01:30:51,160 --> 01:30:54,440 So many possible applications now. 1816 01:30:54,440 --> 01:30:58,120 And you know what, if we will go just one level deeper, 1817 01:30:58,120 --> 01:31:01,872 it turns out that while having programming chops with SQL, 1818 01:31:01,872 --> 01:31:04,580 and knowing SELECT, and INSERT, and DELETE, and JOIN, and CREATE, 1819 01:31:04,580 --> 01:31:07,246 and all of the various keywords we've started to play with today 1820 01:31:07,246 --> 01:31:10,189 and scratched the surface of-- super useful and super powerful 1821 01:31:10,189 --> 01:31:12,730 when you want to analyze your own data, or your own company's 1822 01:31:12,730 --> 01:31:15,010 data, or your own thesis' data, or the like-- 1823 01:31:15,010 --> 01:31:18,800 it turns out that eventually, you even outgrow that level of interest 1824 01:31:18,800 --> 01:31:19,300 typically. 1825 01:31:19,300 --> 01:31:23,180 And an additional layer of abstraction is often helpful. 1826 01:31:23,180 --> 01:31:26,460 And so another feature you get with frameworks 1827 01:31:26,460 --> 01:31:30,320 like Flask is what are called models-- literally, models. 1828 01:31:30,320 --> 01:31:32,160 So you recall that we've been talking about, 1829 01:31:32,160 --> 01:31:37,730 in general, MVC, whereby we have models, and views, and controllers. 1830 01:31:37,730 --> 01:31:40,910 Well, in this model, we have been interacting 1831 01:31:40,910 --> 01:31:43,140 with our data via low-level SQL. 1832 01:31:43,140 --> 01:31:45,230 It's new, for sure, today. 1833 01:31:45,230 --> 01:31:47,670 But it turns out that once you get comfortable with SQL, 1834 01:31:47,670 --> 01:31:50,230 and so long as you adopt certain conventions of giving 1835 01:31:50,230 --> 01:31:53,290 all of your tables an ID field-- and if you have foreign keys, 1836 01:31:53,290 --> 01:31:56,520 it's something, underscore, ID-- where you generally 1837 01:31:56,520 --> 01:32:01,080 adhere to certain conventions and adhere to a framework's requirements, 1838 01:32:01,080 --> 01:32:04,000 it turns out you can do things like this. 1839 01:32:04,000 --> 01:32:08,590 In froshims3, we have essentially the same files, 1840 01:32:08,590 --> 01:32:12,970 except that we've changed the application.py 1841 01:32:12,970 --> 01:32:14,340 to be a little bit different. 1842 01:32:14,340 --> 01:32:20,000 We're not using any of CS50's package or module now, 1843 01:32:20,000 --> 01:32:22,680 so we're not using db.execute(). 1844 01:32:22,680 --> 01:32:25,750 And in fact, you don't technically need to use that to access SQL. 1845 01:32:25,750 --> 01:32:28,720 Our single-function execute just makes it much easier 1846 01:32:28,720 --> 01:32:31,010 to get back lists of dictionaries as opposed 1847 01:32:31,010 --> 01:32:34,380 to executing multiple lines of code as you could do with a library called 1848 01:32:34,380 --> 01:32:39,490 SQLAlchemy or Postgres' own-- or rather, SQLite's own-- driver 1849 01:32:39,490 --> 01:32:41,800 in the world of Python. 1850 01:32:41,800 --> 01:32:46,230 But I'm going to add a few lines here using another library called 1851 01:32:46,230 --> 01:32:49,720 Flask-SQLAlchemy, which was pre-installed, or will 1852 01:32:49,720 --> 01:32:52,220 be pre-installed, for you in CS50 IDE. 1853 01:32:52,220 --> 01:32:54,760 There's a few lines that I had to copy and paste earlier, 1854 01:32:54,760 --> 01:32:56,759 from the documentation, to get it to work right. 1855 01:32:56,759 --> 01:32:58,960 But notice, this is, perhaps, the familiar line. 1856 01:32:58,960 --> 01:33:01,890 Instead of froshims2, it's now forshims3.db. 1857 01:33:01,890 --> 01:33:05,150 But notice, at the end, I get another db object. 1858 01:33:05,150 --> 01:33:06,290 It's not CS50's. 1859 01:33:06,290 --> 01:33:08,580 It now belongs to the author-- or it was created 1860 01:33:08,580 --> 01:33:10,440 by the author-- of this library. 1861 01:33:10,440 --> 01:33:13,550 But it turns out you can do some pretty cool things as follows. 1862 01:33:13,550 --> 01:33:18,285 Recall, from last week, that Python supports classes much like C 1863 01:33:18,285 --> 01:33:19,160 supports structures. 1864 01:33:19,160 --> 01:33:22,710 And inside of classes can go properties, or pieces of data, 1865 01:33:22,710 --> 01:33:24,820 as well as methods, or functions. 1866 01:33:24,820 --> 01:33:28,880 So it turns out that we can define, using Flask 1867 01:33:28,880 --> 01:33:33,820 and using libraries or frameworks like it, what's called an Object Relational 1868 01:33:33,820 --> 01:33:35,610 Mapper, or ORM. 1869 01:33:35,610 --> 01:33:37,870 And this is just a fancy way of saying, if you 1870 01:33:37,870 --> 01:33:40,820 don't want to think about your data as rows and columns, 1871 01:33:40,820 --> 01:33:43,070 which we have been all of today, ultimately, 1872 01:33:43,070 --> 01:33:45,420 whether in spreadsheet form or database form, 1873 01:33:45,420 --> 01:33:49,070 you'd really like to think of a registrant for a freshman intramural 1874 01:33:49,070 --> 01:33:53,680 sport as an entity, as an object of some class, well, you can do that. 1875 01:33:53,680 --> 01:33:55,930 You can declare a class called registrant 1876 01:33:55,930 --> 01:33:58,650 and have it extend the db.Model. 1877 01:33:58,650 --> 01:34:02,860 So this is another class that comes with Flask that we are now inheriting from, 1878 01:34:02,860 --> 01:34:03,540 so to speak. 1879 01:34:03,540 --> 01:34:06,070 So this is truly now object oriented programming. 1880 01:34:06,070 --> 01:34:11,810 We are specifying, via __tablename, that the SQL table to which this class 1881 01:34:11,810 --> 01:34:14,540 should map is going to be called registrants. 1882 01:34:14,540 --> 01:34:19,030 And ultimately, this class, registrants, when it is instantiated 1883 01:34:19,030 --> 01:34:24,960 is going to give me an object that represents a row in that table. 1884 01:34:24,960 --> 01:34:28,910 And this object is going to have an ID, a name, and a dorm, 1885 01:34:28,910 --> 01:34:29,950 as we've been doing. 1886 01:34:29,950 --> 01:34:35,890 And notice here, using SQLAlchemy-- so using this library from Python-- 1887 01:34:35,890 --> 01:34:39,240 I am declaring a column called ID that's going to be of type integer. 1888 01:34:39,240 --> 01:34:43,090 And yes, it's true that it's the primary key, whereas name and dorm are just 1889 01:34:43,090 --> 01:34:44,490 going to be text. 1890 01:34:44,490 --> 01:34:47,760 Because what you can do with SQLAlchemy, and with ORMs, 1891 01:34:47,760 --> 01:34:52,560 more generally, is you can specify, in Python code, or whatever language, 1892 01:34:52,560 --> 01:34:56,720 what your database looks like and what your data therein 1893 01:34:56,720 --> 01:35:01,570 looks like without actually writing raw SQL queries so to speak. 1894 01:35:01,570 --> 01:35:04,240 So even though we've just introduced SQL syntax, 1895 01:35:04,240 --> 01:35:07,740 you can eventually take off that layer altogether and build 1896 01:35:07,740 --> 01:35:10,480 on top of it using objects like this. 1897 01:35:10,480 --> 01:35:13,920 And now, as an aside, this is a constructor or an initialization method 1898 01:35:13,920 --> 01:35:15,907 that you might recall from past problems. 1899 01:35:15,907 --> 01:35:17,740 But let's focus, a little later in the code, 1900 01:35:17,740 --> 01:35:20,580 on why this is actually compelling. 1901 01:35:20,580 --> 01:35:24,690 If we scroll down later in the code, like, to my register route, 1902 01:35:24,690 --> 01:35:27,950 there is no SQL in this implementation of froshims3. 1903 01:35:27,950 --> 01:35:28,680 3 1904 01:35:28,680 --> 01:35:30,340 The first few lines are the same. 1905 01:35:30,340 --> 01:35:33,560 Indeed, only once we get down to this line here 1906 01:35:33,560 --> 01:35:37,740 do we have registrant as a variable, registrant as the class name. 1907 01:35:37,740 --> 01:35:41,390 And we're passing in, apparently, the name and the dorm 1908 01:35:41,390 --> 01:35:43,320 that came from the HTTP request. 1909 01:35:43,320 --> 01:35:46,634 And we're passing those into the registrant class. 1910 01:35:46,634 --> 01:35:49,300 If you recall how classes worked, if they have an init() method, 1911 01:35:49,300 --> 01:35:52,960 you can pass in some default values, name and dorm in this case. 1912 01:35:52,960 --> 01:35:58,590 And that's how we are creating a registrant object of type registrant. 1913 01:35:58,590 --> 01:36:01,690 And now notice what we can do here, db.sessios.add(). 1914 01:36:01,690 --> 01:36:03,980 So this adds to my database sessions, so to speak, 1915 01:36:03,980 --> 01:36:06,990 another feature you get from this particular library. 1916 01:36:06,990 --> 01:36:08,810 We're going to add that registrant. 1917 01:36:08,810 --> 01:36:12,560 And then wonderfully, we're going to commit that registrant. 1918 01:36:12,560 --> 01:36:16,440 In other words, we have created a variable in memory-- specifically, 1919 01:36:16,440 --> 01:36:21,160 called registrant-- who's type, who's data type, is Registrant, capital R. 1920 01:36:21,160 --> 01:36:25,360 And that's simply a class, inside of which is name and dorm. 1921 01:36:25,360 --> 01:36:26,480 There's no ID yet. 1922 01:36:26,480 --> 01:36:31,100 But what's really cool about this ORM is that when you call add and then commit, 1923 01:36:31,100 --> 01:36:34,820 that's like putting it in a database and then hitting Save. 1924 01:36:34,820 --> 01:36:39,730 And the database, because of the code we find up here, 1925 01:36:39,730 --> 01:36:44,420 is going to automatically insert that ID for us. 1926 01:36:44,420 --> 01:36:49,020 And meanwhile, this object now, registrant, is actually going to have, 1927 01:36:49,020 --> 01:36:52,660 inside of it, the ID that was stored in the database. 1928 01:36:52,660 --> 01:36:54,410 So I don't have to worry about insertions. 1929 01:36:54,410 --> 01:36:56,535 I don't have to worry about any updates or deletes. 1930 01:36:56,535 --> 01:37:00,740 I can interact with my data now, completely at a higher level, in Python 1931 01:37:00,740 --> 01:37:05,130 alone and leave it to the ORM-- SQLAlchemy 1932 01:37:05,130 --> 01:37:09,420 here-- to actually do the creation of the SQL statements. 1933 01:37:09,420 --> 01:37:12,140 And if you look later in here, if you'd like to play around, 1934 01:37:12,140 --> 01:37:15,140 you'll see that we've rewritten registrants and unregister as well 1935 01:37:15,140 --> 01:37:18,610 to use SQLAlchemy as opposed to raw SQL queries. 1936 01:37:18,610 --> 01:37:21,370 And for instance, here is how you can get all the registrants 1937 01:37:21,370 --> 01:37:23,240 in your database if using an ORM. 1938 01:37:23,240 --> 01:37:25,860 Instead of doing SELECT* FROM registrants, you can just say, 1939 01:37:25,860 --> 01:37:29,750 hey registrant class, give me a query for all of my data. 1940 01:37:29,750 --> 01:37:32,730 And what you get back is a whole bunch of rows, which, as before, 1941 01:37:32,730 --> 01:37:34,210 we can pass into our template. 1942 01:37:34,210 --> 01:37:36,746 Down here, meanwhile, we can request all of those rows 1943 01:37:36,746 --> 01:37:38,370 again and pass them into that template. 1944 01:37:38,370 --> 01:37:40,290 Or here, notice what we can do. 1945 01:37:40,290 --> 01:37:43,430 If we've been passed the ID of a registrant 1946 01:37:43,430 --> 01:37:47,670 like Stelios', we can say, hey registrant class, give me a query, 1947 01:37:47,670 --> 01:37:52,750 but filter that query so that the registrant ID I care about equals 1948 01:37:52,750 --> 01:37:54,360 the one I was passed via HTTP. 1949 01:37:54,360 --> 01:37:56,930 Oh, and by the way, once you find that in my database, 1950 01:37:56,930 --> 01:38:01,950 call the dot delete method to just get rid of Stelios from the database. 1951 01:38:01,950 --> 01:38:04,510 So again, we might not necessarily be solving a problem 1952 01:38:04,510 --> 01:38:07,289 or scratching an itch just yet, especially since SQL itself 1953 01:38:07,289 --> 01:38:08,580 is, odds are, quite new to you. 1954 01:38:08,580 --> 01:38:10,190 And so we've already solved the problem in one way. 1955 01:38:10,190 --> 01:38:11,490 Here's another way to solve it. 1956 01:38:11,490 --> 01:38:16,980 But realize that, eventually, it's fair to say that you find 1957 01:38:16,980 --> 01:38:18,616 writing SQL queries sometimes tedious. 1958 01:38:18,616 --> 01:38:21,240 Though frankly, you'll get a lot more control, and potentially, 1959 01:38:21,240 --> 01:38:23,600 more performance out of them if writing them yourself. 1960 01:38:23,600 --> 01:38:26,560 And so having the best of both worlds is perhaps the best takeaway 1961 01:38:26,560 --> 01:38:29,380 here-- actually understanding what's going on underneath the hood, 1962 01:38:29,380 --> 01:38:32,487 as was the entire point of our spending so much time in C, 1963 01:38:32,487 --> 01:38:34,570 and understanding how you can execute SQL queries, 1964 01:38:34,570 --> 01:38:38,360 but realizing, down the road, especially if you find that, wow, it's really 1965 01:38:38,360 --> 01:38:41,250 getting a little slow to write all these low level SQL queries, 1966 01:38:41,250 --> 01:38:43,710 wouldn't it be nice to just create my database schema 1967 01:38:43,710 --> 01:38:46,360 as I did earlier with phpLiteAdmin or at the command line 1968 01:38:46,360 --> 01:38:49,770 and then let my library code figure out how 1969 01:38:49,770 --> 01:38:52,600 to get data in and out for me, albeit perhaps at a performance 1970 01:38:52,600 --> 01:38:54,810 penalty, that's a nice place to get to. 1971 01:38:54,810 --> 01:38:58,150 So again, even now that we're in week 9, and we've 1972 01:38:58,150 --> 01:39:02,990 abstracted so far away from week 0s and 1s, 1973 01:39:02,990 --> 01:39:07,490 is there still this progression and this onward march of abstraction 1974 01:39:07,490 --> 01:39:10,740 as the world gets more and more familiar with solving problems and starts 1975 01:39:10,740 --> 01:39:14,630 to realize best designs for doing so? 1976 01:39:14,630 --> 01:39:18,550 But it's not all fine, and good, and safe. 1977 01:39:18,550 --> 01:39:23,120 In fact, let's make note of perhaps one of the most tragically common mistakes 1978 01:39:23,120 --> 01:39:25,850 people make when using SQL. 1979 01:39:25,850 --> 01:39:27,850 And indeed, one of the reasons to use things 1980 01:39:27,850 --> 01:39:31,540 like libraries like CS50's library or even higher-level, fancier 1981 01:39:31,540 --> 01:39:35,530 libraries like SQLAlchemy, is to avoid these kinds of threats. 1982 01:39:35,530 --> 01:39:37,960 And yet many people and many sites still suffer 1983 01:39:37,960 --> 01:39:42,200 from what are called SQL injection attacks, for instance. 1984 01:39:42,200 --> 01:39:43,165 So what does this mean? 1985 01:39:43,165 --> 01:39:47,010 Well, probably, a few times a week, you log in with your Yale NetID 1986 01:39:47,010 --> 01:39:52,150 or with your HarvardKey, which gives you forms like this or like this. 1987 01:39:52,150 --> 01:39:54,430 And ultimately, you're providing. 1988 01:39:54,430 --> 01:39:56,260 simply, a username and a password. 1989 01:39:56,260 --> 01:39:58,430 But suppose that, for the sake of discussion, 1990 01:39:58,430 --> 01:40:03,280 the HarvardKey system were implemented on the back end with Python using SQL. 1991 01:40:03,280 --> 01:40:05,660 And how, then, do we implement logins? 1992 01:40:05,660 --> 01:40:09,640 Well, when I give Harvard or Yale my login name or my NetID 1993 01:40:09,640 --> 01:40:11,820 and then my password, well, what are they doing? 1994 01:40:11,820 --> 01:40:14,490 They probably have, each of them, a users 1995 01:40:14,490 --> 01:40:18,763 table if they're using SQL, whether it's a SQLite, or Oracle, or MySQL, 1996 01:40:18,763 --> 01:40:20,600 or Postgres, or whatever. 1997 01:40:20,600 --> 01:40:25,690 And they probably have written code somewhere in that login site that says 1998 01:40:25,690 --> 01:40:31,070 SELECT* FROM users WHERE username equals whatever they typed in AND password 1999 01:40:31,070 --> 01:40:32,610 equals whatever they typed in. 2000 01:40:32,610 --> 01:40:37,080 And if that gives you back a row representing David, for instance, 2001 01:40:37,080 --> 01:40:39,314 then you know that he or she has logged in correctly, 2002 01:40:39,314 --> 01:40:42,480 because you wouldn't have found the row if the username and password weren't 2003 01:40:42,480 --> 01:40:43,460 in the database. 2004 01:40:43,460 --> 01:40:46,014 Now, it turns out fancier things are done with the password. 2005 01:40:46,014 --> 01:40:48,680 You probably don't want to store users' passwords in plain text, 2006 01:40:48,680 --> 01:40:51,060 so to speak, borrowing language from week 2. 2007 01:40:51,060 --> 01:40:54,650 Rather, you want to store ciphertext or some kind of hashed value 2008 01:40:54,650 --> 01:40:58,220 so that even if, in the worst case, your database is compromised or stolen, 2009 01:40:58,220 --> 01:41:02,050 no one in the real world actually sees your users' passwords but only 2010 01:41:02,050 --> 01:41:05,710 some cryptic-looking hashes thereof, which at least raises 2011 01:41:05,710 --> 01:41:08,230 the bar to exploiting your account. 2012 01:41:08,230 --> 01:41:12,610 But if they're using SQL-- this is what worries me-- they are, at some point, 2013 01:41:12,610 --> 01:41:15,430 taking what I, a human, typed in-- hopefully a good guy, 2014 01:41:15,430 --> 01:41:17,930 but could be a bad guy, typed in-- to their website 2015 01:41:17,930 --> 01:41:20,260 and plugging it into a SQL query. 2016 01:41:20,260 --> 01:41:23,310 Because they're not just going to necessarily do SELECT* FROM users. 2017 01:41:23,310 --> 01:41:26,560 They might actually say, SELECT* FROM users WHERE username equals such 2018 01:41:26,560 --> 01:41:28,890 and such AND password equals such and such, 2019 01:41:28,890 --> 01:41:31,470 or at least one of those predicates. 2020 01:41:31,470 --> 01:41:36,250 So what if you've done things poorly in code? 2021 01:41:36,250 --> 01:41:39,820 And suppose that Harvard had implemented it in such a way 2022 01:41:39,820 --> 01:41:44,420 that this simple-looking, stupid-looking query is actually a really big threat? 2023 01:41:44,420 --> 01:41:46,170 So I've temporarily turned off the bullets 2024 01:41:46,170 --> 01:41:47,590 that you would normally see in a password form. 2025 01:41:47,590 --> 01:41:48,298 This is not hard. 2026 01:41:48,298 --> 01:41:49,590 It's just an HTML thing. 2027 01:41:49,590 --> 01:41:53,862 And suppose that my email addresses is me@examplemailprovider.com. 2028 01:41:53,862 --> 01:41:57,490 And suppose, for my password, I don't type in 12345, 2029 01:41:57,490 --> 01:41:59,230 or whatever my actual password is. 2030 01:41:59,230 --> 01:42:02,030 I type in, cryptically, ' OR ''1='1. 2031 01:42:02,030 --> 01:42:08,360 2032 01:42:08,360 --> 01:42:09,010 Why this? 2033 01:42:09,010 --> 01:42:10,760 And there's an infinite number of things I 2034 01:42:10,760 --> 01:42:17,070 could type if I am aggressively trying to hack into Harvard or Yale's website. 2035 01:42:17,070 --> 01:42:20,750 But notice this feels like it's part of a logical query. 2036 01:42:20,750 --> 01:42:22,190 It turns out SQL has OR. 2037 01:42:22,190 --> 01:42:24,606 We haven't seen that before, but like Python, it literally 2038 01:42:24,606 --> 01:42:25,730 has the keyword "or." 2039 01:42:25,730 --> 01:42:30,070 And I am assuming, in this case, per some of the examples, 2040 01:42:30,070 --> 01:42:32,740 that maybe the programmer at Harvard or Yale 2041 01:42:32,740 --> 01:42:34,910 has single quotes in his or her code. 2042 01:42:34,910 --> 01:42:42,440 And maybe they are just foolishly, and very riskily, plugging in what I type 2043 01:42:42,440 --> 01:42:43,920 in between those quotes. 2044 01:42:43,920 --> 01:42:45,837 So notice, this is kind of the end of a quote. 2045 01:42:45,837 --> 01:42:48,919 This is the beginning of a quote, but I've not finished the thought there. 2046 01:42:48,919 --> 01:42:50,024 So let's see what happens. 2047 01:42:50,024 --> 01:42:52,440 Suppose that the code on the back end at Harvard or Yale-- 2048 01:42:52,440 --> 01:42:54,750 we're in Python-- this. 2049 01:42:54,750 --> 01:43:00,080 So somewhere in their files, username variable gets request.form username. 2050 01:43:00,080 --> 01:43:01,840 Password gets request.form password. 2051 01:43:01,840 --> 01:43:05,130 So just two variables called username and password, 2052 01:43:05,130 --> 01:43:06,810 just so I have them handy. 2053 01:43:06,810 --> 01:43:08,930 Maybe they're using CS50's library function. 2054 01:43:08,930 --> 01:43:13,460 db.execute( SELECT* FROM users-- and this happens to wrap on two lines, 2055 01:43:13,460 --> 01:43:19,430 but ignore the extra space-- WHERE username equals {} AND password equals 2056 01:43:19,430 --> 01:43:19,930 {}). 2057 01:43:19,930 --> 01:43:24,560 So just like we did last week when printing things in a formatted fashion, 2058 01:43:24,560 --> 01:43:27,110 this was like Python's equivalent of print diff 2059 01:43:27,110 --> 01:43:32,710 using the format method of the string class passing in username and password. 2060 01:43:32,710 --> 01:43:33,920 But this worries me. 2061 01:43:33,920 --> 01:43:38,070 Because if the programmer at Harvard or Yale has literally given me these 2062 01:43:38,070 --> 01:43:43,590 placeholders of '{}', whatever I typed in as my username and my password is 2063 01:43:43,590 --> 01:43:47,210 literally going to go there and there. 2064 01:43:47,210 --> 01:43:51,190 But what if-- oop, oop, oop, oop, typo. 2065 01:43:51,190 --> 01:43:52,970 The whole story breaks without this. 2066 01:43:52,970 --> 01:44:00,430 And there-- but what if the user types in that cryptic-looking string? 2067 01:44:00,430 --> 01:44:03,450 Well, then, what ends up happening is this. 2068 01:44:03,450 --> 01:44:05,320 And it's red because red is bad. 2069 01:44:05,320 --> 01:44:08,640 SELECT* star FROM users WHERE username name equals quote, unquote, 2070 01:44:08,640 --> 01:44:13,380 me@examplemailprovider.com-- no big deal-- AND password equals-- 2071 01:44:13,380 --> 01:44:15,500 this is interesting. 2072 01:44:15,500 --> 01:44:19,800 Previously, there was a single quote and a single quote, 2073 01:44:19,800 --> 01:44:22,410 and then the curly braces in between. 2074 01:44:22,410 --> 01:44:26,600 But underlined here is what I, the adversary, typed in. 2075 01:44:26,600 --> 01:44:32,240 Cryptically, 'OR'1'=1', that's it. 2076 01:44:32,240 --> 01:44:36,530 Notice where the underlining starts and ends. 2077 01:44:36,530 --> 01:44:40,580 But I seem to have maliciously finished that programmer's 2078 01:44:40,580 --> 01:44:41,830 thought at Harvard or Yale. 2079 01:44:41,830 --> 01:44:43,790 I've not finished it in the way they intended. 2080 01:44:43,790 --> 01:44:49,070 But this is syntactically and semantically correct, if a bit strange. 2081 01:44:49,070 --> 01:44:54,170 I am essentially saying, if the password equals nothing or 1 equals 1, 2082 01:44:54,170 --> 01:44:56,420 because I've plugged in those characters in such a way 2083 01:44:56,420 --> 01:45:00,770 that they still make a WHERE clause syntactically accurate. 2084 01:45:00,770 --> 01:45:03,120 1 equals 1 always. 2085 01:45:03,120 --> 01:45:06,700 So this is like saying, SELECT* FROM users WHERE username equals 2086 01:45:06,700 --> 01:45:10,370 me@examplemailprovider and whatever else is the case. 2087 01:45:10,370 --> 01:45:14,730 Like, that is always true, that 1 equals 1. 2088 01:45:14,730 --> 01:45:17,206 So it doesn't even matter what the user's password is 2089 01:45:17,206 --> 01:45:18,580 or that I didn't even type it in. 2090 01:45:18,580 --> 01:45:20,370 I just did quote, unquote. 2091 01:45:20,370 --> 01:45:23,177 All right, well, what if maybe-- OK, so format may be bad. 2092 01:45:23,177 --> 01:45:26,010 Curly braces, like we've been doing in Python for string formatting, 2093 01:45:26,010 --> 01:45:26,790 maybe that's bad. 2094 01:45:26,790 --> 01:45:28,696 Python also has, like Java and JavaScript, 2095 01:45:28,696 --> 01:45:31,070 the ability to concatenate things together, in this case, 2096 01:45:31,070 --> 01:45:32,840 using the plus operator. 2097 01:45:32,840 --> 01:45:36,220 So what if I just go old school and concatenate my strings together 2098 01:45:36,220 --> 01:45:37,260 like this? 2099 01:45:37,260 --> 01:45:40,540 Maybe that's better, noticing single quote here, single quote here. 2100 01:45:40,540 --> 01:45:42,950 And the double quotes are just stopping the string 2101 01:45:42,950 --> 01:45:45,720 while we do this concatenation there and there. 2102 01:45:45,720 --> 01:45:47,570 It's the exact same problem. 2103 01:45:47,570 --> 01:45:50,080 So those instincts don't serve us well at all. 2104 01:45:50,080 --> 01:45:52,250 In the end, we still get, username equals 2105 01:45:52,250 --> 01:45:56,330 whatever I typed in AND password equals this WHERE 1, indeed, again, equals 1. 2106 01:45:56,330 --> 01:45:59,990 So no matter what, this query is going to return a row if that email 2107 01:45:59,990 --> 01:46:04,670 address is in there, probably resulting in the Harvard or Yale code logging 2108 01:46:04,670 --> 01:46:06,490 this malicious user in. 2109 01:46:06,490 --> 01:46:08,060 So there's got to be a better way. 2110 01:46:08,060 --> 01:46:10,680 And indeed, the reason to use libraries in general, 2111 01:46:10,680 --> 01:46:13,090 whether it's CS50's for the next couple of weeks, 2112 01:46:13,090 --> 01:46:16,380 or in the real world, any of dozens of SQL libraries, 2113 01:46:16,380 --> 01:46:20,300 is that other people before you have thought through these threats, 2114 01:46:20,300 --> 01:46:22,760 have written the requisite code, which isn't all that much, 2115 01:46:22,760 --> 01:46:27,030 to notice when there's dangerous queries being injected 2116 01:46:27,030 --> 01:46:28,660 and escape them properly. 2117 01:46:28,660 --> 01:46:32,620 So here's how we would do this with the CS50 db.execute() method. 2118 01:46:32,620 --> 01:46:38,010 SELECT* FROM users WHERE username equals colon username AND password equals 2119 01:46:38,010 --> 01:46:39,990 colon password-- sorry for the type there. 2120 01:46:39,990 --> 01:46:42,140 I'll fix that. 2121 01:46:42,140 --> 01:46:46,150 Bad with spacing today-- AND password equals 2122 01:46:46,150 --> 01:46:52,030 colon password, unquote, passing in username, passing in password. 2123 01:46:52,030 --> 01:46:54,330 And again, you don't repeat the colons here. 2124 01:46:54,330 --> 01:46:58,930 But the key and the key in those named parameters lines up with the key 2125 01:46:58,930 --> 01:47:01,140 and the key that do have the colons. 2126 01:47:01,140 --> 01:47:03,530 And the way the CS50 library works-- and there's not much 2127 01:47:03,530 --> 01:47:04,870 going on in the CS50 library. 2128 01:47:04,870 --> 01:47:06,703 There's not much of a training wheel there-- 2129 01:47:06,703 --> 01:47:10,170 we essentially are simply wrapping that other library I referred to, 2130 01:47:10,170 --> 01:47:14,684 SQLAlchemy, which gives you not only the ORM feature, the Object Relational 2131 01:47:14,684 --> 01:47:17,600 Mapper feature, where you can create your own classes like registrant, 2132 01:47:17,600 --> 01:47:20,360 they also let you, in that library, execute raw SQL. 2133 01:47:20,360 --> 01:47:23,330 And they also take care of all of this escaping. 2134 01:47:23,330 --> 01:47:26,810 So we, the CS50 library, are really just taking your queries, 2135 01:47:26,810 --> 01:47:30,570 passing them to the SQLAlchemy library via raw SQL, getting back the results, 2136 01:47:30,570 --> 01:47:33,480 and just neatening them up, and returning to you only 2137 01:47:33,480 --> 01:47:36,360 a list with dictionaries inside without expecting 2138 01:47:36,360 --> 01:47:38,260 you to execute multiple lines of code. 2139 01:47:38,260 --> 01:47:41,960 And now, in green, is the expected solution here. 2140 01:47:41,960 --> 01:47:45,270 Because of the way the CS50 library works, and in turn, the SQLAlchemy 2141 01:47:45,270 --> 01:47:49,350 library works, even if an adversary types in funky syntax 2142 01:47:49,350 --> 01:47:53,800 like those single quotes trying to trick your database or your Python code 2143 01:47:53,800 --> 01:47:58,980 into executing something malicious, notice what the library has done. 2144 01:47:58,980 --> 01:48:02,800 These backslashes were not in the user's adversarial input. 2145 01:48:02,800 --> 01:48:06,060 They were not in the previous red problematic examples. 2146 01:48:06,060 --> 01:48:09,180 What the CS50 execute() method does for you, 2147 01:48:09,180 --> 01:48:14,270 what the SQLAlchemy library does for you is looks at user input that's been 2148 01:48:14,270 --> 01:48:17,060 plugged in for named parameters and says, whoa-ho, wait a minute. 2149 01:48:17,060 --> 01:48:20,530 If there's a single quote in there, let me escape it with a backslash 2150 01:48:20,530 --> 01:48:24,610 so that the only actual single quotes are the outermost ones. 2151 01:48:24,610 --> 01:48:29,210 And only if the user's password is that thing there that's underlined will 2152 01:48:29,210 --> 01:48:30,100 they actually get in. 2153 01:48:30,100 --> 01:48:34,130 And most likely, it's not going to be something as crazy as that. 2154 01:48:34,130 --> 01:48:37,810 But in light of all this, do you perhaps now appreciate 2155 01:48:37,810 --> 01:48:41,040 what this particular person, perhaps with a little too much free time, 2156 01:48:41,040 --> 01:48:41,830 was trying to do? 2157 01:48:41,830 --> 01:48:44,680 So parked in a parking lot here is this fellow's plate here. 2158 01:48:44,680 --> 01:48:49,020 And this person had taken the time to print out something a little curious. 2159 01:48:49,020 --> 01:48:51,440 Let's enhance. 2160 01:48:51,440 --> 01:48:53,570 What the heck is going on there? 2161 01:48:53,570 --> 01:48:58,240 Well, it turns out that while we've been focusing on the logic of 1 equaling 1, 2162 01:48:58,240 --> 01:49:00,690 the real takeaway of SQL injection attacks 2163 01:49:00,690 --> 01:49:05,960 is that if you can somehow trick a database into executing a line of code 2164 01:49:05,960 --> 01:49:09,390 that you have written-- previously, the only line was, quote, unquote, 2165 01:49:09,390 --> 01:49:10,630 1 equals 1 or what not. 2166 01:49:10,630 --> 01:49:12,530 But suppose that that adversary-- suppose 2167 01:49:12,530 --> 01:49:15,880 I had included a semi-colon in my username 2168 01:49:15,880 --> 01:49:17,580 or a semi-colon in my password. 2169 01:49:17,580 --> 01:49:22,410 And you, the programmer, naively trusted what I was typing in. 2170 01:49:22,410 --> 01:49:27,860 And you simply executed whatever I typed into my username or my password field, 2171 01:49:27,860 --> 01:49:30,050 allowing me to have a semi-colon in there. 2172 01:49:30,050 --> 01:49:34,290 And that semi-colon, as you know, ends one SQL statement and begins a new one. 2173 01:49:34,290 --> 01:49:37,530 Suppose, god forbid, like this person here tried 2174 01:49:37,530 --> 01:49:41,070 to finish his or her license plate in a SQL-like way, 2175 01:49:41,070 --> 01:49:44,490 with a quote here, and commas, whatever those mean, and a semi-colon, 2176 01:49:44,490 --> 01:49:51,536 but then also included a valid SQL command like DROP DATABASE TABLE, which 2177 01:49:51,536 --> 01:49:53,410 is the only thing I cautioned about, earlier, 2178 01:49:53,410 --> 01:49:58,580 being especially bad-- this person was apparently trying to use those traffic 2179 01:49:58,580 --> 01:50:00,730 cameras, which, he or she surmised, might 2180 01:50:00,730 --> 01:50:04,030 have been using SQL as the back end and storing people's license 2181 01:50:04,030 --> 01:50:06,760 plates in that back end and, correctly or incorrectly, 2182 01:50:06,760 --> 01:50:09,500 was hoping that, upon a camera seeing this, 2183 01:50:09,500 --> 01:50:13,120 using Optical Character Recognition, OCR, converting this into text, 2184 01:50:13,120 --> 01:50:19,000 passing that text into a database that might not be scrubbing 2185 01:50:19,000 --> 01:50:21,630 or sanitizing its inputs as we've proposed 2186 01:50:21,630 --> 01:50:27,300 with CS50's library or SQLAlchemy-- was hoping-- that maybe that back end 2187 01:50:27,300 --> 01:50:31,120 database was poorly implemented enough to trust 2188 01:50:31,120 --> 01:50:35,250 what was passed in so that after logging this person for speeding 2189 01:50:35,250 --> 01:50:38,670 or whatever, actually dropped the entire database, covering 2190 01:50:38,670 --> 01:50:41,070 his or her tracks entirely. 2191 01:50:41,070 --> 01:50:45,090 So all this and more is ahead of us as we continue to build, and build, 2192 01:50:45,090 --> 01:50:46,930 and build on top of lessons past. 2193 01:50:46,930 --> 01:50:51,017 And next week, when we introduce one final language, JavaScript, a language 2194 01:50:51,017 --> 01:50:52,850 that you can not only use on the server side 2195 01:50:52,850 --> 01:50:55,190 but also, and especially, on the client side 2196 01:50:55,190 --> 01:50:57,640 to create all the more of an interactive experience 2197 01:50:57,640 --> 01:51:00,500 and all the more of a compelling user experience 2198 01:51:00,500 --> 01:51:04,490 for users using a bit of Python, and SQL, and HTML, and CSS, 2199 01:51:04,490 --> 01:51:08,070 and soon, now, JavaScript, we'll see you then. 2200 01:51:08,070 --> 01:51:11,010 2201 01:51:11,010 --> 01:51:11,990 [MUSIC PLAYING] 2202 01:51:11,990 --> 01:51:19,850 SPEAKER 1: Rosebud-- yeah, actually, now that you mention it, one time, I went 2203 01:51:19,850 --> 01:51:22,692 into his office to look for some forms. 2204 01:51:22,692 --> 01:51:24,180 [KNOCKING ON DOOR] 2205 01:51:24,180 --> 01:51:40,419 2206 01:51:40,419 --> 01:51:41,044 SPEAKER 2: Bud? 2207 01:51:41,044 --> 01:51:44,060 2208 01:51:44,060 --> 01:51:45,960 David always said "pal." 2209 01:51:45,960 --> 01:51:47,238