WEBVTT X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 00:00:09.443 --> 00:00:12.680 SPEAKER: As much as you might understand how the internet works, 00:00:12.680 --> 00:00:17.505 whether it's HTTP that transports data or HTML that is that data, 00:00:17.505 --> 00:00:20.630 and as much as you might understand some of the fundamentals of programming 00:00:20.630 --> 00:00:24.290 like loops and conditions and Boolean expressions, variables, and more, 00:00:24.290 --> 00:00:27.350 it turns out there are so many different ways in which you 00:00:27.350 --> 00:00:29.420 can implement those ideas. 00:00:29.420 --> 00:00:31.850 And so, indeed, when it comes time to actually build 00:00:31.850 --> 00:00:35.240 a website, a web application, a mobile application, 00:00:35.240 --> 00:00:39.020 it turns out that it's rather non-obvious where to begin sometimes 00:00:39.020 --> 00:00:42.050 if only because you have so many options ahead of you. 00:00:42.050 --> 00:00:46.010 And much like the world of clothing and the world of fashion, more generally, 00:00:46.010 --> 00:00:50.150 is constantly evolving such that what is cool and appropriate to use now 00:00:50.150 --> 00:00:54.950 or to wear now might not necessarily be appropriate some months or years hence, 00:00:54.950 --> 00:00:59.540 the same can be said for better or for worse of the technology world insofar 00:00:59.540 --> 00:01:01.250 as humans are constantly innovating. 00:01:01.250 --> 00:01:05.450 Humans are constantly finding fault or opportunities for improvement 00:01:05.450 --> 00:01:08.750 in languages that we've used for years, in server software that we've 00:01:08.750 --> 00:01:11.210 used for years, and intuitively improving on it. 00:01:11.210 --> 00:01:15.050 And so the reality is that staying current with this whole world 00:01:15.050 --> 00:01:19.670 takes some effort even as the fundamentals largely remain constant. 
00:01:19.670 --> 00:01:23.034 And so what we'll try to do here is give you a sense of some of the languages, 00:01:23.034 --> 00:01:24.950 some of the frameworks, some of the libraries, 00:01:24.950 --> 00:01:29.000 some of the overarching design decisions that are both in vogue and here 00:01:29.000 --> 00:01:32.930 to stay right now, as well as take the lid off of some of these technologies 00:01:32.930 --> 00:01:35.420 and give you a better understanding of how 00:01:35.420 --> 00:01:38.930 some of the fundamental types of technologies from which you 00:01:38.930 --> 00:01:41.820 can choose actually work. 00:01:41.820 --> 00:01:46.190 So let's consider for the moment the so-called front end of an application. 00:01:46.190 --> 00:01:49.530 Front end generally refers to that which is facing the user. 00:01:49.530 --> 00:01:53.120 So it's the user interface and more with which the human user typically 00:01:53.120 --> 00:01:53.930 interacts. 00:01:53.930 --> 00:01:57.530 Now, we've discussed, for instance, the world of the web 00:01:57.530 --> 00:02:02.720 and how you might assemble a web-based experience using HTML and CSS and, even 00:02:02.720 --> 00:02:05.090 more dynamically, using JavaScript. 00:02:05.090 --> 00:02:07.760 But that's by using those native languages 00:02:07.760 --> 00:02:10.139 right as they come out of the box, so to speak. 00:02:10.139 --> 00:02:13.700 But it turns out that some tasks are not as easily done 00:02:13.700 --> 00:02:17.030 in those various languages as might be convenient. 00:02:17.030 --> 00:02:20.370 It turns out that there are certain design patterns, 00:02:20.370 --> 00:02:23.660 so to speak, types of code, types of markup, 00:02:23.660 --> 00:02:27.080 types of properties that people have found themselves using again 00:02:27.080 --> 00:02:28.350 and again and again. 
00:02:28.350 --> 00:02:32.510 And so much like you can factor out into your own CSS files and JavaScript 00:02:32.510 --> 00:02:35.810 files code that you want to share across multiple files 00:02:35.810 --> 00:02:39.170 or even across multiple projects, so can-- 00:02:39.170 --> 00:02:42.050 so has the world more generally realized, you know what? 00:02:42.050 --> 00:02:45.800 Maybe I should package up my CSS or my JavaScript in such a way 00:02:45.800 --> 00:02:48.290 that other people can actually use it as well, 00:02:48.290 --> 00:02:52.400 and thus have been born things called libraries, collections of code 00:02:52.400 --> 00:02:56.106 that other people have written that we can all use, often under an open source 00:02:56.106 --> 00:02:58.730 license, which means the code is freely available for the world 00:02:58.730 --> 00:03:03.620 to critique, to use, to adapt sometimes, and contribute back to. 00:03:03.620 --> 00:03:06.350 Now, within the world of the front end, there 00:03:06.350 --> 00:03:08.210 are so many different JavaScript frameworks. 00:03:08.210 --> 00:03:12.390 Indeed, depicted here are just a few of perhaps the most popular right now. 00:03:12.390 --> 00:03:15.380 But even this list is going to grow stale over the coming months, 00:03:15.380 --> 00:03:17.460 certainly over the coming years, and the like. 00:03:17.460 --> 00:03:19.430 And so really rather than dive into the weeds 00:03:19.430 --> 00:03:21.263 of some of these technologies in particular, 00:03:21.263 --> 00:03:23.240 we really aspire to give you just a sense 00:03:23.240 --> 00:03:26.540 of what's current, what should be in your vocabulary perhaps now, 00:03:26.540 --> 00:03:30.350 and perhaps some context when it comes to recruiting engineers or deciding 00:03:30.350 --> 00:03:34.220 among engineers which technologies to build a business on, just 00:03:34.220 --> 00:03:37.070 how current you are, just how dated you are, and the like. 
00:03:37.070 --> 00:03:40.100 But invariably this kind of thing requires some due diligence 00:03:40.100 --> 00:03:42.770 when the time comes to design an actual project. 00:03:42.770 --> 00:03:46.760 Angular, Ember, Meteor, React, Vue, these and more 00:03:46.760 --> 00:03:50.420 are the names for various JavaScript frameworks. 00:03:50.420 --> 00:03:53.400 And a framework is not just a library per se. 00:03:53.400 --> 00:03:56.770 A framework is also typically a way of doing things. 00:03:56.770 --> 00:03:58.760 So a framework includes some code that you 00:03:58.760 --> 00:04:01.910 should integrate into your own projects, whether it's CSS, JavaScript, 00:04:01.910 --> 00:04:05.000 or something more, but it also includes typically 00:04:05.000 --> 00:04:08.210 a way of doing things, a way of naming your own files, 00:04:08.210 --> 00:04:13.350 a way of formatting your files, a way of building ultimately your application. 00:04:13.350 --> 00:04:15.350 And reasonable people, of course, will disagree, 00:04:15.350 --> 00:04:19.430 and so you'll find among several of these frameworks different design 00:04:19.430 --> 00:04:23.990 paradigms, different design beliefs as to the best way to do things. 00:04:23.990 --> 00:04:26.540 And again different-- reasonable people will disagree, 00:04:26.540 --> 00:04:29.900 and so part of the process of choosing these frameworks 00:04:29.900 --> 00:04:32.920 really boils down to what resonates most with you 00:04:32.920 --> 00:04:35.450 or with the engineers with whom you're working. 00:04:35.450 --> 00:04:39.140 And indeed what resonates above all else perhaps 00:04:39.140 --> 00:04:41.570 is what one is most familiar with. 00:04:41.570 --> 00:04:45.860 In fact, it's often the case that you or engineers you're working with simply 00:04:45.860 --> 00:04:49.370 have done a previous project in one of these frameworks but not the others. 
00:04:49.370 --> 00:04:55.460 And so even if that framework is, for some definition of inferior, inferior, 00:04:55.460 --> 00:04:58.760 that might not necessarily be an overriding concern 00:04:58.760 --> 00:05:02.240 if you can actually build your MVP or your prototype 00:05:02.240 --> 00:05:05.420 faster with that particular framework because you know it already 00:05:05.420 --> 00:05:07.580 than if you could build it a little bit better, 00:05:07.580 --> 00:05:11.010 quote unquote, in quotes insofar as reasonable people 00:05:11.010 --> 00:05:14.310 can disagree as to what's inferior or superior in this world, 00:05:14.310 --> 00:05:16.740 if you were to design it using a completely 00:05:16.740 --> 00:05:20.470 new framework for which there's just a non-trivial learning curve for you. 00:05:20.470 --> 00:05:22.746 And so, as in the case of data structures, 00:05:22.746 --> 00:05:25.620 as in the case of algorithms, as in the case of computer science more 00:05:25.620 --> 00:05:29.820 generally, there are these tradeoffs, and human time, developer time, 00:05:29.820 --> 00:05:32.850 learning time is certainly one of the resources 00:05:32.850 --> 00:05:37.170 that you have to decide how much of which you want to spend upfront. 00:05:37.170 --> 00:05:39.420 Meanwhile, in the world of CSS, there are also 00:05:39.420 --> 00:05:44.100 libraries there, collections of CSS files and frameworks 00:05:44.100 --> 00:05:47.910 really, methodologies for which you lay-- via which you lay out your site, 00:05:47.910 --> 00:05:51.760 like Bootstrap, Foundation, Semantic UI, and more, 00:05:51.760 --> 00:05:56.220 and these focus more so on the aesthetics of a user's experience, 00:05:56.220 --> 00:05:58.980 more so on the presentation of information 00:05:58.980 --> 00:06:03.780 and the types of user interface mechanisms, the buttons, the menus, 00:06:03.780 --> 00:06:06.450 the windows, and the like that a user might see on the screen. 
00:06:06.450 --> 00:06:10.890 But here, too, there are so many different wheels 00:06:10.890 --> 00:06:12.420 that have been invented in the past. 00:06:12.420 --> 00:06:14.850 So many different people have decided, you know what? 00:06:14.850 --> 00:06:17.550 That default link on a web page could look much prettier. 00:06:17.550 --> 00:06:21.150 Or that button on a web page could look much better if you used my design. 00:06:21.150 --> 00:06:22.530 And so this is what's happened. 00:06:22.530 --> 00:06:27.630 The world has created and shared with others in the world various files 00:06:27.630 --> 00:06:32.430 that you, either in the context of JavaScript or in CSS or beyond, 00:06:32.430 --> 00:06:35.320 can integrate into your own projects. 00:06:35.320 --> 00:06:37.640 So how to even begin to vet these kinds of things, 00:06:37.640 --> 00:06:39.390 particularly since in a class like this we 00:06:39.390 --> 00:06:41.760 won't go into the weeds of evaluating these 00:06:41.760 --> 00:06:45.130 and even then we might not reach any sort of consensus? 00:06:45.130 --> 00:06:47.780 So the reality is typically relying on the engineers 00:06:47.780 --> 00:06:50.530 with whom you're working is first and foremost the place to start. 00:06:50.530 --> 00:06:51.363 What do people know? 00:06:51.363 --> 00:06:52.690 What are they comfortable with? 00:06:52.690 --> 00:06:53.520 What did they like? 00:06:53.520 --> 00:06:55.200 What did they dislike about some framework? 00:06:55.200 --> 00:06:57.060 Did it actually speed up the work? 00:06:57.060 --> 00:06:58.680 Did it slow down the work? 00:06:58.680 --> 00:07:03.940 Did it create-- did it build up technical debt for them so to speak? 
00:07:03.940 --> 00:07:08.100 For instance, just because something is easy and quick to get started 00:07:08.100 --> 00:07:13.440 with from the get go, is it so easy because it's riddled with poor design 00:07:13.440 --> 00:07:16.470 decisions such that as you get more and more users, 00:07:16.470 --> 00:07:20.430 maybe your application or your website's going to be slower and slower? 00:07:20.430 --> 00:07:23.010 Or maybe it's going to become harder to maintain, 00:07:23.010 --> 00:07:26.040 or it's going to be harder to onboard new people altogether. 00:07:26.040 --> 00:07:27.660 There are various tradeoffs there. 00:07:27.660 --> 00:07:29.776 And so considering what is optimal now, what 00:07:29.776 --> 00:07:32.400 is optimal in the medium term, what is optimal in the long term 00:07:32.400 --> 00:07:35.780 should perhaps be part of that whole conversation. 00:07:35.780 --> 00:07:39.510 Meanwhile, it's certainly a compelling thing from a professional development 00:07:39.510 --> 00:07:42.760 perspective, for keeping things fresh, to actually go and learn something new. 00:07:42.760 --> 00:07:45.316 And so certainly punctuating one's experience in tech, 00:07:45.316 --> 00:07:47.940 should there be an opportunity to both pick up some new skills, 00:07:47.940 --> 00:07:53.040 to familiarize oneself with the latest and greatest and not necessarily change 00:07:53.040 --> 00:07:55.560 direction with each and every fad but generally 00:07:55.560 --> 00:07:59.050 be familiar with some of the trends in industry. 00:07:59.050 --> 00:08:01.200 And there's a bunch of ways with which to do that. 00:08:01.200 --> 00:08:03.900 I mean one, certainly relying on Google and other search engines 00:08:03.900 --> 00:08:06.510 just to get a sense of what the most popular hits are 00:08:06.510 --> 00:08:10.230 or search results when you search for something like popular JavaScript 00:08:10.230 --> 00:08:13.230 framework or some such search string like that. 
00:08:13.230 --> 00:08:17.837 Looking on websites like Hacker News from Y Combinator, where there's 00:08:17.837 --> 00:08:20.170 an active community of folks from the startup community, 00:08:20.170 --> 00:08:23.130 especially talking about these kinds of technical decisions 00:08:23.130 --> 00:08:25.130 and design decisions more generally. 00:08:25.130 --> 00:08:28.440 Websites like Quora or other Q&A websites. 00:08:28.440 --> 00:08:32.220 Looking at GitHub.com, a popular web site where people store their code 00:08:32.220 --> 00:08:35.370 and can actually follow or star other people's 00:08:35.370 --> 00:08:39.630 repositories of code from which you can infer a sense of popularity 00:08:39.630 --> 00:08:43.740 based on how many people are following a framework x or y or z. 00:08:43.740 --> 00:08:46.560 But this is always a moving target, and so it's simply 00:08:46.560 --> 00:08:49.350 part of the conversation to have from the get go. 00:08:49.350 --> 00:08:51.870 And you're not necessarily going to regret a decision 00:08:51.870 --> 00:08:54.600 if you don't necessarily pick the most trendy 00:08:54.600 --> 00:08:59.250 or the one that's poised to take over all others 00:08:59.250 --> 00:09:00.840 because this is a fast changing world. 00:09:00.840 --> 00:09:03.330 And, in fact, one of the most frustrating if not expensive 00:09:03.330 --> 00:09:06.510 aspects of this world is just how quickly it changes. 00:09:06.510 --> 00:09:09.780 And so what you design today might not be what you design tomorrow, 00:09:09.780 --> 00:09:12.940 but that's also part of the excitement of this space. 00:09:12.940 --> 00:09:18.060 So with that said, that's just a glance at what the front end design process 00:09:18.060 --> 00:09:19.740 or decision process might be like. 00:09:19.740 --> 00:09:23.520 Let's take a look now at the back end, at least in the context of languages. 
00:09:23.520 --> 00:09:26.880 So here you have an even longer list because at least in the front end 00:09:26.880 --> 00:09:29.580 world, recall that the de facto standard is 00:09:29.580 --> 00:09:33.570 to use JavaScript in the user facing web browser experience, 00:09:33.570 --> 00:09:38.490 but on the back end, on the servers from which the HTML and the CSS and even 00:09:38.490 --> 00:09:42.900 the JavaScript are ultimately coming, you have so many more design decisions. 00:09:42.900 --> 00:09:47.040 So you have languages like Go and Java, JavaScript, .NET, PHP, Python, Ruby, 00:09:47.040 --> 00:09:48.850 Scala, and so many others. 00:09:48.850 --> 00:09:52.920 These are perhaps just a few of the most popular these days. 00:09:52.920 --> 00:09:55.020 And all of them have their pluses and minuses. 00:09:55.020 --> 00:09:57.810 All of them have their supporters and detractors. 00:09:57.810 --> 00:10:00.510 And all of them have folks who already know them 00:10:00.510 --> 00:10:04.500 or who might have to learn them among engineers with whom you might work. 00:10:04.500 --> 00:10:08.810 Meanwhile, though, those languages out of the box, so to speak, 00:10:08.810 --> 00:10:13.796 don't necessarily make designing a web-based application easy or as easy 00:10:13.796 --> 00:10:14.420 as it could be. 00:10:14.420 --> 00:10:18.290 They don't necessarily make building a mobile application as easy as it could 00:10:18.290 --> 00:10:21.320 be, or even if it is relatively easy, humans 00:10:21.320 --> 00:10:25.370 have found over time that, gosh, every time I build a mobile application, 00:10:25.370 --> 00:10:28.850 I'm like copying and pasting dozens or hundreds of lines of code 00:10:28.850 --> 00:10:31.460 because they all share a common framework or maybe 00:10:31.460 --> 00:10:35.120 a common menu system or a common set of functionality. 
00:10:35.120 --> 00:10:39.260 And so in this world, too, have libraries of reusable code built 00:10:39.260 --> 00:10:43.700 up, and frameworks, libraries of code and methodologies via which you're 00:10:43.700 --> 00:10:46.430 building your applications, arisen. 00:10:46.430 --> 00:10:51.540 Among them Django, Flask, Laravel, .NET, Node.js, Rails, and the like, 00:10:51.540 --> 00:10:54.710 .NET being up there, too, because it generally refers to a set of languages 00:10:54.710 --> 00:10:59.880 you might use as well as the framework that oversees those various languages. 00:10:59.880 --> 00:11:03.080 And there are even more options ahead of you here. 00:11:03.080 --> 00:11:07.970 So how do you begin to pick among these options as well? 00:11:07.970 --> 00:11:12.680 Well, here, too, you're often guided by what your engineering team knows, 00:11:12.680 --> 00:11:15.800 perhaps what your own system administrators 00:11:15.800 --> 00:11:18.560 or your operational people know, so the folks who are actually 00:11:18.560 --> 00:11:21.050 maintaining the servers, whether they're locally on site, 00:11:21.050 --> 00:11:23.711 maybe they are the ones running things in the cloud, 00:11:23.711 --> 00:11:26.960 whether that's Amazon's or Google's or Microsoft's cloud or some other company 00:11:26.960 --> 00:11:30.440 still, depending on what that cloud infrastructure supports, 00:11:30.440 --> 00:11:34.820 might influence your decision making as to what language you might use. 00:11:34.820 --> 00:11:37.400 Sometimes the nature of your application might 00:11:37.400 --> 00:11:39.260 influence the language you might use. 
00:11:39.260 --> 00:11:42.500 For instance, some of these languages make it a little bit easier 00:11:42.500 --> 00:11:45.950 to make real-time applications, applications 00:11:45.950 --> 00:11:50.090 that support chat servers or immediate interactivity, 00:11:50.090 --> 00:11:53.330 where there's a constant connection or the illusion of a constant connection 00:11:53.330 --> 00:11:54.890 between browser and server. 00:11:54.890 --> 00:11:57.170 PHP doesn't really make that all that easy. 00:11:57.170 --> 00:12:00.170 You can do it, but it wasn't really designed with that use case in mind. 00:12:00.170 --> 00:12:04.040 By contrast, JavaScript, via a framework called Node.js, 00:12:04.040 --> 00:12:07.370 makes it really easy to do, and it was designed more so 00:12:07.370 --> 00:12:08.870 with that kind of use case in mind. 00:12:08.870 --> 00:12:13.100 And so here, too, you see hints of why some of these languages and in turn 00:12:13.100 --> 00:12:18.590 frameworks have arisen because they are actual solutions to concrete problems 00:12:18.590 --> 00:12:20.677 people have experienced in the past. 00:12:20.677 --> 00:12:22.760 And some of these languages are newer than others, 00:12:22.760 --> 00:12:25.140 and so they might come with more features 00:12:25.140 --> 00:12:27.980 so you don't have to rely as much on third-party libraries, 00:12:27.980 --> 00:12:32.420 some of them less vetted or maybe less robust or even less secure than what 00:12:32.420 --> 00:12:34.910 comes with the language itself. 00:12:34.910 --> 00:12:39.390 And recall, too, that these languages are often constantly evolving, 00:12:39.390 --> 00:12:42.710 some more quickly than others, but there are new versions 00:12:42.710 --> 00:12:44.210 of these languages coming out. 00:12:44.210 --> 00:12:47.630 And so even within the confines of a given language like Java 00:12:47.630 --> 00:12:51.620 might there be new and improved features every year, every couple of years. 
00:12:51.620 --> 00:12:54.230 And so sometimes actually picking a version 00:12:54.230 --> 00:12:59.220 of these languages or frameworks is one of the design decisions to bear. 00:12:59.220 --> 00:13:02.780 And I would say from a non-technical perspective, most compelling is just 00:13:02.780 --> 00:13:05.930 to be aware of these kinds of technologies, 00:13:05.930 --> 00:13:11.120 these kinds of buzzwords du jour, but also aware of these kinds of tradeoffs. 00:13:11.120 --> 00:13:13.010 It's not necessary to get into the weeds I 00:13:13.010 --> 00:13:16.010 think of understanding each and every language and what it's good for. 00:13:16.010 --> 00:13:18.470 Although once you have a general understanding 00:13:18.470 --> 00:13:21.920 of this world, of programming, of server side architecture, 00:13:21.920 --> 00:13:26.870 of HTTP, and web pages and the like, can you via Google and other websites 00:13:26.870 --> 00:13:29.480 I think start to wrap your mind around some of the tradeoffs 00:13:29.480 --> 00:13:32.210 and perhaps even start to tease apart which 00:13:32.210 --> 00:13:35.600 are technically compelling arguments that you might see online versus just 00:13:35.600 --> 00:13:38.990 religious objections to this language or that because that's often 00:13:38.990 --> 00:13:41.900 the case when folks get into heated discussions of language 00:13:41.900 --> 00:13:43.940 choices for instance. 
00:13:43.940 --> 00:13:47.780 But I think ultimately understanding the tradeoffs, the onboarding time 00:13:47.780 --> 00:13:52.130 or the learning curve for various languages, the appropriateness 00:13:52.130 --> 00:13:54.350 of a language for certain specific use cases 00:13:54.350 --> 00:13:59.180 like the real-time chat applications or whatever your own product happens 00:13:59.180 --> 00:14:02.400 to be, what your engineers already know, what they're good at, 00:14:02.400 --> 00:14:05.780 what they prefer to use, what language and framework is easiest 00:14:05.780 --> 00:14:10.490 for new hires, maybe six months hence or two years hence, to actually come on board 00:14:10.490 --> 00:14:14.540 and understand so that you're not expecting the most experienced 00:14:14.540 --> 00:14:16.924 of new hires to constantly be in your pipeline. 00:14:16.924 --> 00:14:20.090 So appreciating these kinds of tradeoffs and asking these kinds of questions 00:14:20.090 --> 00:14:23.600 even among the engineers that are perhaps making the decision ultimately 00:14:23.600 --> 00:14:26.060 is a valuable way to contribute to the conversation 00:14:26.060 --> 00:14:29.390 and take some comfort in the fact that your product need not 00:14:29.390 --> 00:14:31.310 be a complete black box. 00:14:31.310 --> 00:14:34.580 You might not necessarily be able to implement it from scratch yourself, 00:14:34.580 --> 00:14:36.620 but you can at least ask the right questions 00:14:36.620 --> 00:14:40.460 and be a sounding board for some of the answers that come back. 00:14:40.460 --> 00:14:43.760 Now, there are some fundamental differences 00:14:43.760 --> 00:14:47.570 in some of these architectural decisions, among which 00:14:47.570 --> 00:14:49.490 are around choice of database. 
00:14:49.490 --> 00:14:54.860 Indeed, most any web application today has a back end database inside 00:14:54.860 --> 00:14:59.030 of which is stored data from users, whether it's purchase orders or user 00:14:59.030 --> 00:15:01.487 registrations and passwords and any amount of data that's 00:15:01.487 --> 00:15:04.070 collected from users at the end of the day is stored somewhere 00:15:04.070 --> 00:15:06.260 and that somewhere is called a database. 00:15:06.260 --> 00:15:08.590 But there are different types of databases. 00:15:08.590 --> 00:15:14.930 Two of the biggest categories these days are perhaps SQL and the opposite, NoSQL, 00:15:14.930 --> 00:15:19.120 as it's playfully called, SQL being structured query language and NoSQL 00:15:19.120 --> 00:15:21.640 referring to a class of databases that doesn't support SQL 00:15:21.640 --> 00:15:23.920 and indeed is not generally relational. 00:15:23.920 --> 00:15:25.660 And we'll soon see what that means. 00:15:25.660 --> 00:15:29.020 But even within this world, do you have a veritable menu of options: 00:15:29.020 --> 00:15:32.950 MariaDB, MySQL, Oracle, PostgreSQL, SQL Server. 00:15:32.950 --> 00:15:34.825 And then within those-- 00:15:34.825 --> 00:15:37.867 within that relational world do you also have the cont-- 00:15:37.867 --> 00:15:39.700 in addition to that relational world, do you 00:15:39.700 --> 00:15:43.270 have the contrast of the object-oriented or document store 00:15:43.270 --> 00:15:47.770 world, things like Bigtable, Cassandra, HBase, MongoDB, and others. 00:15:47.770 --> 00:15:50.500 And already it can be sort of overwhelming to feel like just 00:15:50.500 --> 00:15:54.010 as you're getting up to speed on what the web is and how web pages work, 00:15:54.010 --> 00:15:55.240 oh my god! 00:15:55.240 --> 00:15:57.900 We're just beginning to make our decisions. 
00:15:57.900 --> 00:15:59.650 But generally these decisions, too, can be 00:15:59.650 --> 00:16:02.192 guided by what your team knows, what you're comfortable with, 00:16:02.192 --> 00:16:03.941 what the price might be for some of these. 00:16:03.941 --> 00:16:05.800 And some of these are free and open source. 00:16:05.800 --> 00:16:08.600 Some of them have commercial licenses associated with them. 00:16:08.600 --> 00:16:12.670 Some of them are supported easily for you with your cloud provider, 00:16:12.670 --> 00:16:15.920 wherever you're hosting your servers, or you might have to host them yourself. 00:16:15.920 --> 00:16:18.619 So you can begin to narrow the field of options. 00:16:18.619 --> 00:16:21.160 And indeed, especially when building multiple products, might 00:16:21.160 --> 00:16:23.410 you build on your own past experience. 00:16:23.410 --> 00:16:26.800 So, for instance, for the course and all of our web-based applications, 00:16:26.800 --> 00:16:30.160 we tend to use a lot of the same technologies 00:16:30.160 --> 00:16:33.610 and only recently have we begun to transition from one main language 00:16:33.610 --> 00:16:36.430 to another but doing it pretty much for all of our applications 00:16:36.430 --> 00:16:41.590 across the board so that we don't have to worry about some of the team members 00:16:41.590 --> 00:16:42.980 knowing x and y and z. 00:16:42.980 --> 00:16:46.450 It's just there's an economy of scale to focusing on relatively 00:16:46.450 --> 00:16:48.700 fewer technologies internally. 00:16:48.700 --> 00:16:53.920 But let's dive in a little deeper into SQL and NoSQL if only because they're 00:16:53.920 --> 00:16:57.740 so cleanly bucketized into SQL and not-SQL really. 00:16:57.740 --> 00:17:00.700 What do we mean by this, and what does a database really do? 
00:17:00.700 --> 00:17:05.589 So a database typically supports these-- at least these four operations 00:17:05.589 --> 00:17:07.990 or categories of operations playfully called 00:17:07.990 --> 00:17:14.400 CRUD, which stands for create, read, update, and delete, 00:17:14.400 --> 00:17:17.200 though you might see some variations on what the actual words are. 00:17:17.200 --> 00:17:21.099 But CRUD refers to those four fundamental operations. 00:17:21.099 --> 00:17:25.599 Now, in the world of SQL, or S-Q-L, structured query language, 00:17:25.599 --> 00:17:27.970 which itself is a programming language, and it's 00:17:27.970 --> 00:17:33.310 a programming language you use to query a database, to add data to a database, 00:17:33.310 --> 00:17:35.740 remove it, edit it, and the like. 00:17:35.740 --> 00:17:39.190 Within the world of SQL, there are-- 00:17:39.190 --> 00:17:41.620 is a direct mapping of these four operations 00:17:41.620 --> 00:17:45.160 to four keywords, or features, of the SQL language, 00:17:45.160 --> 00:17:48.280 namely create, select, update, and delete. 00:17:48.280 --> 00:17:51.250 So it's almost CRUD, but it doesn't quite line up perfectly. 00:17:51.250 --> 00:17:55.720 So create, read, update, delete is the general notion of the four operations 00:17:55.720 --> 00:17:59.320 a database might support, and in the world of SQL, which we're about to dive into, 00:17:59.320 --> 00:18:04.120 might you see these four commands specifically: create, select, update, 00:18:04.120 --> 00:18:05.290 and delete. 00:18:05.290 --> 00:18:08.410 So what does it mean to be a SQL database, or more 00:18:08.410 --> 00:18:11.110 generally, what does it mean to be a relational database? 00:18:11.110 --> 00:18:14.350 Because a relational database is often historically 00:18:14.350 --> 00:18:17.680 what people think of when they think of databases and only in recent years 00:18:17.680 --> 00:18:20.980 has this NoSQL trend been catching up that changes the paradigm. 
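To make the CRUD-to-SQL mapping just described a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name `users`, its columns, and the sample values are all hypothetical, chosen only for illustration. Note one assumption in how the mapping is drawn: in most SQL dialects, CREATE makes the table itself, while new rows are typically added with INSERT, which is part of why the mapping to CRUD doesn't line up perfectly.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# CREATE: define a (hypothetical) table of rows and columns.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

# The "create" in CRUD for a single row is usually an INSERT.
conn.execute(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    ("Alice", "alice@example.com"),
)

# Read: SELECT returns the matching rows.
rows_after_insert = conn.execute("SELECT name, email FROM users").fetchall()

# Update: change an existing row in place.
conn.execute(
    "UPDATE users SET email = ? WHERE name = ?",
    ("alice@example.org", "Alice"),
)
rows_after_update = conn.execute("SELECT name, email FROM users").fetchall()

# Delete: remove the row altogether.
conn.execute("DELETE FROM users WHERE name = ?", ("Alice",))
rows_after_delete = conn.execute("SELECT name, email FROM users").fetchall()

print(rows_after_insert, rows_after_update, rows_after_delete)
```

The `?` placeholders let the database module substitute the values safely rather than pasting user input directly into the SQL string.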
00:18:20.980 --> 00:18:24.230 And we'll look at the flip side in just a moment. 00:18:24.230 --> 00:18:27.310 So if you've ever seen this, whether it's this version of Excel 00:18:27.310 --> 00:18:31.990 or some equivalent version of Numbers or Google Spreadsheets or the like, 00:18:31.990 --> 00:18:35.990 this is kind of a relational database. 00:18:35.990 --> 00:18:39.160 It is a piece of software that allows you to lay out 00:18:39.160 --> 00:18:41.650 your data in rows and columns. 00:18:41.650 --> 00:18:45.850 And among those rows and columns are there typically relationships. 00:18:45.850 --> 00:18:51.010 Consider after all the last time you used a spreadsheet, if ever, odds 00:18:51.010 --> 00:18:54.040 are there was some kind of meaning if you put data 00:18:54.040 --> 00:18:59.150 in column A versus B versus C versus D. In other words, 00:18:59.150 --> 00:19:02.500 when adding data to a spreadsheet, typically if you're using it correctly, 00:19:02.500 --> 00:19:05.740 you don't just start plopping your data in any random box that doesn't yet 00:19:05.740 --> 00:19:08.290 have a number or a word in it. 00:19:08.290 --> 00:19:11.500 You generally organize the data such that in column A 00:19:11.500 --> 00:19:13.570 might be one type of data, column B might 00:19:13.570 --> 00:19:16.340 be another type of data, and so forth. 00:19:16.340 --> 00:19:20.890 And so it's relational in the sense that the numbers and the data within 00:19:20.890 --> 00:19:22.090 relate to one another. 00:19:22.090 --> 00:19:25.720 And it's also relational in the sense that you can have multiple sheets 00:19:25.720 --> 00:19:26.700 even within a file. 00:19:26.700 --> 00:19:30.250 So by default with Excel and Numbers and Google Sheets do you get just one sheet 00:19:30.250 --> 00:19:33.190 or worksheet by default. 
But if you ever want 00:19:33.190 --> 00:19:36.400 to have multiple types of data but all in the same file 00:19:36.400 --> 00:19:39.250 just because it's kind of nice and orderly to keep it all together, 00:19:39.250 --> 00:19:42.310 you can click the plus and create a new sheet 00:19:42.310 --> 00:19:45.610 and have a completely different tabular structure, a different number of rows 00:19:45.610 --> 00:19:48.160 and columns and different meanings for those columns 00:19:48.160 --> 00:19:51.520 but somehow the data is all related, ergo 00:19:51.520 --> 00:19:53.390 this notion of a relational database. 00:19:53.390 --> 00:19:56.800 So a relational database stores data in tables, 00:19:56.800 --> 00:20:01.670 and a table is in turn a set of rows and columns. 00:20:01.670 --> 00:20:04.700 So why does this actually matter? 00:20:04.700 --> 00:20:09.880 Well, Excel is not all that powerful when it comes to large datasets. 00:20:09.880 --> 00:20:13.540 In fact, it wasn't all that long ago that Excel I believe only 00:20:13.540 --> 00:20:22.250 supported as many as 65,536 or 35 rows probably, 00:20:22.250 --> 00:20:24.000 and that's because, long story short, they 00:20:24.000 --> 00:20:28.750 used the 16-bit integer, the biggest number for which is 65,535, 00:20:28.750 --> 00:20:35.650 and so Excel physically couldn't count as high as 65,536 or 7 or 8 00:20:35.650 --> 00:20:38.530 because they just didn't use enough storage underneath the hood. 00:20:38.530 --> 00:20:42.220 But even if you had that much data, and that's quite a lot of rows, 00:20:42.220 --> 00:20:44.110 the software just tended to be super slow 00:20:44.110 --> 00:20:46.002 at least in my own experience back in the day 00:20:46.002 --> 00:20:47.710 trying to manipulate very large datasets. 00:20:47.710 --> 00:20:51.700 And Excel just wasn't designed for tens of thousands of rows. 
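The 16-bit arithmetic mentioned above can be checked directly. This is only a sketch of the arithmetic, not a claim about how any spreadsheet program is actually implemented; the helper name `fits_in_16_bits` is made up for illustration. Sixteen bits can represent 65,536 distinct values, which as an unsigned count run from 0 through 65,535, and that off-by-one may be why the two figures get mixed up.

```python
BITS = 16

# Number of distinct bit patterns, and the largest unsigned value they can hold.
distinct_values = 2**BITS
max_value = 2**BITS - 1


def fits_in_16_bits(n: int) -> bool:
    """Return True if the non-negative integer n can be stored in 2 bytes."""
    try:
        n.to_bytes(2, byteorder="big")  # raises OverflowError if n is too big
        return True
    except OverflowError:
        return False


print(distinct_values)             # 65536
print(max_value)                   # 65535
print(fits_in_16_bits(max_value))  # True
print(fits_in_16_bits(max_value + 1))  # False
```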
00:20:51.700 --> 00:20:53.740 By the time you're at that much data, you 00:20:53.740 --> 00:20:58.240 should really be graduating, so to speak, to an actual relational database 00:20:58.240 --> 00:21:03.220 management system, a server-driven database that actually leverages not 00:21:03.220 --> 00:21:06.820 just files but memory more effectively. 00:21:06.820 --> 00:21:09.070 In fact, what-- among the features you get 00:21:09.070 --> 00:21:13.810 from products like MariaDB and MySQL and Oracle and Postgres and the like 00:21:13.810 --> 00:21:17.950 is you get really smart people who have implemented the software in such a way 00:21:17.950 --> 00:21:22.750 that it makes your creates and your reads and your updates and your deletes 00:21:22.750 --> 00:21:26.770 faster than they might be if you were just storing all of your data 00:21:26.770 --> 00:21:27.910 in a big file. 00:21:27.910 --> 00:21:30.790 For instance, in a big file like Excel, if you 00:21:30.790 --> 00:21:33.880 want to search for some information, you can hit Command F or Control F 00:21:33.880 --> 00:21:38.230 or whatever, type in a keyword, and then Excel or Numbers or Sheets 00:21:38.230 --> 00:21:39.520 will search for it. 00:21:39.520 --> 00:21:43.600 But generally these spreadsheet programs are going to search for your data 00:21:43.600 --> 00:21:46.100 pretty much by a brute force, search top to bottom, 00:21:46.100 --> 00:21:48.782 left to right, looking in every darn cell for that data. 00:21:48.782 --> 00:21:50.990 That's fine if you've got a pretty small spreadsheet. 00:21:50.990 --> 00:21:53.230 We slow humans aren't going to notice the difference. 00:21:53.230 --> 00:21:56.710 But in the context of really big datasets, tens of thousands 00:21:56.710 --> 00:21:59.500 of rows, let alone millions or billions, it 00:21:59.500 --> 00:22:01.660 does not suffice to look at every piece of data 00:22:01.660 --> 00:22:07.150 when looking for a certain phrase or a certain number or some such value. 
00:22:07.150 --> 00:22:13.030 You want the database itself to do some anticipatory optimization, sort 00:22:13.030 --> 00:22:16.600 of working its magic underneath the hood using various algorithms and data 00:22:16.600 --> 00:22:19.570 structures, so as to optimize those queries 00:22:19.570 --> 00:22:22.360 and to give me answers in logarithmic time, 00:22:22.360 --> 00:22:28.070 not linear time, or time that's faster than searching the whole darn thing. 00:22:28.070 --> 00:22:31.540 So at some point, spreadsheet software does not cut it, 00:22:31.540 --> 00:22:35.020 and you transition to a more proper relational database. 00:22:35.020 --> 00:22:38.050 But moreover at that point, you have to start 00:22:38.050 --> 00:22:41.350 deciding how you want the database to store your data. 00:22:41.350 --> 00:22:43.750 Because at the end of the day, we humans generally 00:22:43.750 --> 00:22:48.550 know a little more about the programs we write about the data 00:22:48.550 --> 00:22:50.530 we're going to be storing. 00:22:50.530 --> 00:22:54.820 And by this I mean, if I am storing a bunch of data in a database, 00:22:54.820 --> 00:22:59.072 I probably know better than the computer might know which of these values 00:22:59.072 --> 00:23:01.780 is always going to be like an integer or which of these is always 00:23:01.780 --> 00:23:04.330 going to be a dollar amount or which phrase is always 00:23:04.330 --> 00:23:07.660 going to look like a time of day or a date or day 00:23:07.660 --> 00:23:09.970 of the week or some other such value. 00:23:09.970 --> 00:23:13.870 And so we humans can actually help databases help us 00:23:13.870 --> 00:23:17.750 be more highly performing by providing them with hints, 00:23:17.750 --> 00:23:22.480 otherwise known as data types, that tell the database what type of data to store 00:23:22.480 --> 00:23:26.380 and therefore how to store it most efficiently. 
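As a rough illustration of the linear versus logarithmic time just described, the Python sketch below counts how many values each approach has to look at. It assumes the data is already sorted, which is roughly what a database index arranges for you; real database indexes use more elaborate structures, so treat this as a toy model with hypothetical function names.

```python
def linear_search(values, target):
    """Brute force: look at every value in order. Returns (index, steps)."""
    steps = 0
    for i, value in enumerate(values):
        steps += 1
        if value == target:
            return i, steps
    return -1, steps


def binary_search(sorted_values, target):
    """Halve the remaining range on every step. Returns (index, steps)."""
    lo, hi = 0, len(sorted_values) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_values[mid] == target:
            return mid, steps
        if sorted_values[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


rows = list(range(1_000_000))          # one million sorted values
linear_index, linear_steps = linear_search(rows, 999_999)
binary_index, binary_steps = binary_search(rows, 999_999)
print(linear_steps, binary_steps)
```

Looking for the last of a million values, the brute-force search examines all one million of them, while the halving search needs at most about twenty looks.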
00:23:26.380 --> 00:23:30.110 Some of those popular data types in the world of SQL then are these, 00:23:30.110 --> 00:23:32.020 and let's just take a look at a few of these. 00:23:32.020 --> 00:23:33.730 So char and varchar. 00:23:33.730 --> 00:23:36.820 So char being shorthand for character, and it's not 00:23:36.820 --> 00:23:39.690 a single character like a or b or c. 00:23:39.690 --> 00:23:43.510 Character, or char, generally refers to a column 00:23:43.510 --> 00:23:48.040 in a database that is going to store one or more 00:23:48.040 --> 00:23:51.320 characters a little confusingly, a string, so to speak, 00:23:51.320 --> 00:23:54.380 where a string is a sequence of 0 or more characters. 00:23:54.380 --> 00:23:57.340 So when designing a database column that you know 00:23:57.340 --> 00:24:01.210 is going to contain a word or a sentence or even a paragraph, 00:24:01.210 --> 00:24:03.250 you can tell the database, hey, database, 00:24:03.250 --> 00:24:07.690 make this column this many characters wide, i.e. 00:24:07.690 --> 00:24:10.000 allocate that much data upfront. 00:24:10.000 --> 00:24:12.520 But if you're not sure, as might often be the case-- 00:24:12.520 --> 00:24:14.230 maybe someone has a short name. 00:24:14.230 --> 00:24:16.300 Maybe someone has a long name. 00:24:16.300 --> 00:24:18.970 Maybe someone has a long address or a short address. 00:24:18.970 --> 00:24:21.760 If you don't really know what the right length is 00:24:21.760 --> 00:24:25.240 for a column for the values a user is going to provide, 00:24:25.240 --> 00:24:29.980 you can instead use varchar for variable length character strings, which is 00:24:29.980 --> 00:24:32.570 to say you specify only an upper bound. 00:24:32.570 --> 00:24:36.680 So I don't know what the longest name is in the whole world. 00:24:36.680 --> 00:24:40.660 But my name is D-a-v-i-d, five feels like it's kind of short. 00:24:40.660 --> 00:24:43.000 Probably some people with longer names in the world. 
00:24:43.000 --> 00:24:44.079 20, is that enough? 00:24:44.079 --> 00:24:44.620 I don't know? 00:24:44.620 --> 00:24:45.340 50? 00:24:45.340 --> 00:24:46.280 I don't know, 100? 00:24:46.280 --> 00:24:47.222 Probably. 00:24:47.222 --> 00:24:49.930 I should probably Google to find out with a bit more reassurance, 00:24:49.930 --> 00:24:55.410 but this is a decision that the web designer or the database designer 00:24:55.410 --> 00:24:56.512 is going to have to make. 00:24:56.512 --> 00:24:58.720 You can't just tell, and you don't want to just tell, 00:24:58.720 --> 00:25:04.309 the database accept any length string because the more flexible 00:25:04.309 --> 00:25:07.600 you expect the database to be, the more generous you expect the database to be, 00:25:07.600 --> 00:25:10.450 the less optimization it can do for you. 00:25:10.450 --> 00:25:14.980 By contrast, the more precise you can be, the more conservative you can be, 00:25:14.980 --> 00:25:19.080 the more optimization algorithmically the database can 00:25:19.080 --> 00:25:23.010 do so that when you ask for data back, it can give you those answers faster. 00:25:23.010 --> 00:25:25.480 When you insert data, it can insert it faster. 00:25:25.480 --> 00:25:28.350 So the more helpful we humans can be with our databases, 00:25:28.350 --> 00:25:30.160 the more help the database can be in turn, 00:25:30.160 --> 00:25:33.368 and that's probably a good thing when we have lots and lots of data and users 00:25:33.368 --> 00:25:38.130 because we want the common case to be highly performing. 00:25:38.130 --> 00:25:41.790 It might cost me a minute, five minutes upfront to really noodle on the problem 00:25:41.790 --> 00:25:43.560 and figure out what the best design is. 00:25:43.560 --> 00:25:46.830 But that cost is going to be amortized over thousands 00:25:46.830 --> 00:25:50.430 of users, millions of users, who are then benefiting thereafter 00:25:50.430 --> 00:25:52.680 from a better database design. 
00:25:52.680 --> 00:25:54.320 So where is the line to be drawn? 00:25:54.320 --> 00:25:58.260 We'll explore this in the context of an example, but it kind of depends. 00:25:58.260 --> 00:26:00.900 There is no right answer necessarily. 00:26:00.900 --> 00:26:05.250 It really depends on your use case and the data you're trying to store. 00:26:05.250 --> 00:26:08.660 With numbers, too, do you have some discretion. 00:26:08.660 --> 00:26:11.542 Integer means what it is, generally a 32-bit value, which 00:26:11.542 --> 00:26:13.750 means you can have a number from negative two billion 00:26:13.750 --> 00:26:16.070 to positive two billion, give or take. 00:26:16.070 --> 00:26:17.210 But that might be overkill. 00:26:17.210 --> 00:26:19.460 If you know you're dealing with really small integers, 00:26:19.460 --> 00:26:21.500 maybe you don't need 32 bits. 00:26:21.500 --> 00:26:25.094 Maybe you want fewer and so you might just say, small int. 00:26:25.094 --> 00:26:26.510 Doesn't need to be that many bits. 00:26:26.510 --> 00:26:28.534 I know my values aren't going to get that large. 00:26:28.534 --> 00:26:30.200 I might as well save the database space. 00:26:30.200 --> 00:26:32.210 Or by contrast, wait a minute. 00:26:32.210 --> 00:26:36.560 Going to have more than two billion users, success permitting. 00:26:36.560 --> 00:26:39.790 I'm going to, therefore, want to use a big int like 64 bits, 00:26:39.790 --> 00:26:45.170 so I can have many, many, many, many users or rows in my database. 00:26:45.170 --> 00:26:47.930 And indeed some of the most popular websites out there 00:26:47.930 --> 00:26:49.142 have run into this issue. 00:26:49.142 --> 00:26:51.350 The Facebooks, YouTubes, and the others of the world, 00:26:51.350 --> 00:26:53.600 well, they just have so much darn data, they 00:26:53.600 --> 00:26:58.180 had better not cap the number of rows in their database table 00:26:58.180 --> 00:27:03.060 at only two billion because they might well have that many and more. 
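The "two billion, give or take" figure above falls out of the number of bits. Here is a small Python sketch, assuming the common widths of 16, 32, and 64 bits for small int, integer, and big int; exact widths can vary by database, so check your own system's documentation.

```python
def signed_range(bits):
    """Smallest and largest values a signed integer of this width can hold."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Assumed widths; these are typical but not guaranteed for every database.
for type_name, bits in [("SMALLINT", 16), ("INTEGER", 32), ("BIGINT", 64)]:
    low, high = signed_range(bits)
    print(f"{type_name:8} {bits:2} bits: {low:,} to {high:,}")
```

A 32-bit integer tops out at 2,147,483,647, which is the ceiling a very large site would hit if it numbered its rows with a plain integer.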
00:27:03.060 --> 00:27:05.820 Then why not just choose varchar with a really big number? 00:27:05.820 --> 00:27:09.600 Why not choose big int with a really big number of bits? 00:27:09.600 --> 00:27:12.090 Well, it's wasteful. 00:27:12.090 --> 00:27:15.660 You shouldn't over-allocate because then you're just spending more space, 00:27:15.660 --> 00:27:18.210 and space costs money and might even cost time 00:27:18.210 --> 00:27:20.560 to search if there are more bits to be looked at. 00:27:20.560 --> 00:27:24.600 And so you don't necessarily want to just cop out and say, use as much space 00:27:24.600 --> 00:27:29.340 as you want or as is necessary because, again, we can't be as helpful therefore 00:27:29.340 --> 00:27:31.210 to the database. 00:27:31.210 --> 00:27:35.400 Now, numbers are an interesting one, and this is true in programming languages 00:27:35.400 --> 00:27:39.630 whether it's SQL or C or C++ or yet others. 00:27:39.630 --> 00:27:44.460 It turns out that choosing how big your data is, or anticipating it, 00:27:44.460 --> 00:27:47.730 has some real impact in some cases of numbers. 00:27:47.730 --> 00:27:54.900 So it turns out that an integer, of course, is just a number like negative 1, 0, 1, 00:27:54.900 --> 00:27:57.100 and on up in both directions. 00:27:57.100 --> 00:28:00.690 But a floating point value or float is a real number, 00:28:00.690 --> 00:28:03.840 a number that's not necessarily an integer, but a real number that 00:28:03.840 --> 00:28:07.500 has a decimal point and some number of digits after that decimal point 00:28:07.500 --> 00:28:11.190 that may or may not be representable precisely as a fraction. 00:28:11.190 --> 00:28:14.670 So that's a real number or a float. 00:28:14.670 --> 00:28:16.637 If you want more bits or precision than that, 00:28:16.637 --> 00:28:19.470 you can actually specify double precision, which gives you more bits 00:28:19.470 --> 00:28:22.870 and therefore you can have even more digits after the decimal point.
00:28:22.870 --> 00:28:26.020 But the key takeaway here is that at the end of the day, 00:28:26.020 --> 00:28:29.050 it's going to be finite if you're representing a number. 00:28:29.050 --> 00:28:32.940 And that's so even if you use something like a float, even a double precision float, 00:28:32.940 --> 00:28:35.340 which gives you more bits of precision. 00:28:35.340 --> 00:28:38.700 At the end of the day, last I recall from grade school, 00:28:38.700 --> 00:28:43.050 there is an infinite number of numbers in the world, both integers 00:28:43.050 --> 00:28:44.620 and real numbers for that matter. 00:28:44.620 --> 00:28:48.960 So in both the case of these integer-based numbers and these floating point 00:28:48.960 --> 00:28:52.530 values, you can only count so high, or you can only 00:28:52.530 --> 00:28:54.900 specify a number so precisely. 00:28:54.900 --> 00:28:58.020 And at the end of the day, you might have some overflow 00:28:58.020 --> 00:29:01.860 where you just can't represent bigger numbers, whether positive or negative, 00:29:01.860 --> 00:29:05.040 or you just can't represent enough decimal places-- 00:29:05.040 --> 00:29:09.330 enough numbers after the decimal point to represent a number 00:29:09.330 --> 00:29:11.980 perfectly accurately. 00:29:11.980 --> 00:29:13.290 And so there's this tradeoff. 00:29:13.290 --> 00:29:16.300 You might want more and more space, but at some point, 00:29:16.300 --> 00:29:18.599 you can't have an infinite amount of space. 00:29:18.599 --> 00:29:19.890 Computers are physical devices. 00:29:19.890 --> 00:29:23.040 They only have a physical amount of memory inside of them. 00:29:23.040 --> 00:29:24.580 You might have to draw a line.
00:29:24.580 --> 00:29:27.840 And so you may have seen some older movies like Superman 3, 00:29:27.840 --> 00:29:31.260 which has a great incarnation of this, or, somewhat more recently, 00:29:31.260 --> 00:29:35.580 Office Space, where there's a money-making scam whereby the companies in question, 00:29:35.580 --> 00:29:41.182 long story short, were constantly manipulating monetary amounts 00:29:41.182 --> 00:29:43.140 in their database systems, but they were always 00:29:43.140 --> 00:29:45.090 rounding off fractions of pennies. 00:29:45.090 --> 00:29:48.540 And so the masterminds in both movies started 00:29:48.540 --> 00:29:51.040 pocketing all of those fractions of pennies, 00:29:51.040 --> 00:29:53.610 but hilariousness ensues when they don't quite 00:29:53.610 --> 00:29:56.070 realize how much those fractions of pennies add up. 00:29:56.070 --> 00:29:58.275 But that too is an issue of imprecision. 00:29:58.275 --> 00:30:02.610 We in the human world generally, when going to stores and such, 00:30:02.610 --> 00:30:05.650 use only two decimal places of precision. 00:30:05.650 --> 00:30:07.830 But investment banks and banks more generally 00:30:07.830 --> 00:30:10.140 might actually use more decimal places-- more numbers 00:30:10.140 --> 00:30:12.030 after the decimal point than that. 00:30:12.030 --> 00:30:16.740 And so having the ability to express numbers more precisely is compelling. 00:30:16.740 --> 00:30:20.460 Thankfully, there does exist Decimal, which 00:30:20.460 --> 00:30:24.060 allows you to specify how many digits maximally you essentially 00:30:24.060 --> 00:30:29.250 want before and after the decimal point or the total number in question. 00:30:29.250 --> 00:30:31.480 And so that would be an alternative to these others. 00:30:31.480 --> 00:30:34.380 But it might end up then costing you more space just 00:30:34.380 --> 00:30:35.800 to get that additional precision.
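The imprecision of floats, and the appeal of a decimal type for money, can be seen in a few lines of Python. This sketch uses Python's own `decimal` module as a stand-in; it is analogous to, but not the same thing as, a SQL Decimal column, and the amounts are made up for illustration.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so the sum is very slightly off from 0.3.
float_total = 0.1 + 0.2
float_is_exact = (float_total == 0.3)

# Decimal stores base-10 digits, so cents add up the way humans expect.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = (decimal_total == Decimal("0.30"))

print(float_total, float_is_exact)
print(decimal_total, decimal_is_exact)
```

The float comparison comes out false while the decimal comparison comes out true, which is the kind of fraction-of-a-penny discrepancy the movies play on.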
00:30:35.800 --> 00:30:38.010 So here, too, as with the decisions around frameworks 00:30:38.010 --> 00:30:40.710 and libraries and languages, here, too, there's a tradeoff, 00:30:40.710 --> 00:30:44.310 even at this lower level, when you really get into it, deciding 00:30:44.310 --> 00:30:46.890 how to store your data in a database. 00:30:46.890 --> 00:30:51.630 Lastly, and a little more easily, there are data types like Date and Time 00:30:51.630 --> 00:30:54.210 and Timestamp, which do as they say. 00:30:54.210 --> 00:30:55.230 They look like dates. 00:30:55.230 --> 00:30:56.105 They look like times. 00:30:56.105 --> 00:30:58.230 They look like timestamps, just some counter 00:30:58.230 --> 00:31:00.540 from some preordained moment in time. 00:31:00.540 --> 00:31:02.460 And these data types are commonly used, as you 00:31:02.460 --> 00:31:05.580 might guess, to store these types of data in a database. 00:31:05.580 --> 00:31:07.137 When did the user last log in? 00:31:07.137 --> 00:31:08.970 When did he or she register for the website? 00:31:08.970 --> 00:31:12.600 When did he or she buy something from our catalog or the like? 00:31:12.600 --> 00:31:15.040 You can represent those and more data types 00:31:15.040 --> 00:31:19.500 in a standard relational database that supports SQL. 00:31:19.500 --> 00:31:22.470 But you have some other options, too. 00:31:22.470 --> 00:31:24.570 It turns out that in a relational database, 00:31:24.570 --> 00:31:28.770 you can be even more helpful to the database by telling it in advance 00:31:28.770 --> 00:31:31.170 if any of your columns should be considered 00:31:31.170 --> 00:31:35.590 a primary key or a foreign key or a unique constraint. 00:31:35.590 --> 00:31:36.910 Now, what does this mean?
00:31:36.910 --> 00:31:41.400 Well, typically with data, it is useful to be 00:31:41.400 --> 00:31:47.984 able to uniquely identify a row in your table in your spreadsheets 00:31:47.984 --> 00:31:49.650 without having to look at the whole row. 00:31:49.650 --> 00:31:52.680 For instance, when using Excel or Numbers or Google Sheets, 00:31:52.680 --> 00:31:54.930 you'll notice that by default, all of the rows 00:31:54.930 --> 00:31:57.360 are just numbered 1 through whatever. 00:31:57.360 --> 00:32:01.040 That's useful because if you are collaborating with someone or you 00:32:01.040 --> 00:32:03.050 yourself are just trying to find some value, 00:32:03.050 --> 00:32:08.030 you could just jump ahead to like row 50 to identify the 50th row of your data. 00:32:08.030 --> 00:32:10.640 You don't have to look for a specific name 00:32:10.640 --> 00:32:13.490 or address or purchase order or whatever it 00:32:13.490 --> 00:32:15.440 is that you're storing in this table. 00:32:15.440 --> 00:32:17.720 You can just jump to the row number in question. 00:32:17.720 --> 00:32:20.540 A relational database very often takes the same approach, 00:32:20.540 --> 00:32:23.690 using some piece of data, usually just an integer 1, 2, 3, 4, 00:32:23.690 --> 00:32:26.870 just like the spreadsheet programs, to uniquely identify 00:32:26.870 --> 00:32:29.450 the rows so that you can access them very 00:32:29.450 --> 00:32:33.020 quickly via that number or that index. 00:32:33.020 --> 00:32:36.350 A foreign key, we'll see, is a notion of a piece 00:32:36.350 --> 00:32:39.830 of data that exists in two separate tables-- 00:32:39.830 --> 00:32:42.276 two sheets where there's an interrelationship 00:32:42.276 --> 00:32:44.150 but more on that kind of example in a moment. 00:32:44.150 --> 00:32:48.620 And a unique key, a unique column, is one where 00:32:48.620 --> 00:32:50.660 you should not see any duplicates. 
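Here is a minimal sketch of a primary key and a unique column, using Python's built-in `sqlite3` module with an in-memory database. The table name, column names, and email addresses are hypothetical, and SQLite is just a convenient stand-in for whichever relational database you might actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway database, nothing saved to disk

# id is the primary key (SQLite numbers it 1, 2, 3, ... automatically);
# email is declared UNIQUE, so duplicates are refused by the database itself.
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL UNIQUE)"
)
conn.execute("INSERT INTO users (email) VALUES (?)", ("alice@example.com",))
conn.execute("INSERT INTO users (email) VALUES (?)", ("bob@example.com",))

# A second registration with an existing email violates the constraint.
try:
    conn.execute("INSERT INTO users (email) VALUES (?)", ("alice@example.com",))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Jump straight to a row by its primary key, like jumping to a row number.
row = conn.execute("SELECT id, email FROM users WHERE id = ?", (2,)).fetchone()
print(duplicate_rejected, row)
```

The point of the design is that the database, not your application code, is the one refusing the duplicate.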
00:32:50.660 --> 00:32:53.870 So, for instance, maybe when building a website that has users 00:32:53.870 --> 00:32:56.930 register for your website, if you want to ensure 00:32:56.930 --> 00:33:01.850 that no user can have the same email address as another, 00:33:01.850 --> 00:33:04.220 you can specify to your database, hey, database, make 00:33:04.220 --> 00:33:08.330 sure that Mayland@Harvard.edu, or whatever the user's email address is, 00:33:08.330 --> 00:33:11.720 only appears once in a column in my database. 00:33:11.720 --> 00:33:18.660 Don't let David or not-David register with that same email address. 00:33:18.660 --> 00:33:21.320 And so this is a useful way to ensure that you 00:33:21.320 --> 00:33:25.400 have correct behavior of your website and you have integrity of your data 00:33:25.400 --> 00:33:28.520 so that you don't accidentally have duplicate values, which 00:33:28.520 --> 00:33:30.320 would lead potentially to ambiguity. 00:33:30.320 --> 00:33:34.800 And there's even more features you might get from a typical database. 00:33:34.800 --> 00:33:38.540 So let's indeed now try an example whereby we decide 00:33:38.540 --> 00:33:41.210 how best to store data in my database. 00:33:41.210 --> 00:33:44.560 But to simulate my database I'm going to quite simply just use Excel here. 00:33:44.560 --> 00:33:46.310 I could use Apple Numbers or Google Sheets 00:33:46.310 --> 00:33:48.860 or the like or any spreadsheet program, but at the end of the day 00:33:48.860 --> 00:33:51.890 I'm really just using this because it's a program with rows and columns. 00:33:51.890 --> 00:33:56.510 In reality, if I am a business owner and I have a web-based store 00:33:56.510 --> 00:34:00.341 and I sell widgets and sprockets on my store, 00:34:00.341 --> 00:34:02.090 the reality is I want to keep track of who 00:34:02.090 --> 00:34:05.270 has bought what so I know what my revenue is, 00:34:05.270 --> 00:34:08.000 so I know to whom I need to ship things, and so forth. 
00:34:08.000 --> 00:34:10.430 And so I'm going to pretend to be the database 00:34:10.430 --> 00:34:14.510 here so that we can walk through an example where we design this database 00:34:14.510 --> 00:34:17.840 but realize that the actual data that's being inputted by the user 00:34:17.840 --> 00:34:23.030 into my website's front end, the HTML, JavaScript, and CSS user interface, 00:34:23.030 --> 00:34:26.840 is going to get sent to the server, as by an HTML form, 00:34:26.840 --> 00:34:31.699 where my back end language, whether it's Python or PHP or Java or Ruby 00:34:31.699 --> 00:34:33.710 or the like with some framework probably, 00:34:33.710 --> 00:34:37.080 is going to be ultimately storing it in a database. 00:34:37.080 --> 00:34:40.909 And that database in turn might be MySQL or MariaDB or Oracle or Postgres 00:34:40.909 --> 00:34:42.469 or something else. 00:34:42.469 --> 00:34:46.610 We're just going to focus on what any of those databases might-- 00:34:46.610 --> 00:34:50.630 how any of those databases might potentially store the information. 00:34:50.630 --> 00:34:52.844 So someone has just submitted a form on our website. 00:34:52.844 --> 00:34:55.219 They've given their credit card information and the like, 00:34:55.219 --> 00:34:58.250 and therefore it is time for my website to store 00:34:58.250 --> 00:34:59.630 this information in a database. 00:34:59.630 --> 00:35:00.730 What am I going to store? 00:35:00.730 --> 00:35:05.420 Well, if they've bought a widget, I might type in widgets, quantity 1, 00:35:05.420 --> 00:35:09.530 and maybe it was Zamyla Chan who bought this widget, 00:35:09.530 --> 00:35:13.760 and she is at the CS Building at 33 Oxford Street. 00:35:13.760 --> 00:35:19.370 And that's in Cambridge, and that's in Massachusetts in 02138, USA. 00:35:19.370 --> 00:35:23.210 So here is some information therefore that I might store. 00:35:23.210 --> 00:35:25.190 This is good because I know to whom to ship it.
00:35:25.190 --> 00:35:27.500 I know how many widgets I have sold. 00:35:27.500 --> 00:35:30.350 And maybe the price should be in there as well. 00:35:30.350 --> 00:35:35.630 So she paid maybe $9.99 for this widget. 00:35:35.630 --> 00:35:37.430 All right, now, let's fast forward in time, 00:35:37.430 --> 00:35:39.846 and let's assume that someone else has visited my website, 00:35:39.846 --> 00:35:43.530 and they too have decided to buy a widget but multiple widgets. 00:35:43.530 --> 00:35:45.050 We upsold them. 00:35:45.050 --> 00:35:47.550 So a widget was bought, quantity 2. 00:35:47.550 --> 00:35:53.720 This is, say, Rob Bouden also in 33 Oxford Street, Cambridge, Mass, 02138, 00:35:53.720 --> 00:35:59.630 USA, and this one was a total of $19.98 since he bought two of them. 00:35:59.630 --> 00:36:03.200 So I've just been storing this data in sort of freeform format 00:36:03.200 --> 00:36:06.360 but each of these columns clearly has meaning. 00:36:06.360 --> 00:36:10.430 So maybe this first column should really be called Product. 00:36:13.750 --> 00:36:16.020 This one should be called Quantity. 00:36:16.020 --> 00:36:18.090 This one could be called Name. 00:36:18.090 --> 00:36:19.890 This is maybe Street. 00:36:19.890 --> 00:36:21.360 This is maybe City. 00:36:21.360 --> 00:36:23.490 This is maybe State. 00:36:23.490 --> 00:36:27.260 This is Zip. 00:36:27.260 --> 00:36:29.030 This is maybe Country. 00:36:29.030 --> 00:36:30.920 And this is maybe total. 00:36:30.920 --> 00:36:34.670 But even here there are some opportunities for disagreement. 00:36:34.670 --> 00:36:38.400 This is a little US centric, the fact that we have cities and states, 00:36:38.400 --> 00:36:39.704 as well as zip codes. 
00:36:39.704 --> 00:36:41.870 Indeed, it might be the case that zip codes don't all 00:36:41.870 --> 00:36:44.900 follow the same format. Indeed, even in the US, sometimes people write them 00:36:44.900 --> 00:36:47.990 with five digits, sometimes with nine digits and a hyphen, 00:36:47.990 --> 00:36:49.910 so there's a design opportunity there. 00:36:49.910 --> 00:36:53.360 But let's drill in deeper as to what data type these various fields 00:36:53.360 --> 00:36:56.270 should be, at least right now. 00:36:56.270 --> 00:37:00.050 Let me make room at the top here so we can just make notes 00:37:00.050 --> 00:37:02.060 as to the data types, but in reality these 00:37:02.060 --> 00:37:06.900 would be stored not in the table itself but somewhere else in the database. 00:37:06.900 --> 00:37:09.230 What data type should product be? 00:37:09.230 --> 00:37:15.690 And remember that among our options are data types like these. Product. 00:37:15.690 --> 00:37:16.560 It's not a number. 00:37:16.560 --> 00:37:18.320 So we can knock off most of these options. 00:37:18.320 --> 00:37:22.340 It's definitely not a date, time, or timestamp, so then it boils down to 00:37:22.340 --> 00:37:24.320 is it a char or a varchar? 00:37:27.960 --> 00:37:31.665 So char is a fixed-length field, so we have to decide in advance 00:37:31.665 --> 00:37:36.300 is it going to be eight characters, 16 characters, 100 characters or something 00:37:36.300 --> 00:37:37.330 else. 00:37:37.330 --> 00:37:40.360 Varchar would mean we just know the upper bound. 00:37:40.360 --> 00:37:42.180 So I don't know. 00:37:42.180 --> 00:37:48.750 W-i-d-g-e-t is 6, so minimally it's got to be six characters, but then 00:37:48.750 --> 00:37:50.750 there's sprocket, s-p-r-o-c-k-e-t. 00:37:53.710 --> 00:37:55.920 Not sure if I've ever had to spell that word. 00:37:55.920 --> 00:37:58.720 That's eight characters, so six isn't going to cut it. 00:37:58.720 --> 00:38:02.070 So maybe we do something like char(8).
00:38:02.070 --> 00:38:03.600 But there's a tradeoff here. 00:38:03.600 --> 00:38:10.500 If I specify now that this field is maximally going to be eight characters, 00:38:10.500 --> 00:38:15.540 then I can't sell anything with a longer name than sprocket. 00:38:15.540 --> 00:38:18.079 I could change the database size-- or the column size 00:38:18.079 --> 00:38:21.120 later on but it's ideal to get these things right from the get go and not 00:38:21.120 --> 00:38:24.390 have to go back in and change your infrastructure or hire someone 00:38:24.390 --> 00:38:26.340 to come in and make modifications. 00:38:26.340 --> 00:38:28.230 So maybe that's a little shortsighted. 00:38:28.230 --> 00:38:29.730 Maybe it shouldn't be eight. 00:38:29.730 --> 00:38:33.810 Maybe it should be twice that, like 16. 00:38:33.810 --> 00:38:34.949 I don't know, and I don't-- 00:38:34.949 --> 00:38:36.990 I'm not necessarily going to offer an answer here 00:38:36.990 --> 00:38:39.656 because it entirely depends on what data you're trying to store, 00:38:39.656 --> 00:38:41.180 what items you're trying to sell. 00:38:41.180 --> 00:38:44.100 This, in fact, now is even more wasteful because even though I'm now 00:38:44.100 --> 00:38:47.640 anticipating product names that are up to 16 characters, 00:38:47.640 --> 00:38:53.050 the char data type is going to use for every product name 16 characters, 00:38:53.050 --> 00:38:56.640 even if a whole bunch of those are blank because the word isn't long enough 00:38:56.640 --> 00:38:58.650 to need 16 characters. 00:38:58.650 --> 00:39:03.600 So maybe I would go for a variable length field not char but varchar 00:39:03.600 --> 00:39:08.280 whereby the maximum length of my column should be 16 characters. 
00:39:08.280 --> 00:39:11.370 But here, as in CS more generally, there's a tradeoff. 00:39:11.370 --> 00:39:13.500 It might seem like a win, like, OK, well, 00:39:13.500 --> 00:39:15.970 the problem is I'm using too much space all the time, 00:39:15.970 --> 00:39:17.710 let me just put an upper bound. 00:39:17.710 --> 00:39:20.700 There's got to be some price you pay, there's got to be a tradeoff, 00:39:20.700 --> 00:39:21.900 and indeed there is. 00:39:21.900 --> 00:39:24.510 It turns out that a database can generally 00:39:24.510 --> 00:39:28.140 search your data more quickly if it knows 00:39:28.140 --> 00:39:30.420 the entire column is the same width. 00:39:30.420 --> 00:39:34.462 Long story short, if it knows that this column has eight characters, eight 00:39:34.462 --> 00:39:36.420 characters, eight characters, eight characters, 00:39:36.420 --> 00:39:40.080 it can use very simple arithmetic to jump mathematically 00:39:40.080 --> 00:39:44.760 from one row in that column to another because they're all the same distance 00:39:44.760 --> 00:39:46.050 apart essentially. 00:39:46.050 --> 00:39:49.650 But if you have a varchar column of variable length, 00:39:49.650 --> 00:39:53.220 you can think of the column not as being perfectly smooth on both sides 00:39:53.220 --> 00:39:55.940 but kind of jagged on one side. 00:39:55.940 --> 00:39:57.030 Some rows are short. 00:39:57.030 --> 00:39:57.990 Some rows are long. 00:39:57.990 --> 00:40:00.960 And so you can't just blindly use simple arithmetic 00:40:00.960 --> 00:40:05.700 and jump eight characters at a time if the length were eight or 16 characters 00:40:05.700 --> 00:40:07.830 at a time if the length were 16. 00:40:07.830 --> 00:40:09.000 So it's a tradeoff. 00:40:09.000 --> 00:40:12.980 If we want to be able to search through our product names quickly, we 00:40:12.980 --> 00:40:14.520 might not want to use a varchar. 00:40:14.520 --> 00:40:16.860 So here, too, no right answer. 00:40:16.860 --> 00:40:18.300 It's a tradeoff.
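The "simple arithmetic" behind fixed-width columns can be sketched in Python. This is a toy model with hypothetical names: it packs values into a flat run of bytes, once padded to a fixed width of eight and once with a one-byte length in front of each value. Real databases lay out their storage in more sophisticated ways, so read this only as an illustration of the tradeoff.

```python
WIDTH = 8                                   # like a char column of width 8
names = ["widget", "sprocket", "gear"]

# Fixed width: every value is padded with spaces to exactly WIDTH bytes.
fixed = b"".join(name.ljust(WIDTH).encode("ascii") for name in names)

# Variable width: each value is preceded by one byte holding its length.
variable = b"".join(bytes([len(name)]) + name.encode("ascii") for name in names)


def read_fixed(buffer, row, width=WIDTH):
    """Jump straight to the row: its offset is just row * width."""
    start = row * width
    return buffer[start:start + width].decode("ascii").rstrip()


def read_variable(buffer, row):
    """No shortcut: walk past every earlier value to find where this one starts."""
    position = 0
    for _ in range(row):
        position += 1 + buffer[position]    # skip length byte plus the value
    length = buffer[position]
    return buffer[position + 1:position + 1 + length].decode("ascii")


print(len(fixed), read_fixed(fixed, 2))
print(len(variable), read_variable(variable, 2))
```

The fixed layout uses more bytes but finds any row with one multiplication; the variable layout is smaller but has to step through the jagged edge one value at a time.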
00:40:18.300 --> 00:40:20.210 And it might not matter for small datasets. 00:40:20.210 --> 00:40:21.210 Indeed, it probably doesn't 00:40:21.210 --> 00:40:24.376 if you don't have many customers or many products, but certainly 00:40:24.376 --> 00:40:26.720 at scale these kinds of things matter. 00:40:26.720 --> 00:40:30.060 And even when building something that's not going to have that many users, 00:40:30.060 --> 00:40:32.860 just getting it right doesn't cost that much upfront time. 00:40:32.860 --> 00:40:36.360 The most important thing, I dare say, is to actually give it some thought 00:40:36.360 --> 00:40:39.780 and not just leave it to chance or take the easiest way out because invariably 00:40:39.780 --> 00:40:42.960 over time, you will build up so-called technical debt, 00:40:42.960 --> 00:40:45.810 where you make poor decision, poor decision, poor decision, and now 00:40:45.810 --> 00:40:49.770 you have a very expensive decision later on if you have to go back and change 00:40:49.770 --> 00:40:51.540 a lot of those things. 00:40:51.540 --> 00:40:52.800 What about quantity? 00:40:52.800 --> 00:40:56.580 Well, quantity, nicely enough, would seem to fall more cleanly 00:40:56.580 --> 00:40:58.080 into one of these fields. 00:40:58.080 --> 00:40:59.250 Now, is that an integer? 00:40:59.250 --> 00:41:00.270 Is that a big integer? 00:41:00.270 --> 00:41:05.370 I think we're doing pretty well if we need more than 2 billion products sold. 00:41:05.370 --> 00:41:07.380 Maybe we're sort of a smaller shop, and we 00:41:07.380 --> 00:41:10.145 can get away with an integer or even a small int, 00:41:10.145 --> 00:41:11.520 but this too would be a tradeoff. 00:41:11.520 --> 00:41:12.970 How many bits do you want to spend? 00:41:12.970 --> 00:41:16.136 I would say that the default typically would be an integer unless you really 00:41:16.136 --> 00:41:20.020 are expecting a huge amount of data, billions of rows.
00:41:20.020 --> 00:41:23.805 So we might say something like integer here. Name. 00:41:23.805 --> 00:41:25.800 Oh, gosh, this is that can of worms again. 00:41:25.800 --> 00:41:27.270 How long is a maximum name? 00:41:27.270 --> 00:41:31.290 Maybe I do some googling and some due diligence as to the maximum length of names. 00:41:31.290 --> 00:41:32.769 Maybe I just want to cut it off. 00:41:32.769 --> 00:41:35.310 You've probably been to a website before where you're happily 00:41:35.310 --> 00:41:39.810 entering your information, and then you keep typing and nothing is happening. 00:41:39.810 --> 00:41:42.870 And that's because the programmer, or the database designer, 00:41:42.870 --> 00:41:45.315 has decided your name doesn't need to be that long. 00:41:45.315 --> 00:41:47.190 Or your address doesn't need to be that long. 00:41:47.190 --> 00:41:49.260 And it's infuriating sometimes because there 00:41:49.260 --> 00:41:52.980 are assumptions, naive, insensitive assumptions sometimes, 00:41:52.980 --> 00:41:57.420 that boil down to perhaps either calculated design decisions or maybe 00:41:57.420 --> 00:42:00.180 just poor design decisions. 00:42:00.180 --> 00:42:01.890 So I don't know what is right here. 00:42:01.890 --> 00:42:06.180 16 feels a little too conservative, so maybe I 00:42:06.180 --> 00:42:08.890 would say something like varchar(128). 00:42:08.890 --> 00:42:11.640 But even that I'd probably want to take a look at my customer base 00:42:11.640 --> 00:42:15.360 and see if that's well beyond the limit of what I might actually need. 00:42:15.360 --> 00:42:16.380 Same thing for street. 00:42:16.380 --> 00:42:17.340 Same thing for city. 00:42:17.340 --> 00:42:19.048 I don't really know what the right length 00:42:19.048 --> 00:42:23.380 is, but let's assume it's going to be varchars for those. 00:42:23.380 --> 00:42:28.070 So we'll just use a dot, dot, dot to suggest that it's an open question. 00:42:28.070 --> 00:42:29.890 State is an interesting one.
00:42:29.890 --> 00:42:35.080 If we expect to have only US customers, we can do a little optimization here. 00:42:35.080 --> 00:42:38.630 If every US state has a two-character abbreviation, 00:42:38.630 --> 00:42:42.730 we could do char(2) so that we get that performance 00:42:42.730 --> 00:42:47.530 benefit of knowing that every row is the same width, two characters, 00:42:47.530 --> 00:42:50.920 so long as we're comfortable not selling products to anyone else in the world 00:42:50.920 --> 00:42:54.760 beyond the United States. Zip code, too. 00:42:54.760 --> 00:42:56.480 Design opportunity there. 00:42:56.480 --> 00:43:01.810 I think it's fair to say that integer, while seemingly correct, 00:43:01.810 --> 00:43:04.870 might get you into some trouble, at least here in Massachusetts, 00:43:04.870 --> 00:43:09.790 where we have a whole bunch of zip codes that start with zero. 00:43:09.790 --> 00:43:11.860 Like in the world of numbers and integers, 00:43:11.860 --> 00:43:13.970 leading zeros are meaningless. 00:43:13.970 --> 00:43:16.240 You can have as many zeros to the left of your number 00:43:16.240 --> 00:43:20.090 and they don't change the actual value of your number. 00:43:20.090 --> 00:43:22.700 But in a zip code, it does have meaning. 00:43:22.700 --> 00:43:25.150 It is the first of five digits here, and so 00:43:25.150 --> 00:43:29.290 calling this an integer probably isn't very wise because if the database is 00:43:29.290 --> 00:43:30.550 like most humans, 00:43:30.550 --> 00:43:32.780 the database might ignore that first digit, 00:43:32.780 --> 00:43:38.230 and so my zip code is going to appear to be 2138, which really isn't right. 00:43:38.230 --> 00:43:39.559 Now, we could fix that in code. 00:43:39.559 --> 00:43:42.100 We can make sure that, well, if we ever see a zip code that's 00:43:42.100 --> 00:43:45.935 only four or fewer digits, let's prepend some zeros. But that feels messy.
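The leading-zero problem is easy to reproduce. The sketch below uses the zip code from the discussion above and shows both the loss and the "fix it in code" workaround; the variable names are made up for the example.

```python
# Treating a zip code as a number silently drops the leading zero.
zip_as_int = int("02138")
print(zip_as_int)  # 2138

# The "fix it in code" workaround: pad back out to five digits on the way out.
repaired = str(zip_as_int).zfill(5)
print(repaired)  # 02138

# Storing the zip code as text never loses the zero in the first place.
zip_as_text = "02138"
print(zip_as_text == repaired)  # True
```

The workaround does run, but every piece of code that ever reads the value has to remember to apply it, which is why it feels messy.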
00:43:45.935 --> 00:43:48.310 If we're going to put that data in there from the get go, 00:43:48.310 --> 00:43:51.400 let's make sure it comes back to us correctly. 00:43:51.400 --> 00:43:54.020 And so I might actually say something here. 00:43:54.020 --> 00:43:56.440 Even though it looks like a number, maybe I 00:43:56.440 --> 00:43:59.380 would actually say it's a char(5) field, or maybe it's 00:43:59.380 --> 00:44:03.850 nine or 10 if I want to have a hyphen in there for US zip codes. 00:44:03.850 --> 00:44:05.230 Country, too. 00:44:05.230 --> 00:44:10.810 Here maybe it's going to be a three-character abbreviation 00:44:10.810 --> 00:44:13.930 or a two-character abbreviation. Not sure what's best there. 00:44:17.290 --> 00:44:19.210 Really depends, too, on what countries you want 00:44:19.210 --> 00:44:23.560 to sell to if not just the US perhaps, so there's a design opportunity there. 00:44:23.560 --> 00:44:26.680 And then perhaps the last to consider is this total. 00:44:26.680 --> 00:44:30.280 I think it's fair to say that integer would not be correct 00:44:30.280 --> 00:44:33.430 because we would either be rounding down or rounding up 00:44:33.430 --> 00:44:36.560 all of the money we're supposed to be collecting from our customers. 00:44:36.560 --> 00:44:38.880 So we probably want one of these. 00:44:38.880 --> 00:44:43.637 And some databases differ, but generally a data type like Decimal is ideal. 00:44:43.637 --> 00:44:45.970 You don't want to even get into the business of worrying 00:44:45.970 --> 00:44:49.000 about these rounding errors or errors of imprecision 00:44:49.000 --> 00:44:51.280 as in Superman 3 and Office Space.
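Those errors of imprecision show up even with tiny amounts. The sketch below uses Python's standard `decimal` module as a stand-in for a database's Decimal type; the amounts are arbitrary and chosen only to expose the problem.

```python
from decimal import Decimal

# Binary floating point cannot represent most decimal fractions exactly.
float_total = 0.1 + 0.2
print(float_total)         # 0.30000000000000004
print(float_total == 0.3)  # False

# A decimal type keeps exactly the digits you give it.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total)                     # 0.30
print(decimal_total == Decimal("0.30"))  # True
```

Note that the decimal values are built from strings; building them from floats would carry the imprecision along.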
00:44:51.280 --> 00:44:54.677 Much better to just say that you want a fixed number of digits 00:44:54.677 --> 00:44:57.760 to the left and a fixed number of digits to the right of the decimal place 00:44:57.760 --> 00:45:01.000 so that you are not losing even fractions of pennies 00:45:01.000 --> 00:45:04.300 or mischarging anyone or losing out in any way. 00:45:04.300 --> 00:45:06.130 So we might use Decimal in that way. 00:45:06.130 --> 00:45:08.350 Some databases, though, have an actual currency data 00:45:08.350 --> 00:45:12.070 type, which operates similarly. 00:45:12.070 --> 00:45:20.860 So there remains to be seen some opportunities for improvement. 00:45:20.860 --> 00:45:24.730 If I continue to sell widgets, let alone sprockets, 00:45:24.730 --> 00:45:28.090 I'm going to have more and more and more rows in this table. 00:45:28.090 --> 00:45:31.140 And if Rob and Zamyla end up being repeat customers, 00:45:31.140 --> 00:45:36.240 I might have more and more Robs and more and more Zamylas in the same table. 00:45:36.240 --> 00:45:41.770 And as that happens, there begins to be quite a bit of redundancy. 00:45:41.770 --> 00:45:45.710 Indeed, what can you factor out over time? 00:45:45.710 --> 00:45:49.780 Well, certainly if Zamyla and Rob keep ordering more and more items 00:45:49.780 --> 00:45:55.076 from my database, I could just keep updating the quantity, 00:45:55.076 --> 00:45:56.325 but that feels a little messy. 00:45:56.325 --> 00:45:59.980 It'd be nice to have a veritable history of all of my sales. 00:45:59.980 --> 00:46:01.990 I don't want to just aggregate everything. 00:46:01.990 --> 00:46:06.970 So adding more and more rows for every sale seems pretty compelling, 00:46:06.970 --> 00:46:09.790 but then I'm going to see Zamyla Chan and Rob 00:46:09.790 --> 00:46:15.010 Bowden again and again and again and again and again in this table. 00:46:15.010 --> 00:46:18.940 And I'm also going to see their address again and again and again and again.
00:46:18.940 --> 00:46:24.520 And herein lie the capabilities and the features 00:46:24.520 --> 00:46:25.810 of a relational database. 00:46:25.810 --> 00:46:29.080 You know what I'm going to do? Rather than just treat this as my one 00:46:29.080 --> 00:46:32.380 and only table, let me go ahead and just rename this sheet 00:46:32.380 --> 00:46:34.960 or worksheet to be Orders. 00:46:34.960 --> 00:46:36.590 I could call it anything I want. 00:46:36.590 --> 00:46:37.340 And you know what? 00:46:37.340 --> 00:46:42.530 Let me create another table or sheet, and let me call this Customers. 00:46:42.530 --> 00:46:44.680 So even though, again, I'm using Excel here, 00:46:44.680 --> 00:46:49.630 this is just like I might be doing in Oracle or SQL Server 00:46:49.630 --> 00:46:53.500 or in Postgres or MySQL or the like. I've just created a second table. 00:46:53.500 --> 00:46:55.630 But as per the name relational database, there's 00:46:55.630 --> 00:46:59.340 going to be a relation across these two tables now. 00:46:59.340 --> 00:47:00.970 And what's that relation going to be? 00:47:00.970 --> 00:47:01.803 Well, you know what? 00:47:01.803 --> 00:47:06.760 I'm going to go ahead and copy all of this customer data 00:47:06.760 --> 00:47:12.790 and actually cut it and paste it over into this new table called Customers. 00:47:12.790 --> 00:47:15.010 And now this isn't quite sufficient. 00:47:15.010 --> 00:47:18.520 I'm going to go ahead and notice that Excel has already 00:47:18.520 --> 00:47:20.230 numbered these things for me. 00:47:20.230 --> 00:47:23.710 But I'm going to go ahead just for clarity and add my own column, 00:47:23.710 --> 00:47:25.992 and I'm going to call this ID. 00:47:25.992 --> 00:47:27.575 And it's going to be, say, an integer. 00:47:29.430 --> 00:47:34.230 And I'm going to call Zamyla my first customer, Rob my second customer, 00:47:34.230 --> 00:47:38.350 and in this case, notice now these unique identifiers are part of my data.
00:47:38.350 --> 00:47:41.940 It's not just part of Excel's arbitrary numbering on the left and arbitrary 00:47:41.940 --> 00:47:43.530 lettering on the top. 00:47:43.530 --> 00:47:47.370 Rather these are now actual pieces of data in my database 00:47:47.370 --> 00:47:50.430 that will be stored and backed up and so forth. 00:47:50.430 --> 00:47:54.150 But notice now that Zamyla is customer number 1 00:47:54.150 --> 00:47:59.130 and Rob is customer number 2, each of whom lives at these addresses, 00:47:59.130 --> 00:48:03.690 I don't have to worry now about redundantly storing that data because 00:48:03.690 --> 00:48:07.380 in my orders table now, any time Rob or Zamyla 00:48:07.380 --> 00:48:10.740 or some other customer purchase from my website-- 00:48:10.740 --> 00:48:13.080 notice I can shrink this. 00:48:13.080 --> 00:48:15.810 And I can say, you know what? 00:48:15.810 --> 00:48:19.470 This is the customer who bought this. 00:48:19.470 --> 00:48:21.510 It's going to be an integer. 00:48:21.510 --> 00:48:23.430 And you know who bought that first widget? 00:48:23.430 --> 00:48:24.720 Well, it was Zamyla. 00:48:24.720 --> 00:48:27.600 And you know who bought that second widget and the third widget, 00:48:27.600 --> 00:48:30.060 too, since quantity was 2 was Rob. 00:48:30.060 --> 00:48:33.100 And if some new customer comes into my database-- 00:48:33.100 --> 00:48:36.690 so, for instance, suppose that someone new orders from my website, 00:48:36.690 --> 00:48:38.820 they are going to become customer number 3. 00:48:38.820 --> 00:48:41.010 That will be, for instance, Doug Lloyd, and suppose 00:48:41.010 --> 00:48:45.360 he, too, is at that same address at that same zip code in the USA. 00:48:45.360 --> 00:48:49.080 But now in my orders table, suppose that Doug has bought 10 widgets. 00:48:49.080 --> 00:48:50.430 He really went all in. 
00:48:50.430 --> 00:48:54.600 Well, he, too, is going to have widget there, quantity 10, 00:48:54.600 --> 00:49:04.260 his customer ID is 3, and he, of course, is going to have spent $99.90 with us 00:49:04.260 --> 00:49:05.220 in total. 00:49:05.220 --> 00:49:08.460 So notice how we've factored out the common information 00:49:08.460 --> 00:49:10.080 to eliminate a redundancy. 00:49:10.080 --> 00:49:16.159 Notice now that if Doug or Rob or Zamyla move addresses or change their address, 00:49:16.159 --> 00:49:19.200 or if we were storing more information, like their phone number and email 00:49:19.200 --> 00:49:23.470 address and other personal data, too, we could change it in just one place. 00:49:23.470 --> 00:49:26.970 And not in our orders table because, indeed, there's now distinct semantics. 00:49:26.970 --> 00:49:29.010 Our orders table stores orders. 00:49:29.010 --> 00:49:31.750 Our customers table stores customers. 00:49:31.750 --> 00:49:34.530 And if we wanted yet another table, as we probably should have, 00:49:34.530 --> 00:49:36.630 it could store, say, products. 00:49:36.630 --> 00:49:38.520 In fact, there's still this redundancy. 00:49:38.520 --> 00:49:41.130 Let's go ahead and create another table called Products inside 00:49:41.130 --> 00:49:44.370 of which is an ID field as well as a product field, 00:49:44.370 --> 00:49:47.760 and then, just as before, let's start numbering our IDs from 1. 00:49:47.760 --> 00:49:51.600 So our first product is a widget and while we've not sold any yet, 00:49:51.600 --> 00:49:55.230 our second product, ID 2, is a sprocket. 00:49:55.230 --> 00:49:59.020 Now, in this way in my orders table, can I store not a product per se, 00:49:59.020 --> 00:50:00.720 but a product ID. 
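The factoring-out just described can be sketched with plain Python dictionaries standing in for the three sheets. The names, products, and quantities come from the example above; the street values are placeholders, and the `describe` helper is hypothetical, added only to show the IDs being looked back up.

```python
# Customers and products each live in their own "table", keyed by ID.
# The street values are placeholders, not real addresses.
customers = {
    1: {"name": "Zamyla", "street": "(placeholder street)"},
    2: {"name": "Rob", "street": "(placeholder street)"},
    3: {"name": "Doug", "street": "(placeholder street)"},
}
products = {1: "widget", 2: "sprocket"}

# Each order stores only IDs plus the facts specific to that sale.
orders = [
    {"product_id": 1, "quantity": 1, "customer_id": 1},
    {"product_id": 1, "quantity": 2, "customer_id": 2},
    {"product_id": 1, "quantity": 10, "customer_id": 3},
]


def describe(order: dict) -> str:
    """Look the IDs back up to produce a human-readable line."""
    customer = customers[order["customer_id"]]
    product = products[order["product_id"]]
    return f'{customer["name"]} bought {order["quantity"]} x {product}'


# A change of address is a single update, however many orders exist.
customers[3]["street"] = "(new placeholder street)"

for order in orders:
    print(describe(order))
```

Nothing in `orders` had to change when the address did, which is the point of storing the customer once and referring to it by ID.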
00:50:00.720 --> 00:50:04.260 And so now even though my table is frankly 00:50:04.260 --> 00:50:06.750 becoming more and more cryptic and a little harder 00:50:06.750 --> 00:50:08.070 for me to wrap my mind around-- 00:50:08.070 --> 00:50:10.200 what am I looking at? It's all just numbers-- 00:50:10.200 --> 00:50:13.560 it is now what we would call normalized in the context of a database. 00:50:13.560 --> 00:50:17.160 And a database typically is not meant to be looked at by human eyes just 00:50:17.160 --> 00:50:17.700 like this. 00:50:17.700 --> 00:50:21.625 Rather it's meant to be queried and data created and updated and deleted. 00:50:21.625 --> 00:50:24.000 And so there are certain commands in this language called 00:50:24.000 --> 00:50:27.420 SQL that actually facilitate programmatically, 00:50:27.420 --> 00:50:31.890 using a programming language, what I've been doing with my keyboard and fingers 00:50:31.890 --> 00:50:32.640 alone. 00:50:32.640 --> 00:50:35.190 Indeed, the commands with which you can manipulate the data 00:50:35.190 --> 00:50:38.580 in the database itself are going to be SQL's commands, 00:50:38.580 --> 00:50:41.740 create, select, update, delete, and others. 00:50:41.740 --> 00:50:44.490 Indeed, much like Scratch has the various puzzle pieces 00:50:44.490 --> 00:50:47.280 via which you can implement logic in a Scratch-based program, 00:50:47.280 --> 00:50:50.400 so does SQL have these puzzle pieces, if you will, 00:50:50.400 --> 00:50:54.180 via which you can create and select and update and delete data 00:50:54.180 --> 00:50:57.780 from your database just like I've been simulating by using Excel here 00:50:57.780 --> 00:50:58.950 and my keyboard.
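Here is a rough sketch of those commands in action, using Python's built-in `sqlite3` module and an in-memory SQLite database as a stand-in for whichever database you might actually choose. The table layout mirrors the Products sheet; the replacement name "gadget" is made up for the example. Note that in SQL, `CREATE` makes the table while `INSERT` is the command that adds rows to it.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# CREATE: define the table. SQLite assigns the integer id automatically.
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, product TEXT)")

# INSERT: add rows, much like typing into the sheet.
conn.execute("INSERT INTO products (product) VALUES (?)", ("widget",))
conn.execute("INSERT INTO products (product) VALUES (?)", ("sprocket",))

# SELECT: read the rows back.
rows = conn.execute("SELECT id, product FROM products ORDER BY id").fetchall()
print(rows)  # [(1, 'widget'), (2, 'sprocket')]

# UPDATE: change a value in place.
conn.execute("UPDATE products SET product = ? WHERE id = ?", ("gadget", 2))

# DELETE: remove a row.
conn.execute("DELETE FROM products WHERE id = ?", (1,))

remaining = conn.execute("SELECT id, product FROM products").fetchall()
print(remaining)  # [(2, 'gadget')]
```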
00:50:58.950 --> 00:51:02.070 And indeed, some of these more sophisticated concepts, 00:51:02.070 --> 00:51:05.220 like primary key and foreign key and unique key, 00:51:05.220 --> 00:51:08.640 now rather start to jump out at us because if we consider what my orders 00:51:08.640 --> 00:51:14.470 table now looks like, notice that it's indeed mostly numbers, 00:51:14.470 --> 00:51:18.550 but those numbers are essentially keys into another table. 00:51:18.550 --> 00:51:21.660 In fact, if you look at products, my products table 00:51:21.660 --> 00:51:25.510 has an ID column, which has unique numbers 1 2, 00:51:25.510 --> 00:51:29.550 and so forth my customers table has its own ID column. 00:51:29.550 --> 00:51:31.980 And these are same numbers, but different meaning. 00:51:31.980 --> 00:51:36.400 These are customer numbers 1, 2, 3, and so forth. 00:51:36.400 --> 00:51:40.290 So in each of these tables customers and in products 00:51:40.290 --> 00:51:46.230 is that ID column a primary key for the customers and products table 00:51:46.230 --> 00:51:47.160 respectively. 00:51:47.160 --> 00:51:50.460 Within each of those tables, it is that ID column 00:51:50.460 --> 00:51:53.530 that uniquely identifies rows. 00:51:53.530 --> 00:51:57.030 Zamyla is and shall always be customer number one. 00:51:57.030 --> 00:51:59.640 Rob is and shall always be customer number 2. 00:51:59.640 --> 00:52:02.050 Doug is and shall always be customer number 3. 00:52:02.050 --> 00:52:06.310 So those IDs those primary keys must not change. 00:52:06.310 --> 00:52:08.460 They must be invariant, and as such they can 00:52:08.460 --> 00:52:11.700 be reliably used to uniquely identify customers 00:52:11.700 --> 00:52:14.970 or, in the context of products, uniquely identify 00:52:14.970 --> 00:52:18.030 a product or, in the case of orders-- 00:52:18.030 --> 00:52:19.410 we forgot something. 
00:52:19.410 --> 00:52:22.260 It would seem valuable if we continue this train of thought 00:52:22.260 --> 00:52:30.610 to also have here in my orders table an order ID that should probably represent 00:52:30.610 --> 00:52:35.110 each of these orders, which is just going to similarly be an integer that 00:52:35.110 --> 00:52:38.650 just keeps track really of how many total orders have been placed, 00:52:38.650 --> 00:52:45.040 1, 2, 3, 4, 5, 6 on up, all the way up to 2 billion or best yet even higher 00:52:45.040 --> 00:52:45.830 than that. 00:52:45.830 --> 00:52:47.560 But notice these other numbers now. 00:52:47.560 --> 00:52:51.610 The product column is no longer the name widget or sprocket. 00:52:51.610 --> 00:52:53.770 The quantity column, still just an integer. 00:52:53.770 --> 00:52:56.860 That's not anything to do with a key even though it's also an integer. 00:52:56.860 --> 00:52:59.440 But customer is an ID. 00:52:59.440 --> 00:53:03.280 But it's not a primary key, nor is product a primary key here. 00:53:03.280 --> 00:53:07.450 In this context of my orders table is product 00:53:07.450 --> 00:53:11.710 and is customer a foreign key because those two columns are 00:53:11.710 --> 00:53:14.870 primary keys in two other tables. 00:53:14.870 --> 00:53:17.710 So within one table if you have an ID, it 00:53:17.710 --> 00:53:19.750 should be generally considered your primary key 00:53:19.750 --> 00:53:22.540 if that is the role it's playing, uniquely identifying your rows. 00:53:22.540 --> 00:53:25.660 But if that same number appears in some other table 00:53:25.660 --> 00:53:30.370 for the purpose of cross-referencing really, it is a foreign key. 00:53:30.370 --> 00:53:34.480 And suffice it to say that in SQL, this database language, 00:53:34.480 --> 00:53:38.230 even though this looks cryptic to us humans, realize that with SQL 00:53:38.230 --> 00:53:42.790 can you stitch these distinct tables or sheets back together.
00:53:42.790 --> 00:53:46.690 You can quote unquote join SQL tables in such a way 00:53:46.690 --> 00:53:49.780 that you can take your customers table and your orders table 00:53:49.780 --> 00:53:54.030 and reassemble them so that you see next to each order, 00:53:54.030 --> 00:53:56.290 say, on your administrative web page that 00:53:56.290 --> 00:53:58.450 allows you to see all of your recent orders, 00:53:58.450 --> 00:54:01.947 not the customer IDs of who has bought what but actually 00:54:01.947 --> 00:54:04.780 the customer names and their addresses and maybe their phone numbers 00:54:04.780 --> 00:54:05.937 and e-mails and more. 00:54:05.937 --> 00:54:08.020 You can join this information back together again. 00:54:08.020 --> 00:54:12.400 And what databases are good at is doing exactly that kind of joining, 00:54:12.400 --> 00:54:15.620 not to mention searching or more. 00:54:15.620 --> 00:54:19.300 But sheesh, this was a lot of work just to get to this point right? 00:54:19.300 --> 00:54:23.140 It was pretty easy to make one worksheet just put all of my orders in there. 00:54:23.140 --> 00:54:26.751 But then we went down this slope of oh, well, maybe we should factor this out. 00:54:26.751 --> 00:54:27.250 Oh, wait. 00:54:27.250 --> 00:54:28.208 We can factor this out. 00:54:28.208 --> 00:54:29.860 Oh, maybe we should add some IDs here. 00:54:29.860 --> 00:54:32.540 We just created a whole lot of work for ourselves. 00:54:32.540 --> 00:54:36.340 Now, I dare say it will pay off over the long run, 00:54:36.340 --> 00:54:39.130 and indeed our database will be much better designed 00:54:39.130 --> 00:54:42.370 where better design will lead to faster performance, less 00:54:42.370 --> 00:54:46.360 redundant storage of data, and more, but it certainly took a lot of work. 
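That joining can be sketched with Python's built-in `sqlite3` module and an in-memory SQLite database. The table and column names below are assumptions modeled on the three sheets, and the data is the example's three orders; other databases use slightly different syntax for types and keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to, per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE products (id INTEGER PRIMARY KEY, product TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,                               -- primary key
    product_id INTEGER NOT NULL REFERENCES products(id),  -- foreign key
    quantity INTEGER NOT NULL,
    customer_id INTEGER NOT NULL REFERENCES customers(id) -- foreign key
);
INSERT INTO customers (name) VALUES ('Zamyla'), ('Rob'), ('Doug');
INSERT INTO products (product) VALUES ('widget'), ('sprocket');
INSERT INTO orders (product_id, quantity, customer_id)
VALUES (1, 1, 1), (1, 2, 2), (1, 10, 3);
""")

# JOIN follows each foreign key back to the row it refers to,
# so the result shows names instead of bare IDs.
rows = conn.execute("""
    SELECT orders.id, customers.name, products.product, orders.quantity
    FROM orders
    JOIN customers ON customers.id = orders.customer_id
    JOIN products ON products.id = orders.product_id
    ORDER BY orders.id
""").fetchall()

for row in rows:
    print(row)
```

The orders table still holds only numbers; the readable view exists only in the query result, which is what an administrative page would display.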
00:54:46.360 --> 00:54:49.090 So it turns out there is the opposite of a SQL database 00:54:49.090 --> 00:54:53.020 that's been in vogue for some time called a NoSQL database, 00:54:53.020 --> 00:54:56.710 or a document store, an object-oriented database where 00:54:56.710 --> 00:55:00.340 the defining characteristic really is that it is not SQL. 00:55:00.340 --> 00:55:03.100 It does not store data in rows and columns. 00:55:03.100 --> 00:55:06.880 It does not store data in one or more tables that can then be joined. 00:55:06.880 --> 00:55:10.810 Rather it stores all of your data really all together 00:55:10.810 --> 00:55:12.340 in a hierarchical structure. 00:55:12.340 --> 00:55:15.410 And that's an oversimplification because there are other features. 00:55:15.410 --> 00:55:17.500 But consider this example here. 00:55:17.500 --> 00:55:21.460 This is written in essentially a format that's called JSON, JavaScript Object 00:55:21.460 --> 00:55:23.920 Notation, but this idea of a NoSQL database 00:55:23.920 --> 00:55:27.490 has no fundamental connection to JavaScript the language. 00:55:27.490 --> 00:55:31.660 Just so happens this tends to be the language with which these data 00:55:31.660 --> 00:55:33.430 structures are represented. 00:55:33.430 --> 00:55:37.060 The curly brace here and here just means here is an object of information. 00:55:37.060 --> 00:55:39.610 The quotes are just used around words and numbers, 00:55:39.610 --> 00:55:43.440 and the colons separate keys from values where keys and values is 00:55:43.440 --> 00:55:46.270 a very common paradigm where on the left is metadata 00:55:46.270 --> 00:55:50.170 and on the right is data typically, key and value respectively.
00:55:50.170 --> 00:55:52.060 Square brackets just mean an array, which 00:55:52.060 --> 00:55:55.020 means that this is an array or a list of two values, 00:55:55.020 --> 00:55:59.590 something comma something, which happens to be GPS coordinates, latitude 00:55:59.590 --> 00:56:00.820 and longitude here. 00:56:00.820 --> 00:56:01.690 So what is this? 00:56:01.690 --> 00:56:02.860 What are we looking at? 00:56:02.860 --> 00:56:05.680 This appears to be an object, shall we say, 00:56:05.680 --> 00:56:09.220 that represents the city of Allston where Harvard's business 00:56:09.220 --> 00:56:11.980 school is, where Harvard's engineering school will soon be. 00:56:11.980 --> 00:56:16.030 And so this object contains a bit of hierarchical information, 00:56:16.030 --> 00:56:18.760 not a huge amount, but notice it has an ID, 00:56:18.760 --> 00:56:22.330 which happens to be its zip code 02134. 00:56:22.330 --> 00:56:24.490 It has a city name, Allston. 00:56:24.490 --> 00:56:29.770 Has a location which by convention is a comma-separated list of two values, 00:56:29.770 --> 00:56:33.470 latitude and longitude, and so that's kind of some hierarchy. 00:56:33.470 --> 00:56:35.260 It's not just a simple value. 00:56:35.260 --> 00:56:40.750 And then there's a population of 23,775 at last count though surely to rise. 00:56:40.750 --> 00:56:43.100 And then it's in the state of Massachusetts. 00:56:43.100 --> 00:56:45.790 So this is actually a snippet from a database called 00:56:45.790 --> 00:56:50.830 MongoDB, which is a very popular NoSQL database that stores data essentially 00:56:50.830 --> 00:56:51.740 like this. 00:56:51.740 --> 00:56:55.060 So rather than flatten all of your data as is 00:56:55.060 --> 00:56:57.580 the case in a relational database using SQL, 00:56:57.580 --> 00:57:01.720 an object-oriented database or a document store like this NoSQL database 00:57:01.720 --> 00:57:06.070 called MongoDB, really stores things as key value pairs.
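A document like the one just described can be approximated in Python with the standard `json` module. The field names (`_id`, `city`, `loc`, `pop`, `state`) and the coordinates are assumptions for illustration; only the zip code, city, population, and state come from the description above.

```python
import json

# Curly braces become a dict, square brackets become a list.
document = {
    "_id": "02134",          # the zip code, kept as text so the zero survives
    "city": "Allston",
    "loc": [42.35, -71.13],  # approximate, illustrative coordinates only
    "pop": 23775,
    "state": "MA",
}

# Serialize to JSON text, then parse it back into Python objects.
text = json.dumps(document, indent=2)
print(text)

restored = json.loads(text)
print(restored["loc"][0])  # the first value inside the nested array
```

The `loc` value is itself a list inside the object, which is the small bit of hierarchy a flat row of columns would not hold as naturally.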
00:57:06.070 --> 00:57:10.280 And those key value pairs might actually have some hierarchical structure. 00:57:10.280 --> 00:57:14.800 So if you, for instance, stored an order like we just did, 00:57:14.800 --> 00:57:16.750 instead of storing it in rows and columns, 00:57:16.750 --> 00:57:21.580 you would just store it as one big chunk of information like this. 00:57:21.580 --> 00:57:25.390 And inside of that object, an order object 00:57:25.390 --> 00:57:27.730 might actually be the entire customer. 00:57:27.730 --> 00:57:35.400 Inside of that customer might be his or her city and state and so forth. 00:57:35.400 --> 00:57:41.000 So that might actually be retained in some hierarchy like you see here. 00:57:41.000 --> 00:57:43.850 And so this is just a different way of viewing the world. 00:57:43.850 --> 00:57:47.892 It has typically been a more efficient way of viewing 00:57:47.892 --> 00:57:49.600 and modeling your world because you don't 00:57:49.600 --> 00:57:53.350 have to give frankly as much thought to the design and the division of some 00:57:53.350 --> 00:57:55.460 of your data and the normalization thereof, 00:57:55.460 --> 00:57:58.060 but you do sometimes pay a performance penalty. 00:57:58.060 --> 00:58:01.330 You do sometimes pay a penalty in redundancy of data, 00:58:01.330 --> 00:58:05.680 though there are ways to avoid that by reusing something like that ID field. 00:58:05.680 --> 00:58:08.800 So it really is ultimately a different philosophy right now. 00:58:08.800 --> 00:58:11.920 And it's a competing alternative to something like a relational database, 00:58:11.920 --> 00:58:13.900 and here, too, will there be an opportunity 00:58:13.900 --> 00:58:16.810 to read up on and to debate exactly what is 00:58:16.810 --> 00:58:20.110 best for your actual problem at hand. 00:58:20.110 --> 00:58:21.010 And now mobile.
00:58:21.010 --> 00:58:23.170 Up until now we focused on the front end, 00:58:23.170 --> 00:58:26.740 on the back end of really web-based applications 00:58:26.740 --> 00:58:30.590 that you might access on a laptop or desktop or even a mobile device. 00:58:30.590 --> 00:58:33.610 But what we haven't given thought to is the design opportunities 00:58:33.610 --> 00:58:35.890 for mobile devices specifically. 00:58:35.890 --> 00:58:39.630 Indeed, most any of you who have a smartphone these days, 00:58:39.630 --> 00:58:41.680 iPhone, Android, or the like, have probably 00:58:41.680 --> 00:58:45.580 downloaded some application that did not come with your phone. 00:58:45.580 --> 00:58:49.270 And you downloaded that from the Google Play Store or the Apple App Store 00:58:49.270 --> 00:58:54.950 and that software is quite likely written in a very specific language. 00:58:54.950 --> 00:58:56.980 Indeed, the language for Android is typically 00:58:56.980 --> 00:58:58.900 Java in which programs are written. 00:58:58.900 --> 00:59:02.630 The languages in which iPhone and iPad applications are written 00:59:02.630 --> 00:59:07.030 is Objective-C or more recently Swift, and so there, too, 00:59:07.030 --> 00:59:09.070 at least in the world of iPhones and iPads, 00:59:09.070 --> 00:59:11.584 do you have design discretion over what language 00:59:11.584 --> 00:59:13.750 you use with Swift being the more modern and the one 00:59:13.750 --> 00:59:15.400 that Apple's really been pushing. 
00:59:15.400 --> 00:59:21.180 But even then, do you have the option to not implement a native application per 00:59:21.180 --> 00:59:26.280 se, one that is implemented in Java or in Objective-C or Swift, all of which 00:59:26.280 --> 00:59:28.710 are programming languages. You can actually 00:59:28.710 --> 00:59:33.240 implement a web-based application but package it up 00:59:33.240 --> 00:59:37.230 in a way that makes it seem like it's a native application, 00:59:37.230 --> 00:59:41.160 allows you to distribute it via the App Store, via the Google Play Store, 00:59:41.160 --> 00:59:45.900 so that it ends up putting an icon on your customers' phones. 00:59:45.900 --> 00:59:50.430 But when they click it, they're not seeing an iPhone application per se 00:59:50.430 --> 00:59:52.510 or an Android application per se. 00:59:52.510 --> 00:59:55.890 They are seeing really a secretly embedded web 00:59:55.890 --> 00:59:58.920 browser whereby your application is implemented 00:59:58.920 --> 01:00:02.490 at the end of the day in JavaScript and HTML and CSS, 01:00:02.490 --> 01:00:05.250 but it's got a nice little rectangular window around it, 01:00:05.250 --> 01:00:09.420 so the users don't realize that they're looking at Safari or Chrome 01:00:09.420 --> 01:00:13.200 because all of the menus of those browsers have been stripped away. 01:00:13.200 --> 01:00:15.600 All you get is an embedded web browser. 01:00:15.600 --> 01:00:19.860 And so here, do you have an opportunity to choose among these options. 01:00:19.860 --> 01:00:22.620 And so one of the design decisions one makes 01:00:22.620 --> 01:00:28.260 when designing for a mobile user base is do we develop an Android application? 01:00:28.260 --> 01:00:31.260 Do we develop an iPhone or iPad application? 01:00:31.260 --> 01:00:34.560 Do we do both environments still? 01:00:34.560 --> 01:00:36.090 And how do you choose among those? 01:00:36.090 --> 01:00:38.460 Well, it certainly depends on your demographic.
01:00:38.460 --> 01:00:42.030 Android is by far the most popular mobile operating system these days. 01:00:42.030 --> 01:00:45.060 But in certain contexts, a campus like this, iPhones 01:00:45.060 --> 01:00:47.040 are actually even more popular. 01:00:47.040 --> 01:00:51.550 So do you want to cater to one demographic or another or ideally both. 01:00:51.550 --> 01:00:55.770 Both is probably your instinctive answer, but that comes with a tradeoff. 01:00:55.770 --> 01:00:57.960 That certainly comes with a price. 01:00:57.960 --> 01:01:01.556 If you want to ship some new and improved tool that you 01:01:01.556 --> 01:01:03.430 want to make available to the world or a game 01:01:03.430 --> 01:01:05.471 or any other piece of mobile software, well, it'd 01:01:05.471 --> 01:01:09.120 be nice to have it available to all users with smartphones. 01:01:09.120 --> 01:01:11.670 But then you're going to have to know how to program in Java. 01:01:11.670 --> 01:01:14.503 You're going to have to know how to program in Swift or Objective-C. 01:01:14.503 --> 01:01:18.720 Or you're going to have to know how to take this hybrid approach of developing 01:01:18.720 --> 01:01:24.240 it using JavaScript and HTML and CSS, but there, too, there's a tradeoff. 01:01:24.240 --> 01:01:27.540 You tend to get very good performance out of Android applications 01:01:27.540 --> 01:01:31.411 that are natively written in Java and native applications in iOS that 01:01:31.411 --> 01:01:32.910 are written in Objective-C or Swift. 01:01:32.910 --> 01:01:34.470 They just tend to be very responsive. 01:01:34.470 --> 01:01:36.900 They tend to follow a very similar paradigm. 01:01:36.900 --> 01:01:39.810 Menus and buttons and so forth all tend to look and feel the same 01:01:39.810 --> 01:01:41.640 and therefore be familiar to users. 01:01:41.640 --> 01:01:43.290 And they're very responsive. 01:01:43.290 --> 01:01:45.040 Touch a button, something happens quickly. 
01:01:45.040 --> 01:01:47.640 There doesn't seem typically to be much latency. 01:01:47.640 --> 01:01:51.480 In hybrid applications, that are really written in JavaScript, HTML, and CSS, 01:01:51.480 --> 01:01:55.410 especially if they're technically server side hosted on your servers 01:01:55.410 --> 01:01:57.330 or in some cloud server, they might actually 01:01:57.330 --> 01:02:00.780 feel a little slower because there's a whole internet between you 01:02:00.780 --> 01:02:02.280 and your user's experience. 01:02:02.280 --> 01:02:05.571 Or they might have to download more data that would be better to just bundle up 01:02:05.571 --> 01:02:08.100 in the application itself where the menus and the buttons 01:02:08.100 --> 01:02:15.210 they don't quite feel as native as the default Android and iOS user 01:02:15.210 --> 01:02:16.042 experiences. 01:02:16.042 --> 01:02:17.250 So a bit of a tradeoff there. 01:02:17.250 --> 01:02:21.120 OK, so if you don't want to pay that penalty of performance and perception, 01:02:21.120 --> 01:02:23.570 implement the Android app and the iOS app. 01:02:23.570 --> 01:02:27.810 But now you need two developers or one developer who knows both platforms. 01:02:27.810 --> 01:02:33.910 So that, too, comes with a cost, both in time or salary or talent or the like. 01:02:33.910 --> 01:02:36.480 So there, too, it's not obvious how to go. 
01:02:36.480 --> 01:02:41.370 And even more recently, there are frameworks like Cordova, Ionic, Meteor, 01:02:41.370 --> 01:02:44.160 React Native, Supersonic, Xamarin, and more 01:02:44.160 --> 01:02:48.690 that actually offer yet a fourth option whereby you implement 01:02:48.690 --> 01:02:52.650 your application in some neutral language like JavaScript, 01:02:52.650 --> 01:02:57.210 and then using these frameworks, these tools that other people have kindly 01:02:57.210 --> 01:02:59.610 or commercially developed for us to use, you 01:02:59.610 --> 01:03:05.010 can essentially convert or translate that middle language, JavaScript, 01:03:05.010 --> 01:03:10.890 to Objective-C or to Swift or to Java or really to the underlying code that 01:03:10.890 --> 01:03:13.530 gets shipped ultimately to the app stores 01:03:13.530 --> 01:03:17.040 so that you can actually develop native applications 01:03:17.040 --> 01:03:18.960 but in an intermediate language. 01:03:18.960 --> 01:03:21.900 But there the learning curve might be a little bit higher. 01:03:21.900 --> 01:03:23.940 Indeed, the menu of options is even longer 01:03:23.940 --> 01:03:26.110 than the list of native languages itself. 01:03:26.110 --> 01:03:30.870 So that requires some learning curve or some time or some talent or money 01:03:30.870 --> 01:03:31.510 again. 01:03:31.510 --> 01:03:33.590 And so there, too, there are some tradeoffs. 01:03:33.590 --> 01:03:36.660 And so it really depends ultimately on what is your application 01:03:36.660 --> 01:03:39.750 and who you have working with you and what is most important 01:03:39.750 --> 01:03:41.580 and how much time do you have and what do 01:03:41.580 --> 01:03:45.870 you view the technological horizon looking like some months ahead? 01:03:45.870 --> 01:03:49.950 So at the end of the day, these technology stacks, as they're called, 01:03:49.950 --> 01:03:52.360 are really just menus of options.
01:03:52.360 --> 01:03:55.590 And those menus are constantly evolving, and they focus on the front end 01:03:55.590 --> 01:03:57.720 and on the back end, on the server, on the client, 01:03:57.720 --> 01:03:59.610 on mobile devices, laptops, and desktops. 01:03:59.610 --> 01:04:01.800 There are solutions to any number of problems. 01:04:01.800 --> 01:04:04.950 Indeed, the process of software engineering and developing a product 01:04:04.950 --> 01:04:07.620 and developing a web app or a native application 01:04:07.620 --> 01:04:11.610 itself is first doing some due diligence and bringing yourself up to speed 01:04:11.610 --> 01:04:14.245 on what the design possibilities are, having a discussion, 01:04:14.245 --> 01:04:17.120 having a debate even, with the engineers with whom you'll be working, 01:04:17.120 --> 01:04:19.730 and ultimately making the most informed decision that you can 01:04:19.730 --> 01:04:22.430 with an eye toward what is trending now, what has been trending, 01:04:22.430 --> 01:04:25.100 and where the industry might be going but ultimately 01:04:25.100 --> 01:04:28.400 focusing on solving optimally your own problems 01:04:28.400 --> 01:04:33.750 and choosing among these various and ever-changing technology stacks.