1 00:00:00,000 --> 00:00:03,920 [MUSIC PLAYING] 2 00:00:03,920 --> 00:00:15,537 3 00:00:15,537 --> 00:00:18,870 BRIAN YU: Welcome back, everyone, to Web Programming with Python and JavaScript. 4 00:00:18,870 --> 00:00:21,940 And today we're going to look at things from a different perspective. 5 00:00:21,940 --> 00:00:25,110 So we've spent the past several weeks working on designing and building 6 00:00:25,110 --> 00:00:27,990 and programming web applications using Python and JavaScript. 7 00:00:27,990 --> 00:00:30,630 We've talked about using frameworks like Flask and Django 8 00:00:30,630 --> 00:00:33,660 in order to actually write the code that will run on our web servers 9 00:00:33,660 --> 00:00:35,940 and then writing JavaScript code that runs on the client side 10 00:00:35,940 --> 00:00:39,000 inside of a user's browser in order to allow for additional functionality 11 00:00:39,000 --> 00:00:39,750 to happen. 12 00:00:39,750 --> 00:00:43,200 But the focus today is going to be less about actually writing the web 13 00:00:43,200 --> 00:00:46,350 applications but what happens after you've written those web applications. 14 00:00:46,350 --> 00:00:48,090 And you want to take your web application 15 00:00:48,090 --> 00:00:49,140 and deploy it to the internet. 16 00:00:49,140 --> 00:00:51,140 After you've written it, after you've tested it, 17 00:00:51,140 --> 00:00:54,750 what concerns have to be considered when we think about taking that web 18 00:00:54,750 --> 00:00:56,970 application and then deploying it? 19 00:00:56,970 --> 00:00:59,970 And the main focus of today is going to be all about scalability, 20 00:00:59,970 --> 00:01:03,810 this idea of a web application might work well if just a couple of users 21 00:01:03,810 --> 00:01:04,455 are using it. 22 00:01:04,455 --> 00:01:07,080 But what happens when the web application starts to get popular 23 00:01:07,080 --> 00:01:08,400 as more people start to use it? 
24 00:01:08,400 --> 00:01:12,270 And as your application starts to have to deal with multiple different people 25 00:01:12,270 --> 00:01:15,270 potentially accessing your data at the same time 26 00:01:15,270 --> 00:01:17,970 and trying to use your application at the same time, what 27 00:01:17,970 --> 00:01:21,630 sorts of considerations do you need to take into account when that starts 28 00:01:21,630 --> 00:01:22,740 to happen? 29 00:01:22,740 --> 00:01:24,900 And so we can begin with a simple picture. 30 00:01:24,900 --> 00:01:28,380 We might imagine that this diagram here represents your web server. 31 00:01:28,380 --> 00:01:30,450 And when a user comes along, that user is 32 00:01:30,450 --> 00:01:32,580 going to be connecting to that web server somehow. 33 00:01:32,580 --> 00:01:34,371 They're going to connect to the web server. 34 00:01:34,371 --> 00:01:37,320 Your server, which might be running Flask or Django or some other web 35 00:01:37,320 --> 00:01:40,080 framework, is going to need to process that request, 36 00:01:40,080 --> 00:01:43,020 figure out what sort of response to present back to the user, 37 00:01:43,020 --> 00:01:45,210 and then deliver that response back to the user. 38 00:01:45,210 --> 00:01:51,270 But a server can only do finitely many things per second. 39 00:01:51,270 --> 00:01:54,030 How do we typically measure how many things a server 40 00:01:54,030 --> 00:01:55,620 can do in a given amount of time? 41 00:01:55,620 --> 00:01:57,540 Any idea what the metric for that usually is? 42 00:01:57,540 --> 00:02:00,797 43 00:02:00,797 --> 00:02:03,130 So the standard metric for that is a unit of measurement 44 00:02:03,130 --> 00:02:06,070 called hertz, which represents the number of calculations 45 00:02:06,070 --> 00:02:07,900 that a computer can do in a given second.
46 00:02:07,900 --> 00:02:09,941 Or more commonly, as we hear nowadays, gigahertz, 47 00:02:09,941 --> 00:02:12,400 or billions of computations, which are very simple 48 00:02:12,400 --> 00:02:16,840 computations like adding two numbers together or checking whether or not 49 00:02:16,840 --> 00:02:17,980 a number is equal to zero. 50 00:02:17,980 --> 00:02:21,190 Simple calculations like that amassed over billions and billions 51 00:02:21,190 --> 00:02:23,050 of computations is generally the way we'll 52 00:02:23,050 --> 00:02:25,870 measure how many things a server can do at the same time in a given 53 00:02:25,870 --> 00:02:28,690 amount of time, like a period of one second, for instance. 54 00:02:28,690 --> 00:02:32,414 And so a given server can only do finitely many things 55 00:02:32,414 --> 00:02:34,330 in a given amount of time-- in a given second, 56 00:02:34,330 --> 00:02:36,250 for instance-- which means that there are only 57 00:02:36,250 --> 00:02:39,190 finitely many users that a server could potentially 58 00:02:39,190 --> 00:02:40,780 respond to in a given second. 59 00:02:40,780 --> 00:02:45,280 And so if a server can only respond to 100 users in a given second, what 60 00:02:45,280 --> 00:02:48,400 happens when user number 101 comes along and tries 61 00:02:48,400 --> 00:02:50,670 to make a request to the server in that same second? 62 00:02:50,670 --> 00:02:52,420 How is the server going to deal with that? 63 00:02:52,420 --> 00:02:54,970 And these issues surrounding scalability are the issues 64 00:02:54,970 --> 00:02:56,050 that we're going to be exploring today. 65 00:02:56,050 --> 00:02:59,216 What happens when it's not just one user trying to connect to our server 66 00:02:59,216 --> 00:03:02,560 but potentially many users that are all trying to connect to our server 67 00:03:02,560 --> 00:03:03,890 at the same time? 68 00:03:03,890 --> 00:03:06,670 So what are some ideas for how we might deal with this situation?
69 00:03:06,670 --> 00:03:09,520 Our server is a finite machine that can only 70 00:03:09,520 --> 00:03:11,432 deal with so many users per second. 71 00:03:11,432 --> 00:03:13,390 And suddenly we find that our web application's 72 00:03:13,390 --> 00:03:16,210 gotten popular enough that we have more than that number 73 00:03:16,210 --> 00:03:19,000 of users trying to access our application at the same time. 74 00:03:19,000 --> 00:03:23,160 What might we want to do about that? 75 00:03:23,160 --> 00:03:27,516 AUDIENCE: You could add more memory and resources to the actual server 76 00:03:27,516 --> 00:03:29,814 to make it a beefier kind of machine. 77 00:03:29,814 --> 00:03:30,730 BRIAN YU: Yeah, great. 78 00:03:30,730 --> 00:03:32,970 We could add more resources to the server we have. 79 00:03:32,970 --> 00:03:34,660 We can add more memory to the server. 80 00:03:34,660 --> 00:03:37,860 We can, in other words, try and make the server faster, increase 81 00:03:37,860 --> 00:03:40,169 the processing power of that server. 82 00:03:40,169 --> 00:03:42,460 And so this is something we might-- well, first of all, 83 00:03:42,460 --> 00:03:44,710 before we get there, I'll talk a little bit about benchmarking. 84 00:03:44,710 --> 00:03:46,543 So benchmarking is something you'll probably 85 00:03:46,543 --> 00:03:49,020 want to do first, this process of figuring out just how 86 00:03:49,020 --> 00:03:51,150 much your server can actually handle. 87 00:03:51,150 --> 00:03:54,180 Your server has a maximum capacity, but you might not 88 00:03:54,180 --> 00:03:57,180 know upfront just what that capacity is, just 89 00:03:57,180 --> 00:03:58,920 how many users your server can handle. 
90 00:03:58,920 --> 00:04:02,891 And it's probably not a good idea to go about waiting until that server hits 91 00:04:02,891 --> 00:04:05,640 the capacity, until you've reached the point where your server can 92 00:04:05,640 --> 00:04:08,223 no longer handle any more users, before you realize, oh, yeah, 93 00:04:08,223 --> 00:04:09,540 that's what the capacity is. 94 00:04:09,540 --> 00:04:11,456 And so benchmarking is something you'll likely 95 00:04:11,456 --> 00:04:15,210 want to do first in order to load test or stress test, as it's often called-- 96 00:04:15,210 --> 00:04:18,351 testing those servers in order to make sure you know what the limit is. 97 00:04:18,351 --> 00:04:20,100 And once you know that, then you can start 98 00:04:20,100 --> 00:04:22,812 to think about what to do if you were to ever exceed that limit. 99 00:04:22,812 --> 00:04:25,020 And so the idea that was brought up here is something 100 00:04:25,020 --> 00:04:26,561 that we might call vertical scaling-- 101 00:04:26,561 --> 00:04:30,060 this idea that if our server as it is now isn't good enough, 102 00:04:30,060 --> 00:04:31,920 isn't performant enough in order to handle 103 00:04:31,920 --> 00:04:36,120 all of the users that might be coming in order to use our web application, 104 00:04:36,120 --> 00:04:39,180 then what we might want to do is scale that server up and make it 105 00:04:39,180 --> 00:04:43,254 a larger server, for instance, that is able to have more processing capacity, 106 00:04:43,254 --> 00:04:45,420 that's able to operate faster, that has more memory, 107 00:04:45,420 --> 00:04:48,607 for instance, that can then allow it to handle that additional capacity. 108 00:04:48,607 --> 00:04:51,690 So you might imagine that if this is our server and this is its connection 109 00:04:51,690 --> 00:04:54,356 and we realize that more and more connections are going to start 110 00:04:54,356 --> 00:04:55,830 coming in, what do we need to do? 
111 00:04:55,830 --> 00:04:58,710 We can vertically scale that server, make it more performant 112 00:04:58,710 --> 00:05:01,680 by adding more memory, for instance, to that server in order to allow 113 00:05:01,680 --> 00:05:03,970 it to respond to that sort of thing. 114 00:05:03,970 --> 00:05:07,130 So what are the drawbacks or limitations of vertical scaling? 115 00:05:07,130 --> 00:05:08,880 Where might we go wrong with this process, 116 00:05:08,880 --> 00:05:12,150 or why is it not a perfect solution to all of our problems 117 00:05:12,150 --> 00:05:15,984 when it comes to scalability? 118 00:05:15,984 --> 00:05:20,269 AUDIENCE: Well, it's not, I mean, I guess, for lack of a better word, 119 00:05:20,269 --> 00:05:22,810 it's not very scalable because you just have this one machine 120 00:05:22,810 --> 00:05:24,770 that you're trying to make bigger and bigger. 121 00:05:24,770 --> 00:05:30,255 At some point, it's probably going to get really expensive or impossible. 122 00:05:30,255 --> 00:05:31,130 BRIAN YU: Yeah, sure. 123 00:05:31,130 --> 00:05:33,756 So this is maybe not as scalable as we would like. 124 00:05:33,756 --> 00:05:35,630 The idea might be that eventually we're going 125 00:05:35,630 --> 00:05:38,090 to hit a point where it's going to be impossible to just keep 126 00:05:38,090 --> 00:05:41,131 getting a bigger and bigger server because with a single server, wherever 127 00:05:41,131 --> 00:05:44,480 we're getting the server from, there's probably a maximum processing power 128 00:05:44,480 --> 00:05:46,130 they can put inside of a single server. 129 00:05:46,130 --> 00:05:49,280 And so we're eventually going to hit some sort of limit on vertical scaling 130 00:05:49,280 --> 00:05:51,770 where our servers can only get so powerful inside of just 131 00:05:51,770 --> 00:05:53,010 a single server. 132 00:05:53,010 --> 00:05:54,379 So what might we do then? 
133 00:05:54,379 --> 00:05:56,670 How can we-- if we still need to scale our application, 134 00:05:56,670 --> 00:05:58,640 still need to deal with more users that are all 135 00:05:58,640 --> 00:06:00,800 trying to access the application at the same time, 136 00:06:00,800 --> 00:06:03,560 and we can't just keep growing this one server, what 137 00:06:03,560 --> 00:06:05,757 do we do in that situation? 138 00:06:05,757 --> 00:06:07,130 AUDIENCE: Get another server. 139 00:06:07,130 --> 00:06:07,970 BRIAN YU: Get another server. 140 00:06:07,970 --> 00:06:08,520 Great. 141 00:06:08,520 --> 00:06:10,269 And so if this is called vertical scaling, 142 00:06:10,269 --> 00:06:13,610 this idea of taking our existing server and adding more processing 143 00:06:13,610 --> 00:06:16,730 power to it in order to make it more performant, then adding more servers 144 00:06:16,730 --> 00:06:18,440 we might call horizontal scaling. 145 00:06:18,440 --> 00:06:21,770 The idea there being that if we have a single server previously 146 00:06:21,770 --> 00:06:24,970 and now we want to be able to handle more load coming from more places, 147 00:06:24,970 --> 00:06:26,540 then instead of just having one server, maybe we 148 00:06:26,540 --> 00:06:29,180 think about splitting this up now into two different servers 149 00:06:29,180 --> 00:06:32,960 where each of the servers is able to handle users, able to process requests, 150 00:06:32,960 --> 00:06:38,100 and deal with users that are coming in in that sense as well. 151 00:06:38,100 --> 00:06:42,080 But what problems might arise now, now that we have two servers 152 00:06:42,080 --> 00:06:45,095 that we're trying to run our web application on? 153 00:06:45,095 --> 00:06:47,945 AUDIENCE: You still want one database, so if they're 154 00:06:47,945 --> 00:06:50,320 trying to write to the same database, something like that. 155 00:06:50,320 --> 00:06:51,932 There might be a race condition.
156 00:06:51,932 --> 00:06:54,890 BRIAN YU: Great. So one potential concern is what happens with the data. 157 00:06:54,890 --> 00:06:56,389 We have a database somewhere, right? 158 00:06:56,389 --> 00:06:59,120 We might be using a PostgreSQL database, like we did in project one, 159 00:06:59,120 --> 00:07:01,449 for instance, where both of these servers 160 00:07:01,449 --> 00:07:02,990 need to somehow access that database. 161 00:07:02,990 --> 00:07:05,516 And maybe they're accessing the database at the same time, 162 00:07:05,516 --> 00:07:07,140 and concerns might arise there as well. 163 00:07:07,140 --> 00:07:11,510 And we'll talk about how to deal with scaling our databases later on today as 164 00:07:11,510 --> 00:07:12,232 well. 165 00:07:12,232 --> 00:07:13,190 What else might happen? 166 00:07:13,190 --> 00:07:16,837 That's certainly something that might come up. 167 00:07:16,837 --> 00:07:18,920 What initial challenge might come up if a user now 168 00:07:18,920 --> 00:07:22,406 tries to access my web application? 169 00:07:22,406 --> 00:07:24,576 AUDIENCE: They don't know which server to go to. 170 00:07:24,576 --> 00:07:25,270 BRIAN YU: Great. 171 00:07:25,270 --> 00:07:27,650 The user doesn't really know which server to go to. 172 00:07:27,650 --> 00:07:30,020 We somehow need to have some way of figuring out 173 00:07:30,020 --> 00:07:32,720 if a user comes in, do we send them to this server over here 174 00:07:32,720 --> 00:07:34,230 or this server over there. 175 00:07:34,230 --> 00:07:36,025 So how do we address that problem? 176 00:07:36,025 --> 00:07:38,900 And so oftentimes this is addressed through another piece of hardware 177 00:07:38,900 --> 00:07:43,364 that sits in between the user and the server, which we call a load balancer. 178 00:07:43,364 --> 00:07:46,280 And the load balancer's job is effectively to solve that very problem, 179 00:07:46,280 --> 00:07:47,900 to wait for a user to come in.
180 00:07:47,900 --> 00:07:50,540 And the load balancer simply is going to try and detect, 181 00:07:50,540 --> 00:07:52,454 when the user comes in, what should happen. 182 00:07:52,454 --> 00:07:54,620 Should we send the user to this server, or should we 183 00:07:54,620 --> 00:07:56,030 send the user to that server? 184 00:07:56,030 --> 00:07:58,030 And load balancer needs to make those decisions. 185 00:07:58,030 --> 00:08:00,710 So when the user comes in, they send them to either one server 186 00:08:00,710 --> 00:08:02,741 or to another server. 187 00:08:02,741 --> 00:08:04,990 So how might a load balancer make those decisions now? 188 00:08:04,990 --> 00:08:06,830 So somehow the load balancer needs to decide. 189 00:08:06,830 --> 00:08:07,413 User comes in. 190 00:08:07,413 --> 00:08:10,190 Do we send them to server A or server B? 191 00:08:10,190 --> 00:08:14,000 What strategies or algorithms might a load balancer 192 00:08:14,000 --> 00:08:17,877 want to employ in order to determine which of the different servers 193 00:08:17,877 --> 00:08:18,710 to send the user to? 194 00:08:18,710 --> 00:08:20,876 Maybe it's going to be only two, as in this diagram. 195 00:08:20,876 --> 00:08:23,390 But maybe in the case of an even larger web application, 196 00:08:23,390 --> 00:08:24,930 we have scaled it up to more than two servers. 197 00:08:24,930 --> 00:08:27,110 There are three, four, five, or even more servers 198 00:08:27,110 --> 00:08:31,089 that the load balancer needs to decide which one should the user go to. 199 00:08:31,089 --> 00:08:32,630 And there are many potential answers. 200 00:08:32,630 --> 00:08:36,449 But what are some possibilities for what the load balancer could be doing here? 201 00:08:36,449 --> 00:08:38,270 AUDIENCE: I've heard of round robining. 202 00:08:38,270 --> 00:08:38,870 BRIAN YU: Round robining. 203 00:08:38,870 --> 00:08:39,230 Great. 
204 00:08:39,230 --> 00:08:41,960 So that if I have five different servers, we take the first user, 205 00:08:41,960 --> 00:08:42,830 send them to server one. 206 00:08:42,830 --> 00:08:44,780 Take the second user, send them to server two. 207 00:08:44,780 --> 00:08:46,640 Third user goes to server three, then four, then five. 208 00:08:46,640 --> 00:08:48,620 And when the next one comes in, we can send him back to one, 209 00:08:48,620 --> 00:08:51,860 just sort of alternating and circling between all of the possible servers 210 00:08:51,860 --> 00:08:52,720 that we have. 211 00:08:52,720 --> 00:08:53,960 It's certainly an option. 212 00:08:53,960 --> 00:08:57,510 Other choices that we might have? 213 00:08:57,510 --> 00:08:58,010 Yeah. 214 00:08:58,010 --> 00:09:00,010 AUDIENCE: Probably communicate with the server, see if it's busy. 215 00:09:00,010 --> 00:09:03,510 And then if it's busy, then don't send anything to it, something like that. 216 00:09:03,510 --> 00:09:05,218 BRIAN YU: Yeah, so we can try potentially 217 00:09:05,218 --> 00:09:06,930 communicating with the servers maybe. 218 00:09:06,930 --> 00:09:09,766 Server number one has a lot of users on it right now, 219 00:09:09,766 --> 00:09:11,640 and server number two doesn't have very many. 
220 00:09:11,640 --> 00:09:14,310 If we could somehow get the servers to tell us 221 00:09:14,310 --> 00:09:16,620 what the current load is, how many users are currently 222 00:09:16,620 --> 00:09:19,652 using either one of those servers, then maybe our load balancer can 223 00:09:19,652 --> 00:09:21,360 be intelligent about that and figure out, 224 00:09:21,360 --> 00:09:24,930 well, if server number two doesn't have very many users using it right now, 225 00:09:24,930 --> 00:09:28,116 then we may as well direct more traffic there in order to help to, 226 00:09:28,116 --> 00:09:29,990 as the load balancer name might imply, trying 227 00:09:29,990 --> 00:09:32,580 to balance out the load on each of these two servers 228 00:09:32,580 --> 00:09:36,510 so that no one is facing a lot more in terms of resource usage 229 00:09:36,510 --> 00:09:38,550 than the other server is. 230 00:09:38,550 --> 00:09:40,440 Other ideas that we might throw out there? 231 00:09:40,440 --> 00:09:43,390 232 00:09:43,390 --> 00:09:43,890 OK. 233 00:09:43,890 --> 00:09:45,780 So those are some of the basic strategies 234 00:09:45,780 --> 00:09:47,820 that might come into play when we think about load balancing. 235 00:09:47,820 --> 00:09:49,964 One very simple option might just be random choice 236 00:09:49,964 --> 00:09:52,630 where just, when the user comes in, you effectively flip a coin. 237 00:09:52,630 --> 00:09:54,500 If it's heads, send them to server A. If it's tails, 238 00:09:54,500 --> 00:09:57,583 send them to server B, where we just try to randomly and evenly distribute 239 00:09:57,583 --> 00:09:58,110 people. 240 00:09:58,110 --> 00:10:01,440 Round robin is certainly an option, where you circle amongst the servers 241 00:10:01,440 --> 00:10:02,400 that you do have. 
242 00:10:02,400 --> 00:10:04,650 And then you have this idea of fewest connections, 243 00:10:04,650 --> 00:10:08,610 where you check the servers and figure out which one has the least load 244 00:10:08,610 --> 00:10:11,790 and try to send the user that comes in to the server that has the least 245 00:10:11,790 --> 00:10:13,650 load at that particular time. 246 00:10:13,650 --> 00:10:16,560 And what might be some of the drawbacks or benefits 247 00:10:16,560 --> 00:10:18,270 of these compared to each other? 248 00:10:18,270 --> 00:10:20,100 If fewest connections seems to make sense, 249 00:10:20,100 --> 00:10:23,670 where if server A is less busy than server B, 250 00:10:23,670 --> 00:10:26,657 then it makes sense to send the user to server A, why might we-- 251 00:10:26,657 --> 00:10:28,740 what might be a drawback of that approach compared 252 00:10:28,740 --> 00:10:31,553 to a random choice or a round robin-like approach? 253 00:10:31,553 --> 00:10:34,178 What are the trade-offs that we face when making that decision? 254 00:10:34,178 --> 00:10:36,668 AUDIENCE: It can depend on what people are actually doing. 255 00:10:36,668 --> 00:10:39,158 So even though there may be few connections on one server, 256 00:10:39,158 --> 00:10:42,644 there may be seven people that are actually 257 00:10:42,644 --> 00:10:45,305 using a lot of the server's resources for something. 258 00:10:45,305 --> 00:10:45,930 BRIAN YU: Sure. 259 00:10:45,930 --> 00:10:48,600 The number of users that are using a particular server 260 00:10:48,600 --> 00:10:51,510 might not be a perfect proxy for how much load 261 00:10:51,510 --> 00:10:52,920 that server is actually facing. 
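The three strategies named so far (random choice, round robin, fewest connections) can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular load balancer is implemented: the `LoadBalancer` class, its method names, and the server names "A", "B", and "C" are all hypothetical, and the connection counts are tracked by hand rather than reported by real servers.

```python
import itertools
import random


class LoadBalancer:
    """Toy load balancer over a fixed list of hypothetical server names."""

    def __init__(self, servers):
        self.servers = list(servers)
        # Endless iterator that walks the servers in order, then wraps around.
        self._cycle = itertools.cycle(self.servers)
        # Open connections per server; assumed to be tracked by the balancer.
        self.connections = {name: 0 for name in self.servers}

    def random_choice(self):
        # Flip a (possibly many-sided) coin: every server is equally likely.
        return random.choice(self.servers)

    def round_robin(self):
        # First user to server one, second to server two, and so on in a circle.
        return next(self._cycle)

    def fewest_connections(self):
        # Pick the server with the fewest open connections right now.
        # Ties go to whichever server appears first in the list.
        return min(self.servers, key=lambda name: self.connections[name])

    def open_connection(self, name):
        self.connections[name] += 1

    def close_connection(self, name):
        self.connections[name] -= 1
```

Note that `fewest_connections` counts connections only, so it has exactly the weakness raised in the discussion: a server with few but expensive users still looks idle to it.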
262 00:10:52,920 --> 00:10:55,110 Because if there are a hundred users on server A 263 00:10:55,110 --> 00:10:57,450 but they're really just looking at a couple static pages 264 00:10:57,450 --> 00:11:00,150 and aren't doing anything very computationally intensive, 265 00:11:00,150 --> 00:11:02,339 but people on server B, there are fewer of them 266 00:11:02,339 --> 00:11:04,380 but they're really doing more work, then maybe we 267 00:11:04,380 --> 00:11:07,120 would prefer to send someone to server A instead, for instance. 268 00:11:07,120 --> 00:11:09,720 So number of users or number of connections 269 00:11:09,720 --> 00:11:13,920 might not be the perfect way of measuring how much activity is going on 270 00:11:13,920 --> 00:11:14,710 in the servers. 271 00:11:14,710 --> 00:11:17,520 And you can imagine that we might try and make our load balancing 272 00:11:17,520 --> 00:11:20,780 algorithms more sophisticated or more complex by trying to figure out, well, 273 00:11:20,780 --> 00:11:22,830 really just how much is the load on each of these 274 00:11:22,830 --> 00:11:25,120 and figure out what would really make more sense. 275 00:11:25,120 --> 00:11:27,990 But then what sorts of issues start to come up there? 276 00:11:27,990 --> 00:11:31,350 What's the trade-off that we face there? 277 00:11:31,350 --> 00:11:31,850 Yeah. 278 00:11:31,850 --> 00:11:35,944 AUDIENCE: Well, now load balancing's going to become expensive [INAUDIBLE]. 279 00:11:35,944 --> 00:11:36,610 BRIAN YU: Great. 280 00:11:36,610 --> 00:11:38,740 Now load balancing starts to become more expensive.
281 00:11:38,740 --> 00:11:41,698 But if we want the user to be able to get a fast response from server A 282 00:11:41,698 --> 00:11:44,270 or server B, we've now introduced this intermediary piece 283 00:11:44,270 --> 00:11:46,190 of hardware, this load balancer, that's going 284 00:11:46,190 --> 00:11:49,315 to have to spend time calculating and processing which of these two servers 285 00:11:49,315 --> 00:11:52,119 is actually going to be the better server to send the user to. 286 00:11:52,119 --> 00:11:53,660 And it's going to take time, latency. 287 00:11:53,660 --> 00:11:56,720 It's going to take some computational power in order to figure out 288 00:11:56,720 --> 00:11:58,249 where to ultimately send that user. 289 00:11:58,249 --> 00:12:00,290 And so there's definitely that trade-off as well, 290 00:12:00,290 --> 00:12:03,290 whereas in a random choice, a round robin type model, 291 00:12:03,290 --> 00:12:05,480 we can save a lot of that computational energy 292 00:12:05,480 --> 00:12:07,610 by not worrying about which of these servers 293 00:12:07,610 --> 00:12:09,800 is more busy or less busy at any given time 294 00:12:09,800 --> 00:12:12,470 and just send the user to a particular server 295 00:12:12,470 --> 00:12:14,810 without needing to do those sorts of computation. 296 00:12:14,810 --> 00:12:18,890 And so in practice, there's no one best solution to these problems. 297 00:12:18,890 --> 00:12:21,790 But it's good to be thinking about different ways in which your load 298 00:12:21,790 --> 00:12:24,650 balancer might be operating in order to think about what algorithm 299 00:12:24,650 --> 00:12:28,369 you might want to use depending on the specific needs of your web application. 
300 00:12:28,369 --> 00:12:30,410 But in general, when we deal with load balancing, 301 00:12:30,410 --> 00:12:34,190 if we think of this idea of user tries to access your website, 302 00:12:34,190 --> 00:12:37,220 with every request, that requests first goes to the load balancer 303 00:12:37,220 --> 00:12:38,930 before it goes to the web server. 304 00:12:38,930 --> 00:12:41,420 And at the load balancer stage, the load balancer 305 00:12:41,420 --> 00:12:44,750 makes a decision about send the user to server A 306 00:12:44,750 --> 00:12:47,420 or send the user to server B. What problems 307 00:12:47,420 --> 00:12:48,831 might occur with just that model? 308 00:12:48,831 --> 00:12:51,830 Even if you don't worry about which specific algorithm the load balancer 309 00:12:51,830 --> 00:12:55,790 is using to determine where to send the user each time, what could go wrong? 310 00:12:55,790 --> 00:13:00,640 311 00:13:00,640 --> 00:13:03,615 AUDIENCE: Some users might be doing more than others on a server. 312 00:13:03,615 --> 00:13:04,240 BRIAN YU: Sure. 313 00:13:04,240 --> 00:13:07,073 So certainly some users might be doing more than others on a server. 314 00:13:07,073 --> 00:13:10,940 And in particular, when we think about what users are doing on a server, 315 00:13:10,940 --> 00:13:16,400 the user is oftentimes not just going to one page and letting it be at that. 316 00:13:16,400 --> 00:13:18,920 A user might be trying to access a page more than one time 317 00:13:18,920 --> 00:13:22,380 or going to multiple different pages on the same web application, for instance. 
318 00:13:22,380 --> 00:13:27,404 You might imagine on an e-commerce site like eBay or Amazon, for instance, 319 00:13:27,404 --> 00:13:29,570 a user might be adding things to their shopping cart 320 00:13:29,570 --> 00:13:32,060 and looking at other pages and adding new things to their shopping cart 321 00:13:32,060 --> 00:13:34,518 and interacting with a web page in multiple different ways, 322 00:13:34,518 --> 00:13:35,870 making multiple requests. 323 00:13:35,870 --> 00:13:39,132 And what could go wrong now is every time a user makes a request, 324 00:13:39,132 --> 00:13:41,840 the load balancer is making a new decision about send to server A 325 00:13:41,840 --> 00:13:43,088 or send to server B. 326 00:13:43,088 --> 00:13:44,796 AUDIENCE: Yeah, that would be really bad. 327 00:13:44,796 --> 00:13:48,787 So the load balancer would have to have some kind of session awareness, 328 00:13:48,787 --> 00:13:49,287 I guess. 329 00:13:49,287 --> 00:13:49,787 Right? 330 00:13:49,787 --> 00:13:53,854 So it sends somebody to one server and it just keeps sending that same person there. 331 00:13:53,854 --> 00:13:54,520 BRIAN YU: Right. 332 00:13:54,520 --> 00:13:57,478 So a problem might occur where without-- with just some basic algorithm 333 00:13:57,478 --> 00:13:59,719 like this where every request we make a decision, 334 00:13:59,719 --> 00:14:01,510 we don't have any sort of session awareness 335 00:14:01,510 --> 00:14:05,060 that if a user comes into the web application and is sent to server A, 336 00:14:05,060 --> 00:14:08,440 and we now store the contents of their shopping cart on server A. 337 00:14:08,440 --> 00:14:10,030 And the user clicks on another page.
338 00:14:10,030 --> 00:14:12,280 And then the load balancer this time-- either because 339 00:14:12,280 --> 00:14:13,900 of a random choice, a round robin, or because 340 00:14:13,900 --> 00:14:15,850 a different server now has the fewest connections-- 341 00:14:15,850 --> 00:14:18,430 decides to send that user to server B instead. 342 00:14:18,430 --> 00:14:21,130 That new server might not have-- doesn't have the same session 343 00:14:21,130 --> 00:14:22,900 data that this original server did. 344 00:14:22,900 --> 00:14:26,140 And so maybe now the user's shopping cart's totally empty, for instance. 345 00:14:26,140 --> 00:14:29,710 And so by introducing this attempted benefit 346 00:14:29,710 --> 00:14:33,320 of splitting the server into two parts, horizontally scaling into a server A 347 00:14:33,320 --> 00:14:35,680 and server B, we now need to worry about when 348 00:14:35,680 --> 00:14:38,560 the user comes to server A the first time, what should happen 349 00:14:38,560 --> 00:14:40,060 when they come back the second time. 350 00:14:40,060 --> 00:14:44,080 Maybe we do want the user to go back to server A again. 351 00:14:44,080 --> 00:14:47,840 So this brings in the idea of session-aware load 352 00:14:47,840 --> 00:14:49,900 balancing-- this idea that when we load balance, 353 00:14:49,900 --> 00:14:52,360 it's often going to be a good idea to make sure 354 00:14:52,360 --> 00:14:55,060 that our load-balancing algorithm is somehow session-aware, 355 00:14:55,060 --> 00:14:58,150 that it knows that when a user comes back to the site, 356 00:14:58,150 --> 00:15:00,722 that they should be directed potentially to the same server.
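One way to picture session-aware routing is a balancer that remembers which server each session was first sent to and keeps returning that server afterwards. The sketch below is a simplification under stated assumptions: the `StickyBalancer` class and the session IDs are hypothetical, new sessions are placed by round robin purely for illustration, and the mapping lives in memory, so it would be lost if the balancer restarted.

```python
import itertools


class StickyBalancer:
    """Toy session-aware balancer: a session keeps going to its first server."""

    def __init__(self, servers):
        # Round robin is used only to place sessions that have not been seen yet.
        self._cycle = itertools.cycle(list(servers))
        # Remembered placement: session ID -> server name.
        self.assignments = {}

    def route(self, session_id):
        # A brand-new session gets the next server in the rotation.
        if session_id not in self.assignments:
            self.assignments[session_id] = next(self._cycle)
        # A returning session always goes back to where its data lives.
        return self.assignments[session_id]


balancer = StickyBalancer(["A", "B"])
first = balancer.route("session-1")   # placed on a server
again = balancer.route("session-1")   # same server as before
```

The trade-off is that load is no longer rebalanced per request: a server that happens to collect many busy sessions stays busy.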
357 00:15:00,722 --> 00:15:02,680 And that's this first idea of sticky sessions-- 358 00:15:02,680 --> 00:15:05,710 that if a user comes to the web application the first time 359 00:15:05,710 --> 00:15:10,210 and is directed to server A, then when the user comes back for a second time, 360 00:15:10,210 --> 00:15:13,060 even if random choice chose server B or even 361 00:15:13,060 --> 00:15:15,150 if based on looking at number of connections 362 00:15:15,150 --> 00:15:18,580 server B is less loaded than server A, and we would normally send the user 363 00:15:18,580 --> 00:15:22,330 to server B, we still want to send that user back to server A 364 00:15:22,330 --> 00:15:24,720 because that's the server that they were on previously. 365 00:15:24,720 --> 00:15:26,980 That's where all their session information is. 366 00:15:26,980 --> 00:15:30,577 And so if we want to make sure that the contents of the user's shopping cart 367 00:15:30,577 --> 00:15:33,160 are preserved, for instance, then we'd want to continually send 368 00:15:33,160 --> 00:15:35,710 the user back to server A each time. 369 00:15:35,710 --> 00:15:37,690 So that's the idea of sticky sessions. 370 00:15:37,690 --> 00:15:41,380 How else might we deal with the problem of session-aware load balancing? 371 00:15:41,380 --> 00:15:43,630 Maybe some of these additional bullets can give you 372 00:15:43,630 --> 00:15:45,296 ideas as to how we might deal with that. 373 00:15:45,296 --> 00:15:50,610 374 00:15:50,610 --> 00:15:53,760 So another possibility here is to store the sessions, actually, in the database. 375 00:15:53,760 --> 00:15:56,720 So it's possible that if right now we're just storing the session 376 00:15:56,720 --> 00:15:59,450 information on the server, then when we split things up 377 00:15:59,450 --> 00:16:01,720 into two different servers, server A and server B, 378 00:16:01,720 --> 00:16:05,930 then any session information on server A isn't accessible on server B.
379 00:16:05,930 --> 00:16:08,180 And so one possibility is store session information 380 00:16:08,180 --> 00:16:10,670 inside of a database, a database that potentially 381 00:16:10,670 --> 00:16:15,020 all of the servers, both server A and server B, both have access to. 382 00:16:15,020 --> 00:16:19,550 And if you do an approach like that where we store information 383 00:16:19,550 --> 00:16:23,660 about our sessions inside of a database, rather than just storing them inside 384 00:16:23,660 --> 00:16:26,030 of server A or server B, then the benefit 385 00:16:26,030 --> 00:16:28,580 there is that no matter which server the user is sent to, 386 00:16:28,580 --> 00:16:30,140 as long as we have a way of taking that user 387 00:16:30,140 --> 00:16:32,570 and identifying which session information in the database 388 00:16:32,570 --> 00:16:35,660 actually belongs to them, then we can extract that session information 389 00:16:35,660 --> 00:16:39,215 out of the database regardless of which server of the user went to. 390 00:16:39,215 --> 00:16:41,090 So what would be a drawback of that approach? 391 00:16:41,090 --> 00:16:44,947 Why might we not want to store session information in the database? 392 00:16:44,947 --> 00:16:49,817 393 00:16:49,817 --> 00:16:51,877 AUDIENCE: Then have to scale your database too. 394 00:16:51,877 --> 00:16:52,710 BRIAN YU: Certainly. 395 00:16:52,710 --> 00:16:54,840 Then we start to get into issues of database scalability, 396 00:16:54,840 --> 00:16:56,970 and we'll talk about database availability too. 
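As a rough sketch of the shared-database approach just described, the example below has two "servers" read and write sessions through one database connection. The `SessionStore` class, the `sessions` table layout, and the session ID `"abc123"` are assumptions made for illustration, and an in-memory SQLite database stands in for a real shared database server.

```python
import json
import sqlite3


class SessionStore:
    """Session storage backed by a database that every web server can reach."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions "
            "(session_id TEXT PRIMARY KEY, data TEXT)"
        )

    def save(self, session_id, data):
        # Store the session as JSON, replacing any earlier version.
        self.db.execute(
            "INSERT OR REPLACE INTO sessions (session_id, data) VALUES (?, ?)",
            (session_id, json.dumps(data)),
        )
        self.db.commit()

    def load(self, session_id):
        row = self.db.execute(
            "SELECT data FROM sessions WHERE session_id = ?", (session_id,)
        ).fetchone()
        # An unknown session ID yields an empty session.
        return json.loads(row[0]) if row else {}


shared_db = sqlite3.connect(":memory:")  # stand-in for the shared database
server_a = SessionStore(shared_db)
server_b = SessionStore(shared_db)

server_a.save("abc123", {"cart": ["book"]})
# Server B can read what server A wrote, so either server can handle the user.
assert server_b.load("abc123") == {"cart": ["book"]}
```

Every `load` and `save` is now a round trip to the database, which is where the extra latency and the database scalability concerns come from.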
397 00:16:56,970 --> 00:16:58,080 And there's also other-- 398 00:16:58,080 --> 00:17:02,114 any time we're introducing additional hardware, additional servers that 399 00:17:02,114 --> 00:17:04,530 are in play when we're dealing with issues of scalability, 400 00:17:04,530 --> 00:17:07,857 then we start to incur time costs, that if originally the session was stored 401 00:17:07,857 --> 00:17:09,690 on the server and now it's stored elsewhere, 402 00:17:09,690 --> 00:17:11,869 now there's still this communication time that needs to happen, 403 00:17:11,869 --> 00:17:14,010 this additional latency that gets added any time 404 00:17:14,010 --> 00:17:15,849 we're trying to access information. 405 00:17:15,849 --> 00:17:18,839 And, finally, you might imagine that we could store the session not 406 00:17:18,839 --> 00:17:22,260 in our web server at all and rather use client-side sessions, storing 407 00:17:22,260 --> 00:17:25,230 information, any information related to the session, actually 408 00:17:25,230 --> 00:17:26,339 inside the client. 409 00:17:26,339 --> 00:17:30,330 And oftentimes this is done through cookies where web servers can just 410 00:17:30,330 --> 00:17:34,770 take cookies and send them to the user where inside of the cookie 411 00:17:34,770 --> 00:17:37,260 stores all of the information related to the session 412 00:17:37,260 --> 00:17:40,590 so that you on your computer are actually storing inside 413 00:17:40,590 --> 00:17:43,767 of your computer all of the information about what's in your shopping cart. 414 00:17:43,767 --> 00:17:46,350 And that cookie is sent along with every web request you make. 415 00:17:46,350 --> 00:17:48,141 So if you make another web request, doesn't 416 00:17:48,141 --> 00:17:50,010 matter which server you're sent to. 
417 00:17:50,010 --> 00:17:51,990 Your web request inside of the cookie contains 418 00:17:51,990 --> 00:17:54,060 all of the information that is associated 419 00:17:54,060 --> 00:17:55,800 with that particular session. 420 00:17:55,800 --> 00:18:00,110 And what might be a drawback there? 421 00:18:00,110 --> 00:18:02,550 AUDIENCE: Can that be some sort of attack on the server 422 00:18:02,550 --> 00:18:07,029 where multiple people start using the same cookie to overload the server? 423 00:18:07,029 --> 00:18:08,070 BRIAN YU: Great question. 424 00:18:08,070 --> 00:18:10,920 So there's potentially adversarial ways that this could be used, 425 00:18:10,920 --> 00:18:13,120 that if someone else is sending the same cookie, 426 00:18:13,120 --> 00:18:15,717 then the server might still just accept it and assume 427 00:18:15,717 --> 00:18:16,800 that it's the same person. 428 00:18:16,800 --> 00:18:18,540 So it might come in from different directions. 429 00:18:18,540 --> 00:18:20,500 And, certainly, trying to overload a server 430 00:18:20,500 --> 00:18:23,010 is something we'll talk about when we get to the next topic, which 431 00:18:23,010 --> 00:18:26,260 is all about security and how to think about security in our web applications. 432 00:18:26,260 --> 00:18:28,269 As we begin to scale them larger and larger, 433 00:18:28,269 --> 00:18:30,810 these security issues start to become more and more pressing. 434 00:18:30,810 --> 00:18:33,190 So those are definitely issues to be aware of as well.
435 00:18:33,190 --> 00:18:36,030 So lots of different ways, ultimately, to deal 436 00:18:36,030 --> 00:18:39,660 with these problems of making sure that our load balancer is session-aware, 437 00:18:39,660 --> 00:18:42,600 making sure that when the user comes back that they're consistently 438 00:18:42,600 --> 00:18:44,850 directed either to one place or another or at least 439 00:18:44,850 --> 00:18:47,760 have some mechanism in place for making sure 440 00:18:47,760 --> 00:18:50,807 that any session information-- the contents of the shopping cart or any 441 00:18:50,807 --> 00:18:52,890 notes that they've written in an application-- get 442 00:18:52,890 --> 00:18:56,670 saved when they go to another page, when they make another HTTP 443 00:18:56,670 --> 00:18:59,490 request to the web server. 444 00:18:59,490 --> 00:19:02,039 Questions on anything so far about load balancing 445 00:19:02,039 --> 00:19:03,330 or how we might do any of this? 446 00:19:03,330 --> 00:19:06,220 447 00:19:06,220 --> 00:19:06,720 OK. 448 00:19:06,720 --> 00:19:10,800 So what drawbacks might come about with regards to just horizontal scaling? 449 00:19:10,800 --> 00:19:13,590 That we say, all right, we expect that our web application 450 00:19:13,590 --> 00:19:15,750 will need five web servers, for instance, in order 451 00:19:15,750 --> 00:19:17,650 to deal with traffic on a typical day. 452 00:19:17,650 --> 00:19:21,240 And so now we load balance using some of these session-aware tools, 453 00:19:21,240 --> 00:19:24,840 of deciding between any of these five potential servers 454 00:19:24,840 --> 00:19:26,091 that we need to send users to. 455 00:19:26,091 --> 00:19:27,090 And how might that work? 456 00:19:27,090 --> 00:19:28,980 What problems could come up with that model?
457 00:19:28,980 --> 00:19:42,050 458 00:19:42,050 --> 00:19:46,180 So one thing that I'll talk about briefly that sort of gets at this idea 459 00:19:46,180 --> 00:19:49,177 is the idea that when we define a finite number of servers-- and, say, 460 00:19:49,177 --> 00:19:52,510 there are going to be five servers here, and when a request comes in, it's going 461 00:19:52,510 --> 00:19:54,340 to go to one of those five servers-- 462 00:19:54,340 --> 00:19:57,307 well, you never really know what might happen the next day. 463 00:19:57,307 --> 00:19:59,140 The five servers might be the typical amount 464 00:19:59,140 --> 00:20:01,420 that you would need in order to deal with all of the users that 465 00:20:01,420 --> 00:20:02,830 might come in on a given day. 466 00:20:02,830 --> 00:20:05,950 But you might imagine that some web applications probably 467 00:20:05,950 --> 00:20:08,927 get more traffic at some times of the day than other times of the day 468 00:20:08,927 --> 00:20:12,010 or even some periods of the year as compared to other periods of the year. 469 00:20:12,010 --> 00:20:15,130 You, for instance, might imagine that a shopping website like Amazon 470 00:20:15,130 --> 00:20:17,200 or other online shopping web sites perhaps 471 00:20:17,200 --> 00:20:19,591 get more traffic when it comes to the holiday season, 472 00:20:19,591 --> 00:20:22,090 for instance, than when it comes to other times of the year. 473 00:20:22,090 --> 00:20:26,000 Or you might imagine that a newspaper website, for instance, 474 00:20:26,000 --> 00:20:29,950 after a big presidential election or breaking news event, 475 00:20:29,950 --> 00:20:33,040 maybe a lot more people are accessing that newspaper website as opposed 476 00:20:33,040 --> 00:20:36,206 to during other times of the year when fewer people are looking at the news. 
477 00:20:36,206 --> 00:20:40,546 And so the amount of traffic that comes into a web application 478 00:20:40,546 --> 00:20:43,670 could vary depending on the time of day, depending on the time of the year, 479 00:20:43,670 --> 00:20:46,640 depending on random events that happen from time to time. 480 00:20:46,640 --> 00:20:48,950 And so how might a web application deal with that? 481 00:20:48,950 --> 00:20:51,070 Of course, just a finite number of servers 482 00:20:51,070 --> 00:20:54,295 might not be the best solution because potentially if you underestimate 483 00:20:54,295 --> 00:20:56,170 the amount of maximum traffic you might need, 484 00:20:56,170 --> 00:20:59,380 then you might get more users than your servers are able to handle. 485 00:20:59,380 --> 00:21:02,365 And on the flip side, if you just err on the side of too many servers 486 00:21:02,365 --> 00:21:04,990 and just have a lot of servers expecting that in the worst case 487 00:21:04,990 --> 00:21:07,840 you might use all of them, then there is some waste of resources 488 00:21:07,840 --> 00:21:10,480 here, that you're paying, likely, for all of these different servers that 489 00:21:10,480 --> 00:21:13,270 are running when in reality you probably don't need that many. 490 00:21:13,270 --> 00:21:17,740 And so autoscaling is a tool that many cloud computing services now 491 00:21:17,740 --> 00:21:21,460 offer in order to make it such that the number of servers that you're actually 492 00:21:21,460 --> 00:21:26,200 using can scale depending upon traffic, that if more and more traffic comes in, 493 00:21:26,200 --> 00:21:29,320 we can scale up the horizontal scaling of your web application 494 00:21:29,320 --> 00:21:33,430 in order to allow for more different, more web 495 00:21:33,430 --> 00:21:35,110 servers to be added in those times. 
496 00:21:35,110 --> 00:21:37,300 So we might start with only two web servers, 497 00:21:37,300 --> 00:21:41,042 but if another web server were to come along, we can add that web server. 498 00:21:41,042 --> 00:21:43,750 And the load balancer knows that now there are three web servers. 499 00:21:43,750 --> 00:21:45,640 And if traffic increases even more, we can 500 00:21:45,640 --> 00:21:48,010 continue to scale our web application. 501 00:21:48,010 --> 00:21:51,250 And most cloud computing services, like Amazon Web Services, 502 00:21:51,250 --> 00:21:54,880 that offer these load balancing services and autoscaling services, 503 00:21:54,880 --> 00:21:58,010 can allow you to specify here's the minimum number of servers that I want 504 00:21:58,010 --> 00:21:59,380 and here's the maximum number of servers 505 00:21:59,380 --> 00:22:02,560 that I want and allow the load balancer to then just make those decisions 506 00:22:02,560 --> 00:22:04,690 about do we need to add another server or not. 507 00:22:04,690 --> 00:22:09,120 And you can add criteria for once we reach 508 00:22:09,120 --> 00:22:10,870 a certain threshold of the number of users 509 00:22:10,870 --> 00:22:13,411 that are trying to access the site, then that might be a good time 510 00:22:13,411 --> 00:22:14,390 to increase the scale. 511 00:22:14,390 --> 00:22:19,270 And if after a period of time of high usage and utility of your website 512 00:22:19,270 --> 00:22:22,180 traffic begins to die down and you don't need four servers anymore, 513 00:22:22,180 --> 00:22:23,200 it can scale back down. 514 00:22:23,200 --> 00:22:25,450 It can add new servers when it's needed in order 515 00:22:25,450 --> 00:22:28,780 to adjust based on the demand, based on the number of users that 516 00:22:28,780 --> 00:22:30,760 are trying to use your web application.
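The scaling rule described above, with a minimum, a maximum, and a traffic threshold, can be sketched as a small function. This is not how any particular cloud provider implements autoscaling; the function name `desired_server_count` and the `users_per_server` capacity figure are assumptions chosen for illustration.

```python
import math


def desired_server_count(active_users, users_per_server, min_servers, max_servers):
    """Return how many servers to run for the current traffic level."""
    # Servers needed if each one can comfortably handle `users_per_server` users.
    needed = math.ceil(active_users / users_per_server)
    # Never go below the configured minimum or above the configured maximum.
    return max(min_servers, min(max_servers, needed))


# Hypothetical configuration: 100 users per server, between 2 and 5 servers.
quiet_day = desired_server_count(150, 100, min_servers=2, max_servers=5)
busy_day = desired_server_count(350, 100, min_servers=2, max_servers=5)
assert quiet_day == 2 and busy_day == 4
```

The maximum acts as a cost ceiling: a spike beyond it still overloads the servers, so picking that bound is another trade-off between cost and capacity.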
517 00:22:30,760 --> 00:22:33,550 Your web application can make those decisions about 518 00:22:33,550 --> 00:22:36,220 whether or not we need to increase the number of servers 519 00:22:36,220 --> 00:22:38,530 or decrease the number of servers. 520 00:22:38,530 --> 00:22:42,450 Questions about any of that so far? 521 00:22:42,450 --> 00:22:43,730 Yes. 522 00:22:43,730 --> 00:22:47,570 AUDIENCE: If you use AWS, do they take care of the load balancer? 523 00:22:47,570 --> 00:22:48,440 Do they provide it? 524 00:22:48,440 --> 00:22:51,579 Or is that something that you [INAUDIBLE]? 525 00:22:51,579 --> 00:22:52,620 BRIAN YU: Great question. 526 00:22:52,620 --> 00:22:55,070 So the question is about how AWS actually does this. 527 00:22:55,070 --> 00:22:58,070 So AWS offers a number of different services. 528 00:22:58,070 --> 00:23:01,070 Amazon Web Services is just one of the more popular cloud computing 529 00:23:01,070 --> 00:23:04,870 services used in order to run servers like these on the internet. 530 00:23:04,870 --> 00:23:07,800 And we'll talk a little bit about that in just a moment, actually. 531 00:23:07,800 --> 00:23:11,690 But one of those services is a service that effectively 532 00:23:11,690 --> 00:23:16,137 will allow you to define this autoscaling group for yourself 533 00:23:16,137 --> 00:23:19,220 in order to say, here are the minimum and maximum numbers of servers. 534 00:23:19,220 --> 00:23:22,520 And Amazon takes care of the process of having a load balancer decide 535 00:23:22,520 --> 00:23:26,070 where to send different users and when to add new servers and when not to. 536 00:23:26,070 --> 00:23:28,880 And other cloud computing providers like Microsoft Azure, 537 00:23:28,880 --> 00:23:31,520 for instance, they all have very similar tools and technologies 538 00:23:31,520 --> 00:23:33,436 that allow you to implement this sort of thing 539 00:23:33,436 --> 00:23:35,820 without you needing to really worry about that.
540 00:23:35,820 --> 00:23:38,900 And that's all part of this new big movement towards cloud computing, 541 00:23:38,900 --> 00:23:41,142 that in the past, when writing a web application 542 00:23:41,142 --> 00:23:43,850 and deploying it and running a web application for your business, 543 00:23:43,850 --> 00:23:46,599 for instance, you might have needed to own the servers yourselves, 544 00:23:46,599 --> 00:23:50,150 physically have the servers inside of your company. 545 00:23:50,150 --> 00:23:52,610 Nowadays, with cloud computing, this is effectively just 546 00:23:52,610 --> 00:23:55,904 a means to allow you to rent computing power stored in the cloud, 547 00:23:55,904 --> 00:23:58,070 stored in someone else's servers, whether it belongs 548 00:23:58,070 --> 00:24:00,890 to Microsoft or Amazon or someone else, and therefore 549 00:24:00,890 --> 00:24:02,420 allow you to use those resources. 550 00:24:02,420 --> 00:24:06,730 And so what might be a benefit of this idea of cloud computing, 551 00:24:06,730 --> 00:24:10,550 of using resources from elsewhere instead of needing to use servers that 552 00:24:10,550 --> 00:24:14,765 are local to wherever you're working? 553 00:24:14,765 --> 00:24:16,965 AUDIENCE: If you're a small shop, then you 554 00:24:16,965 --> 00:24:20,764 don't need to worry about maintaining servers, having IT people. 555 00:24:20,764 --> 00:24:21,430 BRIAN YU: Great. 556 00:24:21,430 --> 00:24:25,420 So from a practical perspective, it's you need-- normally, we need IT people. 557 00:24:25,420 --> 00:24:28,570 You'd have to maintain your own servers, whereas with the cloud system, 558 00:24:28,570 --> 00:24:31,150 it's typically just a rental based on the number of hours 559 00:24:31,150 --> 00:24:32,380 of usage of the server. 
560 00:24:32,380 --> 00:24:34,720 And Amazon or Microsoft or whoever takes care 561 00:24:34,720 --> 00:24:36,610 of making sure that the servers are running, 562 00:24:36,610 --> 00:24:39,776 of maintaining the servers, of dealing with any problems that might come up. 563 00:24:39,776 --> 00:24:41,770 And so, certainly, there are practical benefits 564 00:24:41,770 --> 00:24:43,720 that make it logistically more feasible now 565 00:24:43,720 --> 00:24:46,870 to use cloud computing as the means of running a web application 566 00:24:46,870 --> 00:24:50,360 rather than having to physically own your own server. 567 00:24:50,360 --> 00:24:53,080 So now we have this system in place where we're 568 00:24:53,080 --> 00:24:54,610 trying to scale our web application. 569 00:24:54,610 --> 00:24:57,443 We've talked about vertical scaling where we just add more computing 570 00:24:57,443 --> 00:24:58,840 power to one particular server. 571 00:24:58,840 --> 00:25:02,080 And then we spent some time talking about horizontal scaling and the issues 572 00:25:02,080 --> 00:25:06,656 that come into play when suddenly, instead of just having a single server, 573 00:25:06,656 --> 00:25:09,280 we've had to split things up across multiple different servers. 574 00:25:09,280 --> 00:25:12,400 And that added challenges of what do we now do about sessions. 575 00:25:12,400 --> 00:25:15,730 It added challenges of now we need this additional piece of hardware, this load 576 00:25:15,730 --> 00:25:19,300 balancer here, which is making decisions on a frequent basis of where 577 00:25:19,300 --> 00:25:22,180 to send user one, two, three, or four, and then 578 00:25:22,180 --> 00:25:25,510 dealing with how to scale those servers, as we have to potentially 579 00:25:25,510 --> 00:25:28,990 increase or decrease the number of servers that we have depending on load. 580 00:25:28,990 --> 00:25:34,630 What happens now if we have four servers and one of the servers goes offline? 
581 00:25:34,630 --> 00:25:36,310 It just stops working. 582 00:25:36,310 --> 00:25:37,557 What could go wrong now? 583 00:25:37,557 --> 00:25:38,890 What might the load balancer do? 584 00:25:38,890 --> 00:25:41,750 585 00:25:41,750 --> 00:25:42,250 Yeah. 586 00:25:42,250 --> 00:25:44,359 AUDIENCE: Getting your server's replacement. 587 00:25:44,359 --> 00:25:46,150 BRIAN YU: So, certainly, the end goal would 588 00:25:46,150 --> 00:25:48,882 be to get this fixed, to try to repair the server, reboot, 589 00:25:48,882 --> 00:25:51,840 restart it, get it back online, or replace the server if really there's 590 00:25:51,840 --> 00:25:53,750 something physically wrong with the server. 591 00:25:53,750 --> 00:25:57,784 But in the meantime, what could go wrong? 592 00:25:57,784 --> 00:26:01,672 AUDIENCE: If a user had an active session with that server, 593 00:26:01,672 --> 00:26:04,184 then they might lose data or something like that. 594 00:26:04,184 --> 00:26:04,850 BRIAN YU: Great. 595 00:26:04,850 --> 00:26:06,620 So one potential problem is that if there 596 00:26:06,620 --> 00:26:09,290 were session data that was stored only on this one server, 597 00:26:09,290 --> 00:26:12,590 now if a user comes back, that session data is 598 00:26:12,590 --> 00:26:14,100 no longer accessible potentially. 599 00:26:14,100 --> 00:26:15,950 And so we talked about possible solutions, 600 00:26:15,950 --> 00:26:18,200 and might deal with that either by storing the session 601 00:26:18,200 --> 00:26:21,116 inside of the client, so it doesn't matter what server they end up at, 602 00:26:21,116 --> 00:26:23,570 or storing the session data inside of a database 603 00:26:23,570 --> 00:26:25,460 somewhere such that it doesn't matter, again, 604 00:26:25,460 --> 00:26:26,835 which server the user is sent to. 605 00:26:26,835 --> 00:26:29,460 They can still retain that information. 
606 00:26:29,460 --> 00:26:31,730 But if we have this idea of sticky sessions in 607 00:26:31,730 --> 00:26:36,050 place where a user goes to the load balancer, and if they were in session-- 608 00:26:36,050 --> 00:26:38,840 if they were in server A before, they get sent back to server A. 609 00:26:38,840 --> 00:26:40,965 If they were in server B before, they get sent back 610 00:26:40,965 --> 00:26:44,320 to server B. What could happen is that a user comes along, 611 00:26:44,320 --> 00:26:46,760 hits the load balancer, and the load balancer asks 612 00:26:46,760 --> 00:26:49,490 what server they were at last time, and the user was at server B. 613 00:26:49,490 --> 00:26:53,300 And they try to get sent back to this server, but the server is now offline. 614 00:26:53,300 --> 00:26:56,610 So somehow we need a way for our load balancer 615 00:26:56,610 --> 00:26:59,510 to know whether or not these servers are operational or not, 616 00:26:59,510 --> 00:27:03,110 whether or not it makes sense to send the user to one of the servers. 617 00:27:03,110 --> 00:27:05,250 So how might we do that? 618 00:27:05,250 --> 00:27:08,120 How might we solve this problem of we need our load 619 00:27:08,120 --> 00:27:10,180 balancer to know which of the servers are 620 00:27:10,180 --> 00:27:11,900 online so it knows where to send users? 621 00:27:11,900 --> 00:27:13,650 And we definitely don't want to be sending 622 00:27:13,650 --> 00:27:18,645 the user to a server that's no longer running or no longer operational. 623 00:27:18,645 --> 00:27:19,144 Yeah. 624 00:27:19,144 --> 00:27:21,930 AUDIENCE: Just ping the server to see if you get a response. 625 00:27:21,930 --> 00:27:23,310 BRIAN YU: Ping the server, see if you get a response. 626 00:27:23,310 --> 00:27:24,120 Certainly.
627 00:27:24,120 --> 00:27:26,850 And so one variant on this idea that's often 628 00:27:26,850 --> 00:27:29,130 used when it comes to dealing with these servers 629 00:27:29,130 --> 00:27:33,090 is this idea of having each server give off a heartbeat, just a signal 630 00:27:33,090 --> 00:27:36,220 that they produce every so often, where the signals are 631 00:27:36,220 --> 00:27:37,470 received by the load balancer. 632 00:27:37,470 --> 00:27:39,330 And the load balancer knows if it's hearing those heartbeats, 633 00:27:39,330 --> 00:27:40,950 then the servers are operational. 634 00:27:40,950 --> 00:27:43,770 And if too long goes by without hearing one of those heartbeats 635 00:27:43,770 --> 00:27:46,320 from this server, for instance, then the load balancer 636 00:27:46,320 --> 00:27:49,770 can reasonably guess that maybe that server is no longer operational. 637 00:27:49,770 --> 00:27:53,310 Maybe we should no longer be sending users 638 00:27:53,310 --> 00:27:55,290 to that server in particular. 639 00:27:55,290 --> 00:27:58,500 And that brings into account its own design decisions of how frequent 640 00:27:58,500 --> 00:28:00,009 do you want those heartbeats to be. 641 00:28:00,009 --> 00:28:01,800 Certainly, if they're more frequent, you're 642 00:28:01,800 --> 00:28:05,560 getting a better sense of whether the servers are running. 643 00:28:05,560 --> 00:28:09,510 And you know more instantaneously when a server potentially goes offline. 644 00:28:09,510 --> 00:28:12,420 And if those heartbeats are less frequent, then 645 00:28:12,420 --> 00:28:14,670 maybe you're saving on energy because you no longer 646 00:28:14,670 --> 00:28:17,010 need to continuously compute whether or not 647 00:28:17,010 --> 00:28:20,520 you're receiving all of these heartbeats coming from all the different servers.
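The heartbeat idea above amounts to remembering when each server was last heard from and treating a long silence as a failure. The sketch below assumes a hypothetical `HeartbeatMonitor` class and a timeout expressed in seconds; the optional `now` argument exists only so the example can use fixed timestamps instead of the real clock.

```python
import time


class HeartbeatMonitor:
    """Track the most recent heartbeat from each server."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_seen = {}  # server name -> time of its latest heartbeat

    def record_heartbeat(self, server, now=None):
        self.last_seen[server] = time.monotonic() if now is None else now

    def healthy_servers(self, now=None):
        now = time.monotonic() if now is None else now
        # A server counts as operational only if it was heard from recently enough.
        return [
            server
            for server, seen in self.last_seen.items()
            if now - seen <= self.timeout
        ]


monitor = HeartbeatMonitor(timeout_seconds=5)
monitor.record_heartbeat("A", now=0)
monitor.record_heartbeat("B", now=0)
monitor.record_heartbeat("A", now=8)   # A keeps sending heartbeats, B goes quiet
assert monitor.healthy_servers(now=10) == ["A"]
```

The `timeout_seconds` value captures the trade-off from the lecture: a short timeout notices failures sooner but requires more frequent heartbeats, while a long one is cheaper but slower to react.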
648 00:28:20,520 --> 00:28:23,980 And so, again, with all of the decisions that we make in scalability, 649 00:28:23,980 --> 00:28:25,950 there's not necessarily one correct decision 650 00:28:25,950 --> 00:28:28,300 that this is the right way to do a load balancer. 651 00:28:28,300 --> 00:28:30,660 But there are trade-offs with each of the decisions 652 00:28:30,660 --> 00:28:32,610 that we make with regards to how many servers 653 00:28:32,610 --> 00:28:36,420 we have, with regards to the algorithm that we choose for our load balancer, 654 00:28:36,420 --> 00:28:38,820 with regards to how we choose to decide whether or not 655 00:28:38,820 --> 00:28:40,860 these servers are offline or not. 656 00:28:40,860 --> 00:28:44,730 And when a server is offline, we need to put some thought into how do we, then, 657 00:28:44,730 --> 00:28:46,500 from the perspective of the load balancer, 658 00:28:46,500 --> 00:28:48,708 decide that we're no longer going to be sending users 659 00:28:48,708 --> 00:28:50,910 to that server that's now offline. 660 00:28:50,910 --> 00:28:53,070 And so all those concerns start to come up 661 00:28:53,070 --> 00:28:58,612 when we start to deal with this idea of trying to scale our web application. 662 00:28:58,612 --> 00:29:00,070 Questions about any of that so far? 663 00:29:00,070 --> 00:29:02,930 664 00:29:02,930 --> 00:29:03,781 OK. 665 00:29:03,781 --> 00:29:05,530 We'll go ahead and take a break right now. 666 00:29:05,530 --> 00:29:07,470 And we'll come back later and talk about some other concerns that 667 00:29:07,470 --> 00:29:10,740 come about with regards to scalability, including talking about databases 668 00:29:10,740 --> 00:29:14,002 and what happens here with this image so far that we only have servers now. 669 00:29:14,002 --> 00:29:16,710 But what happens if we start to integrate databases into the mix? 670 00:29:16,710 --> 00:29:19,110 And how do we deal with scalability there as well? 
671 00:29:19,110 --> 00:29:23,710 672 00:29:23,710 --> 00:29:26,070 So before the break, we were talking about how 673 00:29:26,070 --> 00:29:29,190 we would go about scaling applications, either via vertical scaling 674 00:29:29,190 --> 00:29:30,497 or horizontal scaling. 675 00:29:30,497 --> 00:29:32,580 And when we were talking about horizontal scaling, 676 00:29:32,580 --> 00:29:35,760 we talked about this idea of splitting up and rather 677 00:29:35,760 --> 00:29:38,400 than just having one server, having multiple different servers 678 00:29:38,400 --> 00:29:40,800 with a load balancer that can then decide 679 00:29:40,800 --> 00:29:44,820 whether to send the user to server A or whether to send the user to server B. 680 00:29:44,820 --> 00:29:47,700 What we didn't quite talk about was how the load balancer is 681 00:29:47,700 --> 00:29:51,240 able to implement this idea of sticky sessions, the idea of when 682 00:29:51,240 --> 00:29:53,700 the user comes along, if they were at server A last time, 683 00:29:53,700 --> 00:29:56,670 we want to send them back to server A. And if they were at server B last time, 684 00:29:56,670 --> 00:29:58,200 we want to send them to server B. 685 00:29:58,200 --> 00:30:02,950 So what are some ways by which we can actually make that happen? 686 00:30:02,950 --> 00:30:07,080 How can the load balancer consistently send the same user to the same server 687 00:30:07,080 --> 00:30:12,040 every time in order to make sure that if the user's shopping cart's on server A, 688 00:30:12,040 --> 00:30:14,400 that we don't inadvertently send the user to server B 689 00:30:14,400 --> 00:30:18,037 and they lose all the content of their shopping cart data, for instance? 690 00:30:18,037 --> 00:30:19,620 What might be some ways of doing that? 691 00:30:19,620 --> 00:30:24,003 692 00:30:24,003 --> 00:30:26,925 AUDIENCE: Could it have its own session tracking? 
693 00:30:26,925 --> 00:30:29,465 It could send the person a cookie or something or-- 694 00:30:29,465 --> 00:30:30,090 BRIAN YU: Good. 695 00:30:30,090 --> 00:30:31,740 It could send the person a cookie, for instance. 696 00:30:31,740 --> 00:30:32,240 Great. 697 00:30:32,240 --> 00:30:35,340 So inside of the cookie, maybe-- the cookie that the load balancer can 698 00:30:35,340 --> 00:30:39,814 set when the response goes back to the user is one that determines 699 00:30:39,814 --> 00:30:41,730 or that has some information inside of it that 700 00:30:41,730 --> 00:30:45,045 says users should go to server A, for instance, or user should go to server 701 00:30:45,045 --> 00:30:49,810 B. And so these cookies are often a very useful way, whether it's for the server 702 00:30:49,810 --> 00:30:52,895 or for the load balancer, of giving information to the client 703 00:30:52,895 --> 00:30:55,020 that when the client tries to make another request, 704 00:30:55,020 --> 00:30:56,460 that information is still there. 705 00:30:56,460 --> 00:30:59,940 And we talked about before the idea that one possible way 706 00:30:59,940 --> 00:31:03,570 of implementing this idea of making sure that the session stays consistent 707 00:31:03,570 --> 00:31:06,840 regardless of what happens with the horizontal scaling 708 00:31:06,840 --> 00:31:10,350 is to actually store session information inside of the cookie. 709 00:31:10,350 --> 00:31:13,150 And this is something that Flask actually does by default. 710 00:31:13,150 --> 00:31:15,000 So we've been using Flask for a while now 711 00:31:15,000 --> 00:31:18,690 in order to write web applications that have sessions that 712 00:31:18,690 --> 00:31:20,460 are storing information about the user. 
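One way to picture the cookie-based routing just described is a function that reads the request's `Cookie` header and, if no valid routing cookie is present, picks a server and asks the client to remember it. The cookie name `lb_server`, the server names, and the `choose_server` function are hypothetical; real load balancers have their own configuration for this.

```python
import random
from http.cookies import SimpleCookie

SERVERS = ["A", "B"]
COOKIE_NAME = "lb_server"  # hypothetical name for the routing cookie


def choose_server(cookie_header):
    """Return (server, set_cookie), where set_cookie is None for returning users."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if COOKIE_NAME in cookie and cookie[COOKIE_NAME].value in SERVERS:
        # Returning user: honor the server named in their cookie.
        return cookie[COOKIE_NAME].value, None
    # New user (or unknown server name): pick a server and set the cookie.
    server = random.choice(SERVERS)
    return server, f"{COOKIE_NAME}={server}"


server, set_cookie = choose_server("")         # first visit, no cookie yet
repeat, _ = choose_server(set_cookie)          # browser sends the cookie back
assert repeat == server
```

Because the client controls this cookie, the sketch only accepts values that name a known server.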
713 00:31:20,460 --> 00:31:22,350 And by default, Flask will use what's called 714 00:31:22,350 --> 00:31:27,467 a signed cookie, this idea that when the user has their session information, 715 00:31:27,467 --> 00:31:30,300 we're just going to put that session information inside of a cookie. 716 00:31:30,300 --> 00:31:33,600 But what might be the problem of just taking all the session information, 717 00:31:33,600 --> 00:31:36,030 putting it inside of a cookie, and then just using 718 00:31:36,030 --> 00:31:39,150 that as the way via which users are interacting 719 00:31:39,150 --> 00:31:42,330 with sessions on your web application? 720 00:31:42,330 --> 00:31:46,200 Your session, for instance, might just be a Python dictionary, you imagine, 721 00:31:46,200 --> 00:31:48,600 that contains the user's user ID and maybe 722 00:31:48,600 --> 00:31:51,540 the information of what's currently inside that user's shopping cart. 723 00:31:51,540 --> 00:31:53,498 And if that's just inside of a cookie that gets 724 00:31:53,498 --> 00:31:56,927 sent back and forth between the server and the client, 725 00:31:56,927 --> 00:31:58,010 what could go wrong there? 726 00:31:58,010 --> 00:32:05,770 727 00:32:05,770 --> 00:32:08,632 AUDIENCE: Are there limits on how much you can fit in the cookie? 728 00:32:08,632 --> 00:32:10,590 BRIAN YU: So, certainly, the size of the cookie 729 00:32:10,590 --> 00:32:13,712 is something to bear in mind, that a cookie could potentially-- 730 00:32:13,712 --> 00:32:15,420 as the cookies get larger and larger, now 731 00:32:15,420 --> 00:32:18,086 you have to start to worry about cookie, the size of the cookie, 732 00:32:18,086 --> 00:32:21,330 and the amount of latency it'll take to send that consistently back and forth 733 00:32:21,330 --> 00:32:22,860 between the client and the server. 
734 00:32:22,860 --> 00:32:24,870 Are there any security concerns we can think 735 00:32:24,870 --> 00:32:28,970 of that could come up if all we're doing is just sending-- 736 00:32:28,970 --> 00:32:32,970 if the server sends back the session information that contains a user ID 737 00:32:32,970 --> 00:32:35,580 and what's inside the cart, and we just expect 738 00:32:35,580 --> 00:32:37,772 that when the user sends back that cookie, 739 00:32:37,772 --> 00:32:39,855 that will be the information that the server knows 740 00:32:39,855 --> 00:32:43,254 is what's contained inside the user's session. 741 00:32:43,254 --> 00:32:45,462 AUDIENCE: I suppose somebody could steal your cookie, 742 00:32:45,462 --> 00:32:48,885 and then they would have access to whatever you have access to 743 00:32:48,885 --> 00:32:50,495 [INAUDIBLE]. 744 00:32:50,495 --> 00:32:51,120 BRIAN YU: Sure. 745 00:32:51,120 --> 00:32:52,530 Certainly, someone could steal the cookie. 746 00:32:52,530 --> 00:32:55,270 And if they were able to steal the cookie and gain access to that cookie, 747 00:32:55,270 --> 00:32:57,395 then they would have access to your entire account. 748 00:32:57,395 --> 00:33:00,660 They could log in as you, and they could see the contents of whatever 749 00:33:00,660 --> 00:33:02,280 was in your cart at that time. 750 00:33:02,280 --> 00:33:05,160 What about even if you didn't have access to someone else's cookie? 751 00:33:05,160 --> 00:33:09,420 Can you imagine a world where in this very simple-- not very secure-- example 752 00:33:09,420 --> 00:33:12,927 where we're just sending the cookie back and forth, where 753 00:33:12,927 --> 00:33:14,760 things could go wrong, where you could still 754 00:33:14,760 --> 00:33:16,301 get access to someone else's account? 755 00:33:16,301 --> 00:33:20,570 756 00:33:20,570 --> 00:33:23,380 So if we're just relying on the contents of the cookie for-- yeah. 757 00:33:23,380 --> 00:33:24,180 Go ahead. 
758 00:33:24,180 --> 00:33:27,540 AUDIENCE: Take that cookie and send it yourself so you can pretend to be 759 00:33:27,540 --> 00:33:28,574 [INAUDIBLE]. 760 00:33:28,574 --> 00:33:29,240 BRIAN YU: Great. 761 00:33:29,240 --> 00:33:31,840 So you could try and pretend to be someone else, effectively. 762 00:33:31,840 --> 00:33:35,252 If you were able to take the cookie and change what the value of the user ID 763 00:33:35,252 --> 00:33:36,960 is for instance and try and send it back, 764 00:33:36,960 --> 00:33:38,824 then that potentially is an attack vector 765 00:33:38,824 --> 00:33:41,740 by which you could trick the server into thinking that you are someone 766 00:33:41,740 --> 00:33:42,650 that you're not. 767 00:33:42,650 --> 00:33:45,940 And so one way that Flask tries to get around this is by signing the cookies. 768 00:33:45,940 --> 00:33:48,130 And so if you want to use Flask signed cookies, 769 00:33:48,130 --> 00:33:51,932 you'll have to include a private key inside of the web application, which 770 00:33:51,932 --> 00:33:53,890 is just going to be a long string of characters 771 00:33:53,890 --> 00:33:56,680 that only the web application should know and shouldn't 772 00:33:56,680 --> 00:33:58,130 be accessible to users. 
773 00:33:58,130 --> 00:34:01,451 And, effectively, every time Flask sends you a cookie, 774 00:34:01,451 --> 00:34:04,450 it's going to sign that cookie, add a signature, where that signature is 775 00:34:04,450 --> 00:34:09,550 going to be generated based on a combination of the contents 776 00:34:09,550 --> 00:34:13,120 of the session itself and of what the private key is in order 777 00:34:13,120 --> 00:34:16,449 to generate a signature that shouldn't be or should be reasonably-- 778 00:34:16,449 --> 00:34:18,699 should be difficult for anyone to be able to predict 779 00:34:18,699 --> 00:34:22,130 or figure out such that you can know with confidence when you get back 780 00:34:22,130 --> 00:34:24,229 that session, Flask can treat it as a checksum, 781 00:34:24,229 --> 00:34:26,770 effectively, in order to determine, in fact, that this cookie 782 00:34:26,770 --> 00:34:27,850 did come from this user. 783 00:34:27,850 --> 00:34:30,070 It is, in fact, a valid, genuine cookie, and they 784 00:34:30,070 --> 00:34:31,750 can trust the information inside of it. 785 00:34:31,750 --> 00:34:34,250 But, certainly, with the issues we talked about with regards 786 00:34:34,250 --> 00:34:37,287 to cookies and the potential for them to be intercepted and used, 787 00:34:37,287 --> 00:34:39,370 we might not want to use that as our method, which 788 00:34:39,370 --> 00:34:42,453 is why in the Flask applications we've been building, if you've noticed up 789 00:34:42,453 --> 00:34:46,429 at the top, we've set some application settings inside of the Flask app 790 00:34:46,429 --> 00:34:49,780 variable that actually say that when we're using these sessions, rather 791 00:34:49,780 --> 00:34:52,969 than use cookies as their means for storing sessions, 792 00:34:52,969 --> 00:34:55,090 we've been using sessions that are actually 793 00:34:55,090 --> 00:34:57,370 stored on the file system of the server itself 794 00:34:57,370 --> 00:34:59,350 as your way of tracking the sessions. 
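To make the signing idea concrete, here is a minimal sketch in plain Python. It is not Flask's actual implementation (Flask delegates signing to a separate library and uses its own cookie format); the names `SECRET_KEY`, `sign_session`, and `load_session` are hypothetical, chosen only for illustration. It shows a signature derived from both the session contents and a private key, and how changing the user ID invalidates that signature.

```python
import base64
import hashlib
import hmac
import json

# Assumption: a long random string that only the server knows.
SECRET_KEY = b"hypothetical-long-random-string"


def sign_session(session: dict, key: bytes = SECRET_KEY) -> str:
    """Serialize the session and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(
        json.dumps(session, sort_keys=True).encode()
    ).decode()
    signature = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"


def load_session(cookie: str, key: bytes = SECRET_KEY):
    """Return the session dict if the signature checks out, else None."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None  # contents were altered, or a different key was used
    return json.loads(base64.urlsafe_b64decode(payload))
```

Note that in this sketch the payload is only encoded, not encrypted, so the user can still read it, and a stolen cookie remains valid. Signing only prevents someone from editing the contents without knowing the key.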
795 00:34:59,350 --> 00:35:01,600 And, in fact, if you were to ever use sessions 796 00:35:01,600 --> 00:35:04,180 on Flask, using those file system sessions, 797 00:35:04,180 --> 00:35:06,415 and shut off the Flask server or even just-- 798 00:35:06,415 --> 00:35:08,290 even if you didn't shut off the Flask server, 799 00:35:08,290 --> 00:35:12,430 you could look at the contents of the sessions directory 800 00:35:12,430 --> 00:35:15,460 in order to take a look at what the sessions actually look like, 801 00:35:15,460 --> 00:35:18,670 which can be interesting to explore if you want to get a sense for what's 802 00:35:18,670 --> 00:35:21,310 going on inside of the sessions. 803 00:35:21,310 --> 00:35:25,750 And so we spent some time today talking about trying to scale up these servers. 804 00:35:25,750 --> 00:35:28,060 But one thing we've come back to a number of times 805 00:35:28,060 --> 00:35:30,950 is databases and what happens when we're trying to store data, 806 00:35:30,950 --> 00:35:32,290 whether it's session data that we might be 807 00:35:32,290 --> 00:35:33,850 trying to store inside of a database. 808 00:35:33,850 --> 00:35:36,190 Or maybe it's just that our application uses a database, 809 00:35:36,190 --> 00:35:38,230 whether it's in project one or project three, 810 00:35:38,230 --> 00:35:41,560 where we've wanted to store books or food orders inside of a database. 811 00:35:41,560 --> 00:35:44,050 What happens when multiple different servers are 812 00:35:44,050 --> 00:35:46,790 trying to access that same database? 813 00:35:46,790 --> 00:35:50,720 And now we start to get into this issue of trying to scale up our databases. 814 00:35:50,720 --> 00:35:53,180 So we might imagine that-- we'll take the same picture. 815 00:35:53,180 --> 00:35:54,280 We've got a load balancer. 816 00:35:54,280 --> 00:35:55,420 We've got two servers.
817 00:35:55,420 --> 00:35:58,420 And now we also want those servers interacting and communicating 818 00:35:58,420 --> 00:36:00,490 with a database somewhere, where those servers 819 00:36:00,490 --> 00:36:02,680 are communicating with the database. 820 00:36:02,680 --> 00:36:05,660 What can go wrong now in this picture with this model? 821 00:36:05,660 --> 00:36:08,648 822 00:36:08,648 --> 00:36:11,015 AUDIENCE: Well, too much load on your database maybe. 823 00:36:11,015 --> 00:36:11,640 BRIAN YU: Yeah. 824 00:36:11,640 --> 00:36:12,889 Too much load on the database. 825 00:36:12,889 --> 00:36:16,800 You might imagine that if the reason we went from one server to two servers 826 00:36:16,800 --> 00:36:19,590 was because a single server wasn't enough in order 827 00:36:19,590 --> 00:36:22,650 to handle all the load, all the traffic coming into that one server, then 828 00:36:22,650 --> 00:36:25,170 if we have all the load from both servers that are all 829 00:36:25,170 --> 00:36:27,378 trying to talk to the same database at the same time, 830 00:36:27,378 --> 00:36:29,490 we might be potentially overloading that database. 831 00:36:29,490 --> 00:36:30,969 And that might become unmanageable. 832 00:36:30,969 --> 00:36:32,010 What else could go wrong? 833 00:36:32,010 --> 00:36:35,475 834 00:36:35,475 --> 00:36:37,460 AUDIENCE: Database server could go down. 835 00:36:37,460 --> 00:36:38,660 BRIAN YU: The database server could go down. 836 00:36:38,660 --> 00:36:38,900 Great. 837 00:36:38,900 --> 00:36:40,260 That's another thing that could happen. 838 00:36:40,260 --> 00:36:43,370 So we've talked about this idea that when we were scaling our servers 839 00:36:43,370 --> 00:36:46,886 and had multiple different servers, if one server goes down, no big deal. 
840 00:36:46,886 --> 00:36:49,760 So long as the load balancer knows that server two is the server that 841 00:36:49,760 --> 00:36:51,590 went down, it can redirect all the traffic 842 00:36:51,590 --> 00:36:53,540 to servers one, three, four, and five. 843 00:36:53,540 --> 00:36:57,050 But here we see that this is what we might call a single point of failure, 844 00:36:57,050 --> 00:37:00,770 a place inside of our diagram of all the hardware that's 845 00:37:00,770 --> 00:37:04,141 going on where if this one thing fails, then the entire web application breaks. 846 00:37:04,141 --> 00:37:04,640 Right? 847 00:37:04,640 --> 00:37:07,250 If the database fails, nothing else in the application 848 00:37:07,250 --> 00:37:10,250 is going to be able to work, assuming the web application is relying 849 00:37:10,250 --> 00:37:12,170 on the data and the database to work. 850 00:37:12,170 --> 00:37:15,480 Whereas this server, for instance, wouldn't be a single point of failure 851 00:37:15,480 --> 00:37:18,440 because if this server goes down, then the load balancer 852 00:37:18,440 --> 00:37:20,150 can just direct all users and all traffic 853 00:37:20,150 --> 00:37:23,552 to this server over here, which can still access the database. 854 00:37:23,552 --> 00:37:25,760 And for that matter, the load balancer itself is also 855 00:37:25,760 --> 00:37:27,093 another single point of failure. 856 00:37:27,093 --> 00:37:29,511 If the load balancer goes down, then suddenly we 857 00:37:29,511 --> 00:37:32,010 have no way of directing users to various different servers. 858 00:37:32,010 --> 00:37:33,410 And so we might think that there might be ways 859 00:37:33,410 --> 00:37:36,330 that we want to have multiple load balancers, for instance, 860 00:37:36,330 --> 00:37:38,690 in order to try to address that problem of avoiding 861 00:37:38,690 --> 00:37:40,320 having single points of failure. 
862 00:37:40,320 --> 00:37:42,890 But what we're going to focus on now is on ways 863 00:37:42,890 --> 00:37:45,430 to make this database scaling more manageable. 864 00:37:45,430 --> 00:37:47,930 That as more and more data starts to come into our database, 865 00:37:47,930 --> 00:37:50,930 we might start to see slower queries because if we have millions 866 00:37:50,930 --> 00:37:52,720 upon millions of rows in our database, it 867 00:37:52,720 --> 00:37:55,595 might take longer and longer in order to query that database in order 868 00:37:55,595 --> 00:37:56,970 to get the data that we want. 869 00:37:56,970 --> 00:38:01,350 So how do we, as we scale up our applications, begin to deal with that? 870 00:38:01,350 --> 00:38:03,350 And so the first topic we're going to talk about 871 00:38:03,350 --> 00:38:08,000 is database partitioning, the idea that if we have database tables that 872 00:38:08,000 --> 00:38:11,540 are large, either large in a number of rows or large in a number of columns, 873 00:38:11,540 --> 00:38:14,810 then trying to query information from those big tables 874 00:38:14,810 --> 00:38:16,940 can start to get complicated. 875 00:38:16,940 --> 00:38:18,770 And it can start to become time-consuming. 876 00:38:18,770 --> 00:38:22,130 That if we have large tables, it's going to take more and more time in order 877 00:38:22,130 --> 00:38:23,000 to query them.
878 00:38:23,000 --> 00:38:26,630 And so database partitioning is going to represent the idea 879 00:38:26,630 --> 00:38:28,850 that if we have data inside of our database, 880 00:38:28,850 --> 00:38:32,960 we can often split up that data into multiple different parts-- 881 00:38:32,960 --> 00:38:35,540 into multiple different tables, for instance-- in order 882 00:38:35,540 --> 00:38:38,690 to better allow for ourselves to deal with more manageable units, 883 00:38:38,690 --> 00:38:41,990 to have queries on those tables run more efficiently and more quickly, 884 00:38:41,990 --> 00:38:46,170 and in that sense help us as we begin to scale up our web application. 885 00:38:46,170 --> 00:38:49,890 And so one form of database partitioning we've actually already seen. 886 00:38:49,890 --> 00:38:52,200 It's called vertical database partitioning. 887 00:38:52,200 --> 00:38:54,200 And the idea of vertical database partitioning-- 888 00:38:54,200 --> 00:38:56,480 if you remember this from way back in one of the earlier weeks 889 00:38:56,480 --> 00:38:57,320 of the lecture-- 890 00:38:57,320 --> 00:38:59,960 is the idea that in vertical database partitioning 891 00:38:59,960 --> 00:39:03,980 we're going to separate our table into multiple different tables 892 00:39:03,980 --> 00:39:06,530 by decreasing the number of columns in those tables 893 00:39:06,530 --> 00:39:09,410 by separating things out such that some columns are 894 00:39:09,410 --> 00:39:12,660 going to be put into a different table than other columns.
895 00:39:12,660 --> 00:39:14,895 So if we recall, this original table of flights, 896 00:39:14,895 --> 00:39:18,020 which we were keeping track of when we were first trying to think about SQL 897 00:39:18,020 --> 00:39:20,330 and how we might organize data inside of a database, 898 00:39:20,330 --> 00:39:25,327 we have each flight having an ID number, an origin, an origin airport code, 899 00:39:25,327 --> 00:39:28,160 a destination, the destination code, and the duration of the flight, 900 00:39:28,160 --> 00:39:29,540 for instance. 901 00:39:29,540 --> 00:39:31,610 And what we did when we were first talking 902 00:39:31,610 --> 00:39:33,740 about SQL and the idea of designing tables 903 00:39:33,740 --> 00:39:38,030 was to use foreign keys as our way of what we'll now call vertically 904 00:39:38,030 --> 00:39:39,540 partitioning this database. 905 00:39:39,540 --> 00:39:42,920 And instead of storing all the data like this inside of a big flights 906 00:39:42,920 --> 00:39:44,830 table that might be expensive to query, we 907 00:39:44,830 --> 00:39:46,580 can split it up into two different tables. 908 00:39:46,580 --> 00:39:49,790 Split it up into a locations table where each location just 909 00:39:49,790 --> 00:39:52,490 has its independent code and the name. 910 00:39:52,490 --> 00:39:55,580 And we can split it up into a flights table where each flight, 911 00:39:55,580 --> 00:39:59,420 rather than have all of those columns as before, now only has four columns. 912 00:39:59,420 --> 00:40:00,770 It's got an ID column. 913 00:40:00,770 --> 00:40:04,760 It's got a number that represents the origin, where inside of the locations 914 00:40:04,760 --> 00:40:08,750 table, origin one corresponds to location number one in the locations 915 00:40:08,750 --> 00:40:09,294 table.
916 00:40:09,294 --> 00:40:11,210 And, likewise, we have a destination_id, where 917 00:40:11,210 --> 00:40:16,140 destination_id four corresponds to this particular location in London, 918 00:40:16,140 --> 00:40:17,360 and, finally, a duration. 919 00:40:17,360 --> 00:40:19,550 And so we factored out some of the columns 920 00:40:19,550 --> 00:40:22,160 in order to create tables that have fewer columns and are, 921 00:40:22,160 --> 00:40:26,424 therefore, more manageable and might be easier to query in some sense. 922 00:40:26,424 --> 00:40:28,340 And so this is vertical partitioning, and it's 923 00:40:28,340 --> 00:40:29,944 something we've already seen before. 924 00:40:29,944 --> 00:40:33,110 But there's another form of partitioning as well, which, you might guess, 925 00:40:33,110 --> 00:40:35,030 is called horizontal partitioning. 926 00:40:35,030 --> 00:40:37,340 And that might look something like this. 927 00:40:37,340 --> 00:40:41,354 If our table of flights is just getting too long, getting too big to query, 928 00:40:41,354 --> 00:40:44,270 where we're consistently having to run queries on the set of flights-- 929 00:40:44,270 --> 00:40:48,780 looking for all flights that are going from New York to San Francisco, 930 00:40:48,780 --> 00:40:49,830 for example-- 931 00:40:49,830 --> 00:40:53,030 and those queries are starting to take a long time, 932 00:40:53,030 --> 00:40:56,130 we might horizontally partition our table. 933 00:40:56,130 --> 00:40:59,240 In horizontal partitioning, rather than change the number of rows 934 00:40:59,240 --> 00:41:01,200 to have fewer rows in each of our tables, 935 00:41:01,200 --> 00:41:03,820 we're just going to split up the rows of our tables.
936 00:41:03,820 --> 00:41:06,357 Or rather than change the number of columns in the tables, 937 00:41:06,357 --> 00:41:08,690 we're going to split up the rows of our table such that, 938 00:41:08,690 --> 00:41:13,362 rather than have a table that has 2,000 rows, for instance, 939 00:41:13,362 --> 00:41:16,070 we might have two different tables where we put 1,000 rows in one 940 00:41:16,070 --> 00:41:18,450 table and 1,000 rows in another table. 941 00:41:18,450 --> 00:41:21,230 And so we might take this idea of the flights table 942 00:41:21,230 --> 00:41:24,170 and really split it up into two different tables, a domestic flights 943 00:41:24,170 --> 00:41:27,560 table and an international flights table, where each one of these tables 944 00:41:27,560 --> 00:41:28,970 contains the same columns. 945 00:41:28,970 --> 00:41:32,840 It's still going to have an ID, an origin, a destination, and a duration. 946 00:41:32,840 --> 00:41:35,890 It's just that we're going to split up the flights into those two 947 00:41:35,890 --> 00:41:37,300 independent tables. 948 00:41:37,300 --> 00:41:39,580 What benefit do we get by doing this? 949 00:41:39,580 --> 00:41:42,010 What advantage do we get by taking the flights table 950 00:41:42,010 --> 00:41:44,740 and partitioning it into two different tables, a domestic 951 00:41:44,740 --> 00:41:48,640 and an international table, that we didn't have with just the flights 952 00:41:48,640 --> 00:41:50,835 table? 953 00:41:50,835 --> 00:41:51,335 Yeah. 954 00:41:51,335 --> 00:41:54,730 AUDIENCE: You're going through less rows, so if you split it, the table, 955 00:41:54,730 --> 00:41:59,684 in half, you're spending half the time [INAUDIBLE] to the database. 956 00:41:59,684 --> 00:42:00,350 BRIAN YU: Great. 
957 00:42:00,350 --> 00:42:02,570 So the big benefit is that our queries are faster, 958 00:42:02,570 --> 00:42:05,210 that if I'm trying to query a domestic flight, 959 00:42:05,210 --> 00:42:08,060 I now only need to search through this domestic flights table. 960 00:42:08,060 --> 00:42:09,290 And I don't need to worry about searching 961 00:42:09,290 --> 00:42:11,790 through however many international flights there might be, 962 00:42:11,790 --> 00:42:13,460 and so my queries can become faster. 963 00:42:13,460 --> 00:42:16,372 And horizontal partitioning oftentimes isn't just with two tables. 964 00:42:16,372 --> 00:42:18,830 You might split things up into many, many different tables. 965 00:42:18,830 --> 00:42:23,690 You might imagine that if you have a database that's 966 00:42:23,690 --> 00:42:26,854 keeping track of different people's addresses and locations inside 967 00:42:26,854 --> 00:42:28,770 of the country, that you might split things up 968 00:42:28,770 --> 00:42:31,437 into having 50 different tables-- one for each of the US states, 969 00:42:31,437 --> 00:42:34,269 where if you're trying to find someone where you know that they live 970 00:42:34,269 --> 00:42:36,500 in Oregon, for example, you can just query that table 971 00:42:36,500 --> 00:42:38,791 and ignore the tables that have to do with anyone else, 972 00:42:38,791 --> 00:42:40,850 thereby speeding up that query. 973 00:42:40,850 --> 00:42:44,660 What drawbacks, though, come with this approach of horizontally partitioning 974 00:42:44,660 --> 00:42:47,840 our data into multiple different tables rather than keeping it 975 00:42:47,840 --> 00:42:51,330 all inside of the same table? 976 00:42:51,330 --> 00:42:51,830 Yeah. 977 00:42:51,830 --> 00:42:53,450 AUDIENCE: It seems like your code would have 978 00:42:53,450 --> 00:42:55,449 to get more complicated because you have to know 979 00:42:55,449 --> 00:42:56,835 which table to look in for what. 
980 00:42:56,835 --> 00:42:57,460 BRIAN YU: Sure. 981 00:42:57,460 --> 00:42:59,168 So there's some code complexity that gets 982 00:42:59,168 --> 00:43:01,510 added here, that we need to now know before we query. 983 00:43:01,510 --> 00:43:03,400 We can't just say query the flights table. 984 00:43:03,400 --> 00:43:05,050 We need to have some mechanism for knowing, yeah, 985 00:43:05,050 --> 00:43:06,940 should we query the domestic flights table, 986 00:43:06,940 --> 00:43:08,890 or should we query the international flights table. 987 00:43:08,890 --> 00:43:11,681 And maybe that in itself is going to be an expensive process, which 988 00:43:11,681 --> 00:43:13,660 is why oftentimes it's good, if you're going 989 00:43:13,660 --> 00:43:16,240 to do any sort of horizontal database partitioning, 990 00:43:16,240 --> 00:43:19,752 to give some thought as to how your partitioning that data, making sure 991 00:43:19,752 --> 00:43:22,960 that it's a way that you're going to be able to quickly and easily figure out 992 00:43:22,960 --> 00:43:25,180 this is the table that I need to query as opposed 993 00:43:25,180 --> 00:43:28,138 to having to spend a long time trying to figure out which of the tables 994 00:43:28,138 --> 00:43:30,340 to query before you actually do. 995 00:43:30,340 --> 00:43:32,630 Other potential drawbacks of this approach? 996 00:43:32,630 --> 00:43:35,994 AUDIENCE: If you make a schema change to one, you have to do it to the others. 997 00:43:35,994 --> 00:43:36,660 BRIAN YU: Great. 998 00:43:36,660 --> 00:43:39,035 So schema changes now become a little more of a headache, 999 00:43:39,035 --> 00:43:41,582 that if I'm changing the schema for this flights table, 1000 00:43:41,582 --> 00:43:43,290 now I suddenly need to worry about-- now, 1001 00:43:43,290 --> 00:43:46,331 instead of just changing one table, I need to update both tables in order 1002 00:43:46,331 --> 00:43:48,772 to reflect those changes. 
1003 00:43:48,772 --> 00:43:51,480 Other things that could go wrong with this approach or trade-offs 1004 00:43:51,480 --> 00:43:53,850 that we have to sacrifice in order to get 1005 00:43:53,850 --> 00:43:57,260 the benefit of this additional query speed? 1006 00:43:57,260 --> 00:43:58,490 We'll put it this way-- yeah. 1007 00:43:58,490 --> 00:43:59,365 Go ahead. 1008 00:43:59,365 --> 00:44:00,490 AUDIENCE: Maybe validation. 1009 00:44:00,490 --> 00:44:03,564 You might have duplicates in multiple tables. 1010 00:44:03,564 --> 00:44:04,230 BRIAN YU: Great. 1011 00:44:04,230 --> 00:44:06,870 So as soon as we start to deal with multiple tables, then 1012 00:44:06,870 --> 00:44:09,150 there's potential for invalid data that you 1013 00:44:09,150 --> 00:44:12,240 might need to worry about making sure that the tables are matching up. 1014 00:44:12,240 --> 00:44:14,730 You don't want there to be a domestic flight in the international flights 1015 00:44:14,730 --> 00:44:15,630 table, for instance. 1016 00:44:15,630 --> 00:44:18,255 And these are the things that you have to start to worry about. 1017 00:44:18,255 --> 00:44:21,840 When might a query actually be slower in this approach as opposed 1018 00:44:21,840 --> 00:44:25,470 to this approach with just the flights table? 1019 00:44:25,470 --> 00:44:28,356 AUDIENCE: You need to bring all that data back together again somehow 1020 00:44:28,356 --> 00:44:29,835 if you want to process [INAUDIBLE]. 1021 00:44:29,835 --> 00:44:30,460 BRIAN YU: Yeah. 1022 00:44:30,460 --> 00:44:30,960 Sure. 1023 00:44:30,960 --> 00:44:34,076 Any time you would need to bring data from all these tables together, 1024 00:44:34,076 --> 00:44:35,950 now your query is actually going to be slower 1025 00:44:35,950 --> 00:44:38,380 because we have to first query this table and then in a separate query, 1026 00:44:38,380 --> 00:44:39,680 query the other table.
1027 00:44:39,680 --> 00:44:43,510 So you might imagine if I wanted a listing of all of the flights leaving 1028 00:44:43,510 --> 00:44:45,970 New York City airport, a New York City airport, 1029 00:44:45,970 --> 00:44:48,430 then suddenly I need to worry about not just 1030 00:44:48,430 --> 00:44:50,800 the domestic flights that are leaving but also the international flights that 1031 00:44:50,800 --> 00:44:51,300 are leaving. 1032 00:44:51,300 --> 00:44:53,800 I might need to query both of those tables independently 1033 00:44:53,800 --> 00:44:57,400 in order to get the information that I can then display. 1034 00:44:57,400 --> 00:45:00,670 And that might take longer by querying two tables than I could with just one. 1035 00:45:00,670 --> 00:45:03,910 So oftentimes, when horizontally partitioning data, 1036 00:45:03,910 --> 00:45:06,832 it's a good idea to think about how you're partitioning things 1037 00:45:06,832 --> 00:45:09,040 in such a way that you don't want to partition things 1038 00:45:09,040 --> 00:45:12,100 in ways where you'll often need information 1039 00:45:12,100 --> 00:45:13,604 that's in different partitions. 1040 00:45:13,604 --> 00:45:15,520 You'll often want to partition things in a way 1041 00:45:15,520 --> 00:45:17,560 such that, with relative frequency, you'll 1042 00:45:17,560 --> 00:45:19,630 only be querying for things from a single one 1043 00:45:19,630 --> 00:45:21,394 of those individual horizontal partitions. 1044 00:45:21,394 --> 00:45:23,560 So there's some design thinking and design decisions 1045 00:45:23,560 --> 00:45:28,660 that have to go into play as you think about which one of the partitions 1046 00:45:28,660 --> 00:45:32,360 to look for and how you're going to actually partition that data. 
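As a rough sketch of that trade-off, the following uses Python's built-in sqlite3 module with an in-memory database. The table names `domestic_flights` and `international_flights`, the helper functions, and the sample rows are all made up for illustration. It shows one query that only needs a single partition and one that has to combine both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both partitions share the same columns; only the rows differ.
for table in ("domestic_flights", "international_flights"):
    conn.execute(
        f"CREATE TABLE {table} ("
        "id INTEGER PRIMARY KEY, origin TEXT, destination TEXT, duration INTEGER)"
    )

# Illustrative sample rows, not real flight data.
conn.executemany(
    "INSERT INTO domestic_flights (origin, destination, duration) VALUES (?, ?, ?)",
    [("New York", "San Francisco", 380), ("Boston", "Chicago", 165)],
)
conn.executemany(
    "INSERT INTO international_flights (origin, destination, duration) VALUES (?, ?, ?)",
    [("New York", "London", 415), ("Shanghai", "Paris", 760)],
)


def domestic_route(origin, destination):
    """A query that touches only one partition."""
    return conn.execute(
        "SELECT origin, destination, duration FROM domestic_flights "
        "WHERE origin = ? AND destination = ?",
        (origin, destination),
    ).fetchall()


def departures(origin):
    """A query that needs every partition, combined with UNION ALL."""
    return conn.execute(
        "SELECT origin, destination, duration FROM domestic_flights WHERE origin = ? "
        "UNION ALL "
        "SELECT origin, destination, duration FROM international_flights WHERE origin = ?",
        (origin, origin),
    ).fetchall()
```

Here `domestic_route` never looks at international rows, while `departures` has to name every partition, which is the kind of query a good partitioning scheme tries to keep rare.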
1047 00:45:32,360 --> 00:45:35,230 Another term you might hear here with regards to scaling databases 1048 00:45:35,230 --> 00:45:38,440 is database sharding, the idea that now, 1049 00:45:38,440 --> 00:45:40,900 rather than take a single table and split it up 1050 00:45:40,900 --> 00:45:42,921 into two tables in the same database server, 1051 00:45:42,921 --> 00:45:44,920 we might actually have multiple database servers 1052 00:45:44,920 --> 00:45:47,460 where I store domestic flights on one database server 1053 00:45:47,460 --> 00:45:50,080 and international flights on another database server. 1054 00:45:50,080 --> 00:45:51,430 What might be a benefit of that? 1055 00:45:51,430 --> 00:45:54,310 1056 00:45:54,310 --> 00:45:58,012 Where I have two independent servers, one of which contains some of the data, 1057 00:45:58,012 --> 00:46:01,220 the domestic flights, and one of which contains the international flights, as 1058 00:46:01,220 --> 00:46:03,761 opposed to having them in just two tables on the same server. 1059 00:46:03,761 --> 00:46:06,488 1060 00:46:06,488 --> 00:46:08,860 AUDIENCE: It's not a single point of failure anymore. 1061 00:46:08,860 --> 00:46:10,720 BRIAN YU: Not a single point of failure, that if I 1062 00:46:10,720 --> 00:46:13,261 happen to-- if the international database happens to go down, 1063 00:46:13,261 --> 00:46:16,060 I still have access to the domestic flights. 1064 00:46:16,060 --> 00:46:19,450 What about that example that I gave before of I 1065 00:46:19,450 --> 00:46:24,467 want to get all of the flights that are leaving San Francisco airport? 1066 00:46:24,467 --> 00:46:25,092 AUDIENCE: Yeah. 1067 00:46:25,092 --> 00:46:27,044 So maybe you could process the data faster 1068 00:46:27,044 --> 00:46:30,334 because each server's going to process its own table [INAUDIBLE]. 1069 00:46:30,334 --> 00:46:31,000 BRIAN YU: Great.
1070 00:46:31,000 --> 00:46:33,030 Now I can have some concurrency, that I can-- 1071 00:46:33,030 --> 00:46:34,919 query for both the database servers and say, 1072 00:46:34,919 --> 00:46:37,210 give me all the flights that are leaving San Francisco. 1073 00:46:37,210 --> 00:46:39,820 And I can have the domestic server running and the international server 1074 00:46:39,820 --> 00:46:42,580 running simultaneously and then giving me back those results. 1075 00:46:42,580 --> 00:46:45,400 And so maybe that will help to offset what might initially 1076 00:46:45,400 --> 00:46:48,430 have been a longer query in order to query these two separate tables. 1077 00:46:48,430 --> 00:46:52,150 But, of course, it's still going to mean now that I have to deal with the fact 1078 00:46:52,150 --> 00:46:54,500 that my data is located in different places. 1079 00:46:54,500 --> 00:46:57,400 And if I ever want to do a SQL join, for instance, if I'm 1080 00:46:57,400 --> 00:47:00,100 trying to join multiple tables together, now the fact 1081 00:47:00,100 --> 00:47:02,200 that the tables are located on different servers, 1082 00:47:02,200 --> 00:47:04,190 that's going to come at a time cost as well. 1083 00:47:04,190 --> 00:47:07,880 And so as we think about database design and on which servers 1084 00:47:07,880 --> 00:47:11,052 your table should go, all of these are things that should come into play. 1085 00:47:11,052 --> 00:47:12,760 And they're considerations that are going 1086 00:47:12,760 --> 00:47:15,343 to change depending on the specific needs of your application, 1087 00:47:15,343 --> 00:47:17,680 depending on how frequently you're going to be accessing 1088 00:47:17,680 --> 00:47:19,570 one type of data as opposed to another. 1089 00:47:19,570 --> 00:47:21,486 There are trade-offs to think about and things 1090 00:47:21,486 --> 00:47:24,760 that you'll have to weigh as you go about making those decisions. 
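One way to picture querying two shards concurrently is the sketch below. It assumes two separate in-memory SQLite databases standing in for two database servers; a real deployment would use network connections to separate machines. The names `make_shard`, `query_shard`, and `departures`, and the sample rows, are hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_shard(rows):
    """Create an in-memory database standing in for one database server."""
    # check_same_thread=False lets a worker thread use this connection later.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE flights (origin TEXT, destination TEXT, duration INTEGER)"
    )
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


# Illustrative sample rows, not real flight data.
shards = {
    "domestic": make_shard(
        [("San Francisco", "New York", 380), ("Boston", "Chicago", 165)]
    ),
    "international": make_shard(
        [("San Francisco", "London", 620), ("Shanghai", "Paris", 760)]
    ),
}


def query_shard(conn, origin):
    """Run the same query against a single shard."""
    return conn.execute(
        "SELECT origin, destination, duration FROM flights WHERE origin = ?",
        (origin,),
    ).fetchall()


def departures(origin):
    """Ask every shard at the same time, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        futures = [
            pool.submit(query_shard, conn, origin) for conn in shards.values()
        ]
        results = []
        for future in futures:
            results.extend(future.result())
    return results
```

The merge step in `departures` is application code here. A join across shards would need similar hand-written logic, which is part of the time cost mentioned above.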
1091 00:47:24,760 --> 00:47:28,150 Questions about anything with regards to database partitioning, 1092 00:47:28,150 --> 00:47:31,550 splitting data up? 1093 00:47:31,550 --> 00:47:32,050 All right. 1094 00:47:32,050 --> 00:47:33,800 So database partitioning, splitting data, 1095 00:47:33,800 --> 00:47:36,940 may help to make data more manageable, and it may help to speed up queries. 1096 00:47:36,940 --> 00:47:40,750 But it doesn't fully solve that single point of failure problem, the problem 1097 00:47:40,750 --> 00:47:43,180 of we have two servers that are both trying 1098 00:47:43,180 --> 00:47:45,400 to talk to the same database that has all the data. 1099 00:47:45,400 --> 00:47:47,680 Maybe I've partitioned the data to make our queries faster, 1100 00:47:47,680 --> 00:47:50,388 and so maybe our database can start to handle more and more users 1101 00:47:50,388 --> 00:47:51,307 than it could before. 1102 00:47:51,307 --> 00:47:53,140 But we're still dealing with the possibility 1103 00:47:53,140 --> 00:47:56,200 now that we have a single point of failure where that database can fail, 1104 00:47:56,200 --> 00:47:57,850 and suddenly nothing's going to work. 1105 00:47:57,850 --> 00:48:00,130 And we're still dealing with the possibility 1106 00:48:00,130 --> 00:48:03,340 that we might overload the database if we have 10 different servers that are 1107 00:48:03,340 --> 00:48:05,420 all trying to access the same database. 1108 00:48:05,420 --> 00:48:08,354 So what might we do now? 1109 00:48:08,354 --> 00:48:11,345 AUDIENCE: Wouldn't there be a database backup system somewhere? 1110 00:48:11,345 --> 00:48:11,970 BRIAN YU: Sure. 1111 00:48:11,970 --> 00:48:14,190 A database backup system would be a great idea. 1112 00:48:14,190 --> 00:48:16,856 And we'll often call this database replication, the idea that we 1113 00:48:16,856 --> 00:48:18,450 don't just want one copy of our data.
1114 00:48:18,450 --> 00:48:20,700 Maybe we want multiple copies of our data, two or even 1115 00:48:20,700 --> 00:48:22,950 three different copies of the same database 1116 00:48:22,950 --> 00:48:26,070 that we can, therefore, help to distribute load across. 1117 00:48:26,070 --> 00:48:29,340 In the same way that we could distribute load between different servers, 1118 00:48:29,340 --> 00:48:31,942 we can distribute load between the databases as well. 1119 00:48:31,942 --> 00:48:34,900 What problems start to come up now that we're duplicating our database? 1120 00:48:34,900 --> 00:48:37,900 Now we have three different copies of the database. 1121 00:48:37,900 --> 00:48:39,994 How do we deal with it? 1122 00:48:39,994 --> 00:48:44,470 AUDIENCE: You're going to get some servers' data-- 1123 00:48:44,470 --> 00:48:45,970 or database, yeah, they'll match up. 1124 00:48:45,970 --> 00:48:49,954 But if you have three databases, one database 1125 00:48:49,954 --> 00:48:51,944 might have more recent data than the other one. 1126 00:48:51,944 --> 00:48:52,610 BRIAN YU: Great. 1127 00:48:52,610 --> 00:48:54,318 So now we're dealing with the possibility 1128 00:48:54,318 --> 00:48:56,760 that server data might not match up with each other. 1129 00:48:56,760 --> 00:49:00,140 If I have three different databases, what happens if I update one database? 1130 00:49:00,140 --> 00:49:02,620 What happens to the other two databases, for instance? 1131 00:49:02,620 --> 00:49:04,000 What happens to that data? 1132 00:49:04,000 --> 00:49:07,826 So how might we resolve those sorts of problems? 1133 00:49:07,826 --> 00:49:10,700 That problem of we need to make sure that our three databases are all 1134 00:49:10,700 --> 00:49:12,810 in sync with each other. 
1135 00:49:12,810 --> 00:49:14,897 And we don't want to have one database have some data 1136 00:49:14,897 --> 00:49:16,730 and have another database not have that data 1137 00:49:16,730 --> 00:49:19,730 because then it will change the user's experience depending on which of the 1138 00:49:19,730 --> 00:49:22,339 databases they happen to try to access. 1139 00:49:22,339 --> 00:49:24,630 AUDIENCE: I mean, it seems like the database servers are 1140 00:49:24,630 --> 00:49:27,458 going to have to speak to each other, lock records, update records, 1141 00:49:27,458 --> 00:49:31,949 so do the same thing that was done to them [INAUDIBLE] 1142 00:49:31,949 --> 00:49:33,450 the other database servers. 1143 00:49:33,450 --> 00:49:33,700 BRIAN YU: Great. 1144 00:49:33,700 --> 00:49:36,790 So there's going to need to be some sort of communication between the servers. 1145 00:49:36,790 --> 00:49:38,800 And so we'll look at a couple of different models 1146 00:49:38,800 --> 00:49:41,091 for database replication that are quite common in order 1147 00:49:41,091 --> 00:49:42,670 to try and deal with these problems. 1148 00:49:42,670 --> 00:49:44,840 And so we'll look at single-primary replication 1149 00:49:44,840 --> 00:49:46,580 and multi-primary replication. 1150 00:49:46,580 --> 00:49:49,360 And so in the single-primary replication model 1151 00:49:49,360 --> 00:49:52,530 for database replication, what we have is 1152 00:49:52,530 --> 00:49:54,280 we have a single, as the name might imply, 1153 00:49:54,280 --> 00:49:56,634 a single database, which is called our primary database, 1154 00:49:56,634 --> 00:49:58,300 which would be this database right here. 1155 00:49:58,300 --> 00:50:01,409 And on this primary database, you could treat it like the single database 1156 00:50:01,409 --> 00:50:02,200 that we had before. 1157 00:50:02,200 --> 00:50:05,630 You can read data from it, and you can write data to it.
1158 00:50:05,630 --> 00:50:08,200 And we also have these two databases over here, which we're 1159 00:50:08,200 --> 00:50:10,030 going to call secondary databases. 1160 00:50:10,030 --> 00:50:13,000 And the idea of the secondary databases is that you can only 1161 00:50:13,000 --> 00:50:15,100 read data from a secondary database. 1162 00:50:15,100 --> 00:50:17,200 You can never update to the secondary database 1163 00:50:17,200 --> 00:50:19,030 or write to the secondary database. 1164 00:50:19,030 --> 00:50:22,240 You can only ever write, meaning update or add a row 1165 00:50:22,240 --> 00:50:25,000 or delete a row, to the primary database. 1166 00:50:25,000 --> 00:50:27,930 And you can select all the data you want from the other databases, 1167 00:50:27,930 --> 00:50:31,760 but you can't update or add or delete new rows. 1168 00:50:31,760 --> 00:50:33,650 So what's missing from this picture now? 1169 00:50:33,650 --> 00:50:37,721 What needs to happen any time a write to this database happens? 1170 00:50:37,721 --> 00:50:40,744 AUDIENCE: [INAUDIBLE] 1171 00:50:40,744 --> 00:50:41,410 BRIAN YU: Great. 1172 00:50:41,410 --> 00:50:44,530 We need this update mechanism, that whenever we write to this database 1173 00:50:44,530 --> 00:50:48,280 here, our primary database, our primary database needs to update this database 1174 00:50:48,280 --> 00:50:52,060 and update this database, tell the secondary databases that new data has 1175 00:50:52,060 --> 00:50:55,030 been added or removed or updated and changed in some way 1176 00:50:55,030 --> 00:50:58,370 in order to make sure that those new databases reflect those changes.
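The single-primary model described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: in-memory dictionaries stand in for real database servers, replication happens immediately and never fails, and the class and method names (PrimaryDatabase, SecondaryDatabase, apply_replicated_change) are hypothetical and not taken from the lecture or from any real database product.

```python
class SecondaryDatabase:
    """Read-only copy of the data; application code may only call read()."""

    def __init__(self):
        self._rows = {}

    def read(self, key):
        return self._rows.get(key)

    def apply_replicated_change(self, key, value):
        # Called by the primary only. None is used as a "row deleted" marker,
        # so this sketch cannot store None as a real value.
        if value is None:
            self._rows.pop(key, None)
        else:
            self._rows[key] = value


class PrimaryDatabase:
    """Accepts reads and writes, and pushes every change to its secondaries."""

    def __init__(self, secondaries):
        self._rows = {}
        self._secondaries = list(secondaries)

    def read(self, key):
        return self._rows.get(key)

    def write(self, key, value):
        self._rows[key] = value
        self._replicate(key, value)

    def delete(self, key):
        self._rows.pop(key, None)
        self._replicate(key, None)

    def _replicate(self, key, value):
        # The "update mechanism": tell every secondary about the change.
        for secondary in self._secondaries:
            secondary.apply_replicated_change(key, value)


secondaries = [SecondaryDatabase(), SecondaryDatabase()]
primary = PrimaryDatabase(secondaries)
primary.write("flight:1", {"duration": 120})
print([s.read("flight:1") for s in secondaries])
```

Reads can go to any of the three objects, while every write and delete goes through the primary, which is the property the model relies on to keep the copies in sync.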
1177 00:50:58,370 --> 00:51:03,010 And so under this model, we're able to implement 1178 00:51:03,010 --> 00:51:06,580 this idea of replicating the databases and making sure that the databases 1179 00:51:06,580 --> 00:51:09,970 are staying in sync because we're only ever able to make changes 1180 00:51:09,970 --> 00:51:11,500 on this database over here. 1181 00:51:11,500 --> 00:51:14,860 And when we do, that database is going to update the secondary databases. 1182 00:51:14,860 --> 00:51:19,647 It's going to make sure that those databases are aware of those changes. 1183 00:51:19,647 --> 00:51:21,230 What are some drawbacks of this model? 1184 00:51:21,230 --> 00:51:24,990 1185 00:51:24,990 --> 00:51:26,644 AUDIENCE: Timing. 1186 00:51:26,644 --> 00:51:27,810 BRIAN YU: Timing, certainly. 1187 00:51:27,810 --> 00:51:29,643 We might deal with potential race conditions 1188 00:51:29,643 --> 00:51:33,020 here, that if we write some data to this database 1189 00:51:33,020 --> 00:51:36,007 and we try and read it from some other database before there 1190 00:51:36,007 --> 00:51:38,090 has been time in order to make that update happen, 1191 00:51:38,090 --> 00:51:40,084 that can potentially cause problems. 1192 00:51:40,084 --> 00:51:43,048 AUDIENCE: It seems like in general reading is going to be pretty good. 1193 00:51:43,048 --> 00:51:47,000 But writing is what's going to really take a hit, especially 1194 00:51:47,000 --> 00:51:52,928 because that server that you're writing to has to use its resources to write 1195 00:51:52,928 --> 00:51:54,104 in the other databases. 1196 00:51:54,104 --> 00:51:54,770 BRIAN YU: Great. 1197 00:51:54,770 --> 00:51:58,280 So this model seems pretty good if we're doing 1198 00:51:58,280 --> 00:51:59,780 a lot of reading from our database. 1199 00:51:59,780 --> 00:52:02,180 So you might imagine that, depending upon the web application, 1200 00:52:02,180 --> 00:52:03,596 this might be a really good model.
1201 00:52:03,596 --> 00:52:10,550 If you imagine a model like a blog or a news website, where most of the time, 1202 00:52:10,550 --> 00:52:12,320 when someone's accessing the news website, 1203 00:52:12,320 --> 00:52:15,500 they're just reading the stories that are published on the news website. 1204 00:52:15,500 --> 00:52:19,170 And it's not like they're adding new stories constantly to the-- 1205 00:52:19,170 --> 00:52:21,550 there's fewer times that people are adding stories 1206 00:52:21,550 --> 00:52:23,758 than they are reading stories, so reads are happening 1207 00:52:23,758 --> 00:52:25,200 a lot more frequently than writes. 1208 00:52:25,200 --> 00:52:28,280 Then this might be a perfectly acceptable model, where we just 1209 00:52:28,280 --> 00:52:30,852 have a lot of databases that are reading but only 1210 00:52:30,852 --> 00:52:33,810 one database where we can actually write to because that's less common. 1211 00:52:33,810 --> 00:52:37,440 But if writes were more common, this model starts to look not as good. 1212 00:52:37,440 --> 00:52:40,830 We still have a single point of failure, that if this database breaks down, 1213 00:52:40,830 --> 00:52:43,700 now suddenly we're not able to do writes to the database. 1214 00:52:43,700 --> 00:52:47,460 And if a lot of people are trying to write to the database at the same time, 1215 00:52:47,460 --> 00:52:51,290 we're not able to distribute that load because the only place where we can do 1216 00:52:51,290 --> 00:52:56,490 writes is on this primary database over here and not on the secondary database. 
1217 00:52:56,490 --> 00:53:00,830 And so a solution to that, rather than just using single-primary replication, 1218 00:53:00,830 --> 00:53:04,796 will be multi-primary replication, which, as the name might suggest, 1219 00:53:04,796 --> 00:53:07,670 is where, instead of having a single primary database and some number 1220 00:53:07,670 --> 00:53:11,900 of secondary databases, we have multiple different primary databases where 1221 00:53:11,900 --> 00:53:14,610 for each one you can read and write data to each of them. 1222 00:53:14,610 --> 00:53:16,970 So now it's not just reads that can be distributed 1223 00:53:16,970 --> 00:53:18,530 across a number of different servers. 1224 00:53:18,530 --> 00:53:19,370 It's writes as well. 1225 00:53:19,370 --> 00:53:21,350 We can add rows, delete rows, update rows 1226 00:53:21,350 --> 00:53:26,220 across any of the servers in this multi-primary replication model. 1227 00:53:26,220 --> 00:53:27,350 So what's the catch? 1228 00:53:27,350 --> 00:53:31,310 Why might this be more of a challenge? 1229 00:53:31,310 --> 00:53:34,590 Or what are the trade-offs here? 1230 00:53:34,590 --> 00:53:38,398 AUDIENCE: It seems like if you drew all your arrows that were updating 1231 00:53:38,398 --> 00:53:39,826 in the back-end there-- 1232 00:53:39,826 --> 00:53:41,527 BRIAN YU: If we draw all the arrows-- 1233 00:53:41,527 --> 00:53:45,820 AUDIENCE: So it's like, what is the difference of having three versus one? 1234 00:53:45,820 --> 00:53:47,260 How would it start to look like? 1235 00:53:47,260 --> 00:53:47,510 BRIAN YU: Yeah. 1236 00:53:47,510 --> 00:53:47,770 Sure. 
1237 00:53:47,770 --> 00:53:49,950 So, certainly, once we start to draw all the arrows, 1238 00:53:49,950 --> 00:53:52,500 all the updates that have to happen, updates between all the different 1239 00:53:52,500 --> 00:53:55,416 databases going in both directions-- server one needs to update server 1240 00:53:55,416 --> 00:53:57,230 three, and three needs to update one-- 1241 00:53:57,230 --> 00:53:59,610 that this picture starts to get complicated. 1242 00:53:59,610 --> 00:54:03,340 And it starts to introduce potential problems that could come up. 1243 00:54:03,340 --> 00:54:06,515 So, certainly, one is that as we have more and more databases, now 1244 00:54:06,515 --> 00:54:08,640 we need to have more and more of these updates that 1245 00:54:08,640 --> 00:54:10,380 are happening with each other. 1246 00:54:10,380 --> 00:54:12,810 And what other problems can come up now that we 1247 00:54:12,810 --> 00:54:15,450 have this, all of these different updates 1248 00:54:15,450 --> 00:54:17,990 that are all trying to update each other? 1249 00:54:17,990 --> 00:54:18,630 Yeah. 1250 00:54:18,630 --> 00:54:23,040 AUDIENCE: If someone is trying to write to two databases at the same time, 1251 00:54:23,040 --> 00:54:27,064 then you might have duplicate information that doesn't match up. 1252 00:54:27,064 --> 00:54:27,730 BRIAN YU: Great. 1253 00:54:27,730 --> 00:54:30,730 What happens if two users are trying to edit, make updates 1254 00:54:30,730 --> 00:54:33,250 to two different databases at the same time and they 1255 00:54:33,250 --> 00:54:36,080 both register those updates and now try to update each other? 1256 00:54:36,080 --> 00:54:40,000 So there are a number of different ways that these conflicts can come about. 
1257 00:54:40,000 --> 00:54:42,880 One is a primary key conflict, where imagine 1258 00:54:42,880 --> 00:54:46,570 if there are 27 users inside of a users database, 1259 00:54:46,570 --> 00:54:51,220 and a user registers on this database, and a user registers on this database. 1260 00:54:51,220 --> 00:54:53,950 Well, user over here, they get added to the users table. 1261 00:54:53,950 --> 00:54:55,330 There were 27 users before. 1262 00:54:55,330 --> 00:54:57,464 So the new user's going to be user number 28. 1263 00:54:57,464 --> 00:54:59,380 And then over here, if the user is registering 1264 00:54:59,380 --> 00:55:02,470 at the same time, some different user, this database 1265 00:55:02,470 --> 00:55:04,150 also sees that there are 27 users. 1266 00:55:04,150 --> 00:55:07,390 And so it's also going to add this is now user number 28. 1267 00:55:07,390 --> 00:55:11,230 Now we have two different users that both have ID number 28. 1268 00:55:11,230 --> 00:55:14,440 And so when all the updates happen and they try to sync up with each other, 1269 00:55:14,440 --> 00:55:16,356 now we're going to run into potential problems 1270 00:55:16,356 --> 00:55:18,854 because now we have two rows that have the same ID field. 1271 00:55:18,854 --> 00:55:20,770 And that's not allowable because our ID field, 1272 00:55:20,770 --> 00:55:23,210 presumably, is supposed to be unique. 1273 00:55:23,210 --> 00:55:25,180 So that's one potential problem. 1274 00:55:25,180 --> 00:55:28,660 Other potential problems include two different databases 1275 00:55:28,660 --> 00:55:31,660 trying to update the same row at the same time, for instance. 1276 00:55:31,660 --> 00:55:35,440 If they're both trying to change the duration of a flight, for instance, 1277 00:55:35,440 --> 00:55:37,910 and one wants to change it to 120 minutes 1278 00:55:37,910 --> 00:55:41,830 and one is trying to change it to 150 minutes, and now 1279 00:55:41,830 --> 00:55:44,920 which one of those databases should we listen to?
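The primary key conflict with user number 28 can be reproduced with a small Python sketch. It assumes, purely for illustration, that each primary is a dictionary keyed by user ID and that new IDs are assigned by a naive "largest ID plus one" rule; the helper next_user_id and the names alice and bob are hypothetical.

```python
def next_user_id(users):
    """Naive auto-increment: one more than the largest existing ID."""
    return max(users, default=0) + 1


# Two primaries start out in sync with the same 27 users.
primary_a = {user_id: f"user{user_id}" for user_id in range(1, 28)}
primary_b = dict(primary_a)

# Two different people register at the same time, one on each primary,
# before either primary has heard about the other's new row.
id_a = next_user_id(primary_a)
primary_a[id_a] = "alice"
id_b = next_user_id(primary_b)
primary_b[id_b] = "bob"

# When the primaries try to sync, the same ID refers to two different rows.
conflicting_ids = sorted(
    user_id
    for user_id in primary_a.keys() & primary_b.keys()
    if primary_a[user_id] != primary_b[user_id]
)
print(conflicting_ids)  # [28]
```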
1280 00:55:44,920 --> 00:55:47,414 And all sorts of other problems could come up. 1281 00:55:47,414 --> 00:55:50,080 If someone tries to, for instance, delete a row at the same time 1282 00:55:50,080 --> 00:55:52,381 that someone else is trying to edit that same row, 1283 00:55:52,381 --> 00:55:54,880 should the edit take precedence over the delete and keep it? 1284 00:55:54,880 --> 00:55:56,680 Or do we delete it and ignore the edit? 1285 00:55:56,680 --> 00:55:59,620 All of these are conflicts that, ultimately, whatever 1286 00:55:59,620 --> 00:56:02,230 multi-primary replication system you're trying to use 1287 00:56:02,230 --> 00:56:03,969 needs to have rules for how to deal with, 1288 00:56:03,969 --> 00:56:06,760 some systematic way of saying, all right, if these two edits happen 1289 00:56:06,760 --> 00:56:10,660 at the same time, then we should need some mechanism 1290 00:56:10,660 --> 00:56:12,370 of trying to resolve those edits. 1291 00:56:12,370 --> 00:56:15,490 Maybe if they're editing different columns, then it's fine. 1292 00:56:15,490 --> 00:56:17,470 Just update both columns for both rows. 1293 00:56:17,470 --> 00:56:20,980 But if they're editing the same column of the same row, 1294 00:56:20,980 --> 00:56:24,670 then maybe check the time at which it happened and go with the more recent. 1295 00:56:24,670 --> 00:56:26,920 And so there are any number of different rules 1296 00:56:26,920 --> 00:56:30,260 that might get increasingly more complex or sophisticated that come into play. 1297 00:56:30,260 --> 00:56:33,190 But the idea is that the additional complexity 1298 00:56:33,190 --> 00:56:35,620 that we face with multi-primary replication 1299 00:56:35,620 --> 00:56:38,770 is that we need some mechanism for resolving those conflicts. 
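The two example rules just mentioned (apply both edits when they touch different columns, otherwise keep the more recent one) might look like this in Python. This is a sketch of those two rules only, not of any particular database's conflict resolution; the Edit type and resolve function are hypothetical, and it assumes timestamps from different servers can be compared directly, which real systems cannot take for granted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    column: str
    value: object
    timestamp: float  # assumed comparable across servers


def resolve(row, edit_a, edit_b):
    """Return a new row with two concurrent edits reconciled."""
    merged = dict(row)
    if edit_a.column != edit_b.column:
        # Different columns: no real conflict, so apply both edits.
        merged[edit_a.column] = edit_a.value
        merged[edit_b.column] = edit_b.value
    else:
        # Same column: keep the more recent edit (ties go to edit_a,
        # an arbitrary choice made for this sketch).
        winner = edit_a if edit_a.timestamp >= edit_b.timestamp else edit_b
        merged[winner.column] = winner.value
    return merged


flight = {"origin": "JFK", "duration": 100}
print(resolve(flight, Edit("duration", 120, 1.0), Edit("duration", 150, 2.0)))
```

With the flight example from earlier, the edit to 150 minutes wins because it carries the later timestamp.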
1300 00:56:38,770 --> 00:56:41,260 We need some way of saying, if two databases that 1301 00:56:41,260 --> 00:56:44,457 are trying to perform updates and those updates conflict with each other, 1302 00:56:44,457 --> 00:56:46,040 how should we deal with those updates? 1303 00:56:46,040 --> 00:56:48,340 And we need rules for how to do that. 1304 00:56:48,340 --> 00:56:52,960 Questions about either single-primary or multi-primary replication or the idea 1305 00:56:52,960 --> 00:56:54,550 of database replication in general? 1306 00:56:54,550 --> 00:56:57,880 1307 00:56:57,880 --> 00:56:58,780 OK. 1308 00:56:58,780 --> 00:57:02,440 So one more topic that we'll talk about with regards to trying to scale up data 1309 00:57:02,440 --> 00:57:03,220 is caching. 1310 00:57:03,220 --> 00:57:06,370 And this is something that will become-- that's very useful as data 1311 00:57:06,370 --> 00:57:07,330 begins to scale. 1312 00:57:07,330 --> 00:57:12,529 And this is all about trying to avoid needing to spend too much time doing 1313 00:57:12,529 --> 00:57:13,820 things that we've already done. 1314 00:57:13,820 --> 00:57:16,720 So the idea of caching is taking data and information 1315 00:57:16,720 --> 00:57:19,960 and storing it in some temporary place for usage later. 1316 00:57:19,960 --> 00:57:23,410 You might imagine that on The New York Times website, 1317 00:57:23,410 --> 00:57:25,990 for instance, the home page of The New York Times 1318 00:57:25,990 --> 00:57:29,817 probably isn't changing too much from one second to the next second. 1319 00:57:29,817 --> 00:57:32,400 Sure, after some number of minutes, a new article might go up, 1320 00:57:32,400 --> 00:57:34,180 and the front page might change. 1321 00:57:34,180 --> 00:57:36,190 But if you load the page and then refresh 1322 00:57:36,190 --> 00:57:38,430 the page, the page that you get again, it's 1323 00:57:38,430 --> 00:57:41,500 probably going to be the exact same page in all likelihood. 
1324 00:57:41,500 --> 00:57:44,080 And it probably wouldn't make a whole lot of sense, 1325 00:57:44,080 --> 00:57:47,410 then, for every time someone tries to request the front page of The New York 1326 00:57:47,410 --> 00:57:50,650 Times, for The New York Times to go to its database 1327 00:57:50,650 --> 00:57:55,330 and look up what the most popular recent articles are and look up the latest 1328 00:57:55,330 --> 00:57:58,627 images and what the trending news is and then regenerate that whole page 1329 00:57:58,627 --> 00:58:00,460 and then present it back to you because it's 1330 00:58:00,460 --> 00:58:02,860 going to have to do that for you every time you make a request 1331 00:58:02,860 --> 00:58:04,693 and do it for every other user who is trying 1332 00:58:04,693 --> 00:58:07,550 to access the front page of The New York Times every single time. 1333 00:58:07,550 --> 00:58:10,720 And so what might be a good idea is introducing some idea of caching-- 1334 00:58:10,720 --> 00:58:15,850 the idea of saving what the front page looked like such that 1335 00:58:15,850 --> 00:58:18,226 if a user comes back in a couple seconds, 1336 00:58:18,226 --> 00:58:20,350 and the page hasn't changed, then go ahead and just 1337 00:58:20,350 --> 00:58:22,460 present the same page, for instance. 1338 00:58:22,460 --> 00:58:26,226 So there are multiple different ways by which caching can happen. 1339 00:58:26,226 --> 00:58:28,600 Caching can exist on the client side and the server side, 1340 00:58:28,600 --> 00:58:29,891 and we'll look at both of them. 1341 00:58:29,891 --> 00:58:31,600 And we'll start with client-side caching. 1342 00:58:31,600 --> 00:58:34,390 And this is something you might already have some familiarity with. 
1343 00:58:34,390 --> 00:58:37,300 That if you've been working with JavaScript files in project two, 1344 00:58:37,300 --> 00:58:40,210 for instance, and you've made edits to your JavaScript file, 1345 00:58:40,210 --> 00:58:43,880 and then you check your web page, what sometimes happens? 1346 00:58:43,880 --> 00:58:44,380 Yeah. 1347 00:58:44,380 --> 00:58:46,170 AUDIENCE: You still get the old JavaScript. 1348 00:58:46,170 --> 00:58:47,400 BRIAN YU: You still get the old JavaScript file. 1349 00:58:47,400 --> 00:58:47,660 Right. 1350 00:58:47,660 --> 00:58:50,080 Even though you've made changes to your JavaScript file 1351 00:58:50,080 --> 00:58:53,409 and you've saved those changes, when on your web browser you refresh the page 1352 00:58:53,409 --> 00:58:56,200 or go back to that page that's supposed to have the new JavaScript, 1353 00:58:56,200 --> 00:58:58,390 you still get the old JavaScript because-- 1354 00:58:58,390 --> 00:59:00,640 and the reason for that is client-side caching. 1355 00:59:00,640 --> 00:59:03,310 Your web browser has saved the old JavaScript file, 1356 00:59:03,310 --> 00:59:06,520 and it's just assuming that that file probably hasn't changed. 1357 00:59:06,520 --> 00:59:10,450 And therefore, rather than go through the additional time 1358 00:59:10,450 --> 00:59:13,510 expense of ask the server to send me the JavaScript file, 1359 00:59:13,510 --> 00:59:16,090 get the JavaScript file back, it's just looking locally 1360 00:59:16,090 --> 00:59:18,580 to its own cache, which is faster to access 1361 00:59:18,580 --> 00:59:21,260 and just using that JavaScript file instead. 
1362 00:59:21,260 --> 00:59:24,799 And so while that might be an annoying use case of caching, in practice 1363 00:59:24,799 --> 00:59:26,590 it's actually quite helpful if we ever want 1364 00:59:26,590 --> 00:59:30,010 to have some resource that is going to persist for some amount of time, 1365 00:59:30,010 --> 00:59:34,034 something that we want to be kept inside of the cache, 1366 00:59:34,034 --> 00:59:36,200 because it's probably not going to change too often. 1367 00:59:36,200 --> 00:59:40,300 And so inside of an HTTP response, when the web server 1368 00:59:40,300 --> 00:59:43,900 responds back to the user and presents it with the body of the response, 1369 00:59:43,900 --> 00:59:46,180 it contains the page to actually load. 1370 00:59:46,180 --> 00:59:51,672 The server also responds with HTTP headers, information about the request 1371 00:59:51,672 --> 00:59:53,380 that the client web browser, whether it's 1372 00:59:53,380 --> 00:59:56,200 Chrome or Safari or something else, knows how to interpret 1373 00:59:56,200 --> 00:59:57,520 and knows how to understand. 1374 00:59:57,520 --> 01:00:01,750 And so one of those headers might be this cache-control header. 1375 01:00:01,750 --> 01:00:04,630 And what the cache-control HTTP header is allowed to do 1376 01:00:04,630 --> 01:00:10,310 is it's allowed to set in the most basic case a maximum age for the page. 1377 01:00:10,310 --> 01:00:13,840 In other words, specify after this number of seconds-- 1378 01:00:13,840 --> 01:00:16,570 or in this case, one day, I believe-- 1379 01:00:16,570 --> 01:00:20,230 that's when you should, if you're requesting the page again, 1380 01:00:20,230 --> 01:00:21,940 actually see if something has changed. 1381 01:00:21,940 --> 01:00:24,070 But within this amount of time, the page probably 1382 01:00:24,070 --> 01:00:28,570 hasn't changed, so don't worry about trying to access it again if you're 1383 01:00:28,570 --> 01:00:30,250 trying to load the same page again. 
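As a rough illustration of the maximum-age idea, the header in question would look something like "Cache-Control: max-age=86400", where 86400 is the number of seconds in one day. The Python sketch below shows the comparison a cache makes before reusing a stored copy. The function names are hypothetical, and real browsers apply many more rules than this single check.

```python
import re


def max_age_seconds(cache_control):
    """Extract the max-age value from a Cache-Control header value, or None."""
    match = re.search(r"max-age=(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control, stored_at, now):
    """True if a copy stored at `stored_at` may still be reused at `now`.

    Both times are in seconds on the same clock.
    """
    max_age = max_age_seconds(cache_control)
    if max_age is None:
        # Simplification for this sketch: without max-age, always ask the server.
        return False
    return now - stored_at < max_age


one_day = "max-age=86400"
print(is_fresh(one_day, stored_at=0, now=3 * 60 * 60))    # three hours later
print(is_fresh(one_day, stored_at=0, now=25 * 60 * 60))   # twenty-five hours later
```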
1384 01:00:30,250 --> 01:00:32,350 Just use the cached version of it. 1385 01:00:32,350 --> 01:00:36,130 And so by putting a line like this inside of your HTTP header-- and web 1386 01:00:36,130 --> 01:00:38,320 frameworks like Flask and Django have ways 1387 01:00:38,320 --> 01:00:42,466 of allowing you to edit what goes into the header, and you can set those. 1388 01:00:42,466 --> 01:00:44,590 And you can look at Flask or Django's documentation 1389 01:00:44,590 --> 01:00:46,000 for looking at how to do that. 1390 01:00:46,000 --> 01:00:48,130 But you can say to the web browser, go ahead 1391 01:00:48,130 --> 01:00:51,160 and save this page in the cache for a day or so such that 1392 01:00:51,160 --> 01:00:54,310 if you come back in a couple hours, no need to contact the server again, 1393 01:00:54,310 --> 01:00:57,460 which might add additional load to the server when it's unnecessary. 1394 01:00:57,460 --> 01:01:00,740 Just go ahead and load that same page. 1395 01:01:00,740 --> 01:01:04,820 So what problems can happen here with cache, with caching on the client side, 1396 01:01:04,820 --> 01:01:06,600 having the web browser cache the page? 1397 01:01:06,600 --> 01:01:12,200 1398 01:01:12,200 --> 01:01:12,700 Yeah. 1399 01:01:12,700 --> 01:01:15,340 AUDIENCE: If the page changes sooner, then you wouldn't know. 1400 01:01:15,340 --> 01:01:15,520 BRIAN YU: Yeah. 1401 01:01:15,520 --> 01:01:16,020 Sure. 1402 01:01:16,020 --> 01:01:19,420 Certainly, if the page changes sooner than this amount of time, then 1403 01:01:19,420 --> 01:01:21,429 when the user tries to go back to that page, 1404 01:01:21,429 --> 01:01:24,220 there's a good chance that they're still going to see the old page, 1405 01:01:24,220 --> 01:01:26,761 that they'll get the old page, and it will load very quickly. 1406 01:01:26,761 --> 01:01:29,489 But whatever the newer version was, they're not going to see it. 1407 01:01:29,489 --> 01:01:31,280 And, certainly, there are ways around this. 
1408 01:01:31,280 --> 01:01:34,270 You can hard refresh the page, which usually is going to try and clear 1409 01:01:34,270 --> 01:01:35,410 the cache and just say-- 1410 01:01:35,410 --> 01:01:38,440 really try and access the page by actually talking to the server. 1411 01:01:38,440 --> 01:01:41,720 And so that's something you can do as well. 1412 01:01:41,720 --> 01:01:45,130 But what about the case where maybe this is saying 1413 01:01:45,130 --> 01:01:46,780 you can cache the page for a day? 1414 01:01:46,780 --> 01:01:49,600 What if the next day the page hasn't changed 1415 01:01:49,600 --> 01:01:52,060 or three days later the page still hasn't changed? 1416 01:01:52,060 --> 01:01:55,900 Under this model, we would still go back to the server and say, a day's up, 1417 01:01:55,900 --> 01:01:58,510 so it means that I can't use the cached page anymore. 1418 01:01:58,510 --> 01:02:00,580 I need to go to the server and ask for it again. 1419 01:02:00,580 --> 01:02:02,330 But imagine if it's some big file. 1420 01:02:02,330 --> 01:02:06,740 It's a video or some other large file that might take a long time. 1421 01:02:06,740 --> 01:02:08,684 It wouldn't make a whole lot of sense if we 1422 01:02:08,684 --> 01:02:10,600 were trying to redownload it again and again 1423 01:02:10,600 --> 01:02:12,580 and again just because the cache was up. 1424 01:02:12,580 --> 01:02:16,450 And so what's a way that we might be able to enforce 1425 01:02:16,450 --> 01:02:22,720 this idea that the server can send new data if there's been a change 1426 01:02:22,720 --> 01:02:24,631 but doesn't need to? 1427 01:02:24,631 --> 01:02:25,130 Yeah. 1428 01:02:25,130 --> 01:02:29,507 AUDIENCE: Have some ID that every time the server makes a change that they 1429 01:02:29,507 --> 01:02:33,220 can, for example, increment that ID and then just use the headers first 1430 01:02:33,220 --> 01:02:37,254 to see if you have-- if your headers match up, then don't [INAUDIBLE].
1431 01:02:37,254 --> 01:02:37,920 BRIAN YU: Great. 1432 01:02:37,920 --> 01:02:40,290 So we can use some kind of identifier, some identifier that's 1433 01:02:40,290 --> 01:02:43,080 associated with the resource, whether it's a web page or a video 1434 01:02:43,080 --> 01:02:46,170 or something else, where any time that resource is updated, 1435 01:02:46,170 --> 01:02:47,460 we update that header. 1436 01:02:47,460 --> 01:02:50,730 And in HTTP, we call that header an ETag, or an entity 1437 01:02:50,730 --> 01:02:54,330 tag, which can just be a really long hexadecimal sequence, a sequence 1438 01:02:54,330 --> 01:02:57,570 of numbers and characters, where that is going 1439 01:02:57,570 --> 01:03:01,272 to be uniquely associated with a particular version of a resource. 1440 01:03:01,272 --> 01:03:02,730 That if the resource gets updated-- 1441 01:03:02,730 --> 01:03:04,620 I update the page, or I update the video-- 1442 01:03:04,620 --> 01:03:06,760 then this ETag is going to change. 1443 01:03:06,760 --> 01:03:10,670 And so now how can we use the ETag to implement caching 1444 01:03:10,670 --> 01:03:12,420 or to implement the idea that I don't need 1445 01:03:12,420 --> 01:03:15,510 to redownload the page every time because of this ETag? 1446 01:03:15,510 --> 01:03:16,350 What can I do? 1447 01:03:16,350 --> 01:03:16,850 Yeah. 1448 01:03:16,850 --> 01:03:20,203 AUDIENCE: Every time you're doing a GET request to a server, send the ETag 1449 01:03:20,203 --> 01:03:23,556 that you have, and then the server, if it matches up, 1450 01:03:23,556 --> 01:03:25,472 it'll tell you like, no need to reload. 1451 01:03:25,472 --> 01:03:27,484 Otherwise, it will send you the new file. 1452 01:03:27,484 --> 01:03:28,150 BRIAN YU: Great.
1453 01:03:28,150 --> 01:03:31,420 So when the user is trying to request the page, 1454 01:03:31,420 --> 01:03:34,780 the user can send along the ETag with the request, say, here is the ETag, 1455 01:03:34,780 --> 01:03:37,270 here's the version of the request that I have. 1456 01:03:37,270 --> 01:03:39,454 And the server can look at that ETag and say, 1457 01:03:39,454 --> 01:03:41,870 does this match up with the latest version of the resource 1458 01:03:41,870 --> 01:03:43,310 that I have on the server? 1459 01:03:43,310 --> 01:03:45,100 And if it does match up, then rather than 1460 01:03:45,100 --> 01:03:50,120 send the whole contents of the page, rather than send the whole video again, 1461 01:03:50,120 --> 01:03:53,620 the server can just respond with usually a 305 status code-- 1462 01:03:53,620 --> 01:03:56,290 304, which stands for not modified-- just to say there's 1463 01:03:56,290 --> 01:03:58,780 been no change to the content you're trying to request. 1464 01:03:58,780 --> 01:04:00,820 Go ahead and just use your cached version. 1465 01:04:00,820 --> 01:04:03,730 It's still fresh, and it's not stale, as we'll often 1466 01:04:03,730 --> 01:04:05,590 call it with regards to caching. 1467 01:04:05,590 --> 01:04:08,290 And the result of that is the response can happen quickly. 1468 01:04:08,290 --> 01:04:10,150 The server doesn't have to get too loaded. 1469 01:04:10,150 --> 01:04:12,499 And the client can know the ETag is the same. 1470 01:04:12,499 --> 01:04:13,540 The resource is the same. 1471 01:04:13,540 --> 01:04:15,640 I can just use the version in the cache.
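The ETag exchange described above can be sketched in Python without any web framework. In HTTP the client sends the tag it holds in the If-None-Match request header, and the server answers 304 Not Modified when it matches. Everything else here is an assumption made for the sketch: the function names are hypothetical, and deriving the tag from a SHA-256 hash of the content is just one possible way to get a value that changes whenever the resource changes.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the content, so edits produce a new tag."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, payload) for a GET of the resource `body`.

    `if_none_match` is the ETag the client cached earlier, if any.
    """
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        # The client's copy is current: send no body at all.
        return 304, headers, b""
    return 200, headers, body


# First visit: no cached copy, so the full resource comes back.
status, headers, payload = respond(b"front page, version 1")
cached_etag = headers["ETag"]

# Second visit, resource unchanged: only a short 304 response is needed.
print(respond(b"front page, version 1", if_none_match=cached_etag)[0])

# Third visit, resource changed: the tags differ, so a fresh copy is sent.
print(respond(b"front page, version 2", if_none_match=cached_etag)[0])
```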
1472 01:04:15,640 --> 01:04:17,860 Of course, on the flip side of things, if the user 1473 01:04:17,860 --> 01:04:21,110 sends along an ETag with the request saying, I'm requesting this page, 1474 01:04:21,110 --> 01:04:24,220 here's the ETag of the last time that I visited this page, 1475 01:04:24,220 --> 01:04:27,640 if the page has changed and the server detects that, OK, wait a minute, 1476 01:04:27,640 --> 01:04:31,630 this ETag is different from the ETag of the latest version of the resource, 1477 01:04:31,630 --> 01:04:34,720 now the server knows that we need to give 1478 01:04:34,720 --> 01:04:36,520 the user a fresh copy of that resource. 1479 01:04:36,520 --> 01:04:39,490 And the server can now do that processing, get the resource, 1480 01:04:39,490 --> 01:04:40,880 and deliver it to the user. 1481 01:04:40,880 --> 01:04:43,540 And so this client-side caching serves two benefits, really. 1482 01:04:43,540 --> 01:04:48,370 Number one is that it's faster for the user, that from the user perspective 1483 01:04:48,370 --> 01:04:50,429 they can often see the resource load faster 1484 01:04:50,429 --> 01:04:52,720 because it's loading from their own computer as opposed 1485 01:04:52,720 --> 01:04:56,045 to having to be transferred over the internet from one server to the client. 1486 01:04:56,045 --> 01:04:58,420 And on the other side, it helps from the load perspective 1487 01:04:58,420 --> 01:05:01,544 that if you have hundreds of users and are all trying to access your server 1488 01:05:01,544 --> 01:05:03,320 and access your database at the same time, 1489 01:05:03,320 --> 01:05:05,920 any time you can tell some subset of those users, 1490 01:05:05,920 --> 01:05:08,470 you don't really need to access the server or the database, 1491 01:05:08,470 --> 01:05:10,910 you can just use a version of the site you already have, 1492 01:05:10,910 --> 01:05:12,040 that's going to be a benefit to you. 
1493 01:05:12,040 --> 01:05:13,930 That's going to be less load on your website. 1494 01:05:13,930 --> 01:05:17,710 That's going to help you as you think about scaling your website. 1495 01:05:17,710 --> 01:05:19,960 Questions about client-side caching? 1496 01:05:19,960 --> 01:05:22,960 1497 01:05:22,960 --> 01:05:23,460 All right. 1498 01:05:23,460 --> 01:05:25,500 So let's talk about server-side caching, which 1499 01:05:25,500 --> 01:05:27,120 is another place where caches can be. 1500 01:05:27,120 --> 01:05:29,850 And caches can exist all throughout this entire process, 1501 01:05:29,850 --> 01:05:31,290 whether they're large or small. 1502 01:05:31,290 --> 01:05:33,020 They can be located in different places. 1503 01:05:33,020 --> 01:05:37,830 And one thing we didn't mention was that if you have a cache that 1504 01:05:37,830 --> 01:05:40,230 maybe works for your entire network-- 1505 01:05:40,230 --> 01:05:42,550 actually, we'll talk about that one more time. 1506 01:05:42,550 --> 01:05:47,394 So imagine that you have some cache that's working for your local network, 1507 01:05:47,394 --> 01:05:49,560 for instance, your computer and other computers that 1508 01:05:49,560 --> 01:05:52,830 are all connected to the same network. 1509 01:05:52,830 --> 01:05:58,080 What could go wrong with something like cache-control or the ETag 1510 01:05:58,080 --> 01:06:02,628 where you might not want for the page to be cached? 1511 01:06:02,628 --> 01:06:03,765 AUDIENCE: Security. 1512 01:06:03,765 --> 01:06:05,140 BRIAN YU: Security reasons, sure. 
1513 01:06:05,140 --> 01:06:08,200 You might imagine that there's a difference between public websites 1514 01:06:08,200 --> 01:06:12,910 and private websites or private pages, that if facebook.com, for instance, 1515 01:06:12,910 --> 01:06:15,130 were something that were just consistently cached, 1516 01:06:15,130 --> 01:06:19,870 then if I visited Facebook and saw my news feed and it was cached, 1517 01:06:19,870 --> 01:06:23,260 then I wouldn't want it for-- if someone else on my network or someone else 1518 01:06:23,260 --> 01:06:26,469 using my computer were to also go to Facebook, if they were to-- 1519 01:06:26,469 --> 01:06:29,260 on their account, if they were to also see the same content of what 1520 01:06:29,260 --> 01:06:31,301 I just saw because it was pulling from the cache. 1521 01:06:31,301 --> 01:06:34,150 And so inside this cache-control header, you additionally 1522 01:06:34,150 --> 01:06:36,969 have the option of specifying do I want this to be a public cache, 1523 01:06:36,969 --> 01:06:39,260 meaning a page that anyone can see, or a private cache, 1524 01:06:39,260 --> 01:06:40,600 which should be authenticated. 1525 01:06:40,600 --> 01:06:43,780 And so there are additional settings inside the cache-control that you 1526 01:06:43,780 --> 01:06:46,270 can set in order to make sure the cache is behaving the way 1527 01:06:46,270 --> 01:06:47,170 that you want it to behave. 1528 01:06:47,170 --> 01:06:49,086 We won't go into too many of the details here. 1529 01:06:49,086 --> 01:06:52,570 But know that you have that kind of control and flexibility over the cache 1530 01:06:52,570 --> 01:06:55,870 just by setting it inside of the headers of the HTTP response 1531 01:06:55,870 --> 01:06:57,930 that you're sending back to the user. 1532 01:06:57,930 --> 01:06:59,820 But back to server-side caching.
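As a small sketch of the public versus private setting mentioned above: the function name `cache_control_for` and the default `max_age_seconds` value are assumptions made for illustration. In the HTTP caching rules, `private` tells shared caches (such as one serving a whole network) not to store the response, while the user's own browser still may; `public` allows any cache to store it.

```python
def cache_control_for(personalized: bool, max_age_seconds: int = 60) -> str:
    """Build a Cache-Control header value for a response.

    personalized: True for pages that differ per user (e.g. a news feed),
    which a shared network cache must not hand to someone else.
    """
    visibility = "private" if personalized else "public"
    return f"{visibility}, max-age={max_age_seconds}"
```

A per-user page would be sent with `Cache-Control: private, max-age=60`, while a page that looks the same for everyone could use `public` with a longer lifetime.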
1533 01:06:59,820 --> 01:07:01,570 So the idea of server-side caching now is, 1534 01:07:01,570 --> 01:07:04,300 instead of having the cache stored locally 1535 01:07:04,300 --> 01:07:07,090 on the computer of the user inside of Chrome or Safari 1536 01:07:07,090 --> 01:07:10,090 or whatever web browser they're using, we can add to our model 1537 01:07:10,090 --> 01:07:12,760 here, where in addition to having servers that are all 1538 01:07:12,760 --> 01:07:15,040 talking to the database, have all the servers that 1539 01:07:15,040 --> 01:07:16,446 are now connected to the cache. 1540 01:07:16,446 --> 01:07:19,570 Now, of course, we have a whole bunch of new single points of failure here. 1541 01:07:19,570 --> 01:07:23,200 Our database is a single point of failure, and so is our cache. 1542 01:07:23,200 --> 01:07:26,812 But why might we want to add a cache to this image? 1543 01:07:26,812 --> 01:07:28,270 It certainly complicates the image. 1544 01:07:28,270 --> 01:07:30,890 But what benefit do we get from it? 1545 01:07:30,890 --> 01:07:31,544 Yeah, sure. 1546 01:07:31,544 --> 01:07:34,508 AUDIENCE: So if something is common, then you still write to the cache 1547 01:07:34,508 --> 01:07:37,142 so you don't have to hit the database because the-- well, 1548 01:07:37,142 --> 01:07:39,584 the cache is faster than the database. 1549 01:07:39,584 --> 01:07:40,250 BRIAN YU: Great.
1550 01:07:40,250 --> 01:07:42,110 So the cache is likely going to be faster 1551 01:07:42,110 --> 01:07:44,030 than the database in certain respects, usually 1552 01:07:44,030 --> 01:07:46,760 because if we're trying to render something complicated, 1553 01:07:46,760 --> 01:07:49,430 like the front page of The New York Times 1554 01:07:49,430 --> 01:07:53,870 or you imagine Amazon has a page that shows the most popular books, 1555 01:07:53,870 --> 01:07:57,260 it might take a fair amount of energy and computational resources 1556 01:07:57,260 --> 01:07:59,550 to query the most popular books from the database. 1557 01:07:59,550 --> 01:08:00,050 Right? 1558 01:08:00,050 --> 01:08:01,883 There's some algorithm involved whereby it's 1559 01:08:01,883 --> 01:08:04,764 going to look at books that have been purchased frequently recently. 1560 01:08:04,764 --> 01:08:07,430 So it might need to look for orders and look at different books, 1561 01:08:07,430 --> 01:08:09,138 and it might be multiple different tables 1562 01:08:09,138 --> 01:08:12,890 that have to be queried in order to get what are the top 10 most popular books. 1563 01:08:12,890 --> 01:08:15,500 And that's going to be an expensive database operation. 
1564 01:08:15,500 --> 01:08:19,119 Whereas once you've gotten those 10 most popular books the first time, 1565 01:08:19,119 --> 01:08:21,410 one thing you can do is just take all that information, 1566 01:08:21,410 --> 01:08:25,189 put it inside the cache, and store it there, whereby on future instances, 1567 01:08:25,189 --> 01:08:28,596 if a user comes by within the next couple seconds and says, 1568 01:08:28,596 --> 01:08:30,470 I want to see the Amazon home page and I want 1569 01:08:30,470 --> 01:08:32,729 to see what the 10 most popular books are, 1570 01:08:32,729 --> 01:08:35,479 rather than repeat those queries again and go back to the database 1571 01:08:35,479 --> 01:08:37,939 and query that information, we can just look to the cache 1572 01:08:37,939 --> 01:08:41,840 where we've stored potentially in a file somewhere here are the 10 most popular 1573 01:08:41,840 --> 01:08:42,620 books. 1574 01:08:42,620 --> 01:08:44,720 And we can just use that cache information 1575 01:08:44,720 --> 01:08:47,977 to display that information back to the user. 1576 01:08:47,977 --> 01:08:49,310 So what drawbacks come up there? 1577 01:08:49,310 --> 01:08:51,309 What are the trade-offs we face when we do that? 1578 01:08:51,309 --> 01:08:54,542 1579 01:08:54,542 --> 01:08:57,084 AUDIENCE: You need to take care of when to update the cache. 1580 01:08:57,084 --> 01:08:57,750 BRIAN YU: Great. 1581 01:08:57,750 --> 01:08:59,609 So any time we're dealing with the cache, 1582 01:08:59,609 --> 01:09:01,979 we always have these issues of cache invalidation. 1583 01:09:01,979 --> 01:09:05,430 What happens when data inside of the database 1584 01:09:05,430 --> 01:09:07,979 is more recent than data inside of the cache? 1585 01:09:07,979 --> 01:09:09,670 How do we deal with that type of thing? 1586 01:09:09,670 --> 01:09:11,350 And so multiple ways that we could do that. 1587 01:09:11,350 --> 01:09:12,933 How could we deal with that situation? 
1588 01:09:12,933 --> 01:09:16,439 What are some ideas for how to deal with a problem 1589 01:09:16,439 --> 01:09:20,069 where maybe the 10 most recent, most popular books 1590 01:09:20,069 --> 01:09:23,880 are no longer valid because a bunch of people bought book number 11, 1591 01:09:23,880 --> 01:09:26,340 and now that's the new 10th most popular book? 1592 01:09:26,340 --> 01:09:30,170 And so what's in the cache is no longer valid. 1593 01:09:30,170 --> 01:09:30,670 Strategies? 1594 01:09:30,670 --> 01:09:31,420 There are multiple. 1595 01:09:31,420 --> 01:09:31,920 Yeah. 1596 01:09:31,920 --> 01:09:34,636 AUDIENCE: So you don't care about the recent database. 1597 01:09:34,636 --> 01:09:36,594 But if someone writes in the database, then you 1598 01:09:36,594 --> 01:09:38,170 can update the cache [INAUDIBLE]. 1599 01:09:38,170 --> 01:09:38,439 BRIAN YU: Great. 1600 01:09:38,439 --> 01:09:41,630 So we could add logic that says that when we're writing to the database, 1601 01:09:41,630 --> 01:09:45,490 if we place a new order, then we should also make sure that the cache gets 1602 01:09:45,490 --> 01:09:48,850 updated, that we invalidate any old information in the cache, get rid of it 1603 01:09:48,850 --> 01:09:51,850 such that the next time the user makes a request, we're doing that anew. 1604 01:09:51,850 --> 01:09:54,460 1605 01:09:54,460 --> 01:09:57,247 So depending on the system that you have and depending 1606 01:09:57,247 --> 01:10:00,580 on what types of reads and writes you're doing, that may or may not be feasible. 1607 01:10:00,580 --> 01:10:02,699 In the case of 10 most popular books, you probably 1608 01:10:02,699 --> 01:10:04,740 don't want it such that any time anyone purchases 1609 01:10:04,740 --> 01:10:07,939 a book that we're invalidating the cache of what the 10 most popular are. 
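The "invalidate on write" idea just described might look roughly like the sketch below. Everything here is a stand-in chosen for illustration: a plain list plays the role of the orders table, a dictionary plays the role of the cache, and the names `query_top_books`, `get_top_books`, and `place_order` are hypothetical.

```python
from collections import Counter

orders = []            # stands in for an orders table in the database
cache = {}             # stands in for the server-side cache
TOP_KEY = "top_books"  # cache key for the popular-books list


def query_top_books(n=10):
    """The 'expensive' query: count purchases per book, most popular first."""
    return [book for book, _ in Counter(orders).most_common(n)]


def get_top_books():
    """Read path: use the cache if possible, otherwise query and store."""
    if TOP_KEY not in cache:
        cache[TOP_KEY] = query_top_books()
    return cache[TOP_KEY]


def place_order(book):
    """Write path: record the order, then throw away the stale cached list."""
    orders.append(book)
    cache.pop(TOP_KEY, None)
```

As noted in the lecture, discarding the cached list on every single purchase would defeat the purpose on a busy site, so a real system would likely invalidate less eagerly.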
1610 01:10:07,939 --> 01:10:09,730 But, certainly, you can think of heuristics 1611 01:10:09,730 --> 01:10:12,910 that we might employ in order to help make that process easier. 1612 01:10:12,910 --> 01:10:16,120 How else might we implement cache invalidation, this idea 1613 01:10:16,120 --> 01:10:20,499 that if we have data in the cache, then at some point that data 1614 01:10:20,499 --> 01:10:21,790 is no longer going to be valid? 1615 01:10:21,790 --> 01:10:26,263 1616 01:10:26,263 --> 01:10:29,742 AUDIENCE: Couldn't you do a similar thing with the ID? 1617 01:10:29,742 --> 01:10:33,718 Like, the cache could store an ID for a particular set of data. 1618 01:10:33,718 --> 01:10:37,694 And so then when somebody requests that data from the cache, 1619 01:10:37,694 --> 01:10:41,774 it checks the database to see if it needs to get an updated version. 1620 01:10:41,774 --> 01:10:42,440 BRIAN YU: Great. 1621 01:10:42,440 --> 01:10:46,282 So we could have some mechanism via which the cache is checking the data-- 1622 01:10:46,282 --> 01:10:47,990 or we have something in the server that's 1623 01:10:47,990 --> 01:10:51,320 checking the database to see do we actually need to check the database. 1624 01:10:51,320 --> 01:10:54,529 Or can we just go from the cache by having some identifier that updates, 1625 01:10:54,529 --> 01:10:55,070 for instance? 1626 01:10:55,070 --> 01:10:58,230 And maybe that operation is slightly less expensive on the database. 1627 01:10:58,230 --> 01:11:00,200 So it's not needing to perform that full query, 1628 01:11:00,200 --> 01:11:01,670 but we still do a quick check to see if there's 1629 01:11:01,670 --> 01:11:03,150 anything we might need to change. 1630 01:11:03,150 --> 01:11:04,816 Otherwise, we can still go to the cache. 1631 01:11:04,816 --> 01:11:06,108 That's certainly an option too. 
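One way to picture the identifier-based check from the exchange above is a version counter that is cheap to read. This is only a sketch under assumptions: the classes `BookStore` and `VersionedCache` are hypothetical, and an in-memory list stands in for the database. The cache stores the version it saw alongside the data and repeats the full query only when the version has moved.

```python
from collections import Counter


class BookStore:
    """Stand-in for the database, with a cheap-to-read version counter."""

    def __init__(self):
        self.orders = []
        self.version = 0
        self.full_queries = 0  # counts how often the expensive query runs

    def place_order(self, book):
        self.orders.append(book)
        self.version += 1  # every write bumps the identifier

    def current_version(self):
        return self.version  # the quick check

    def top_books(self, n=10):
        self.full_queries += 1  # the expensive query
        return [b for b, _ in Counter(self.orders).most_common(n)]


class VersionedCache:
    """Keeps (version, value) and re-queries only when the version changed."""

    def __init__(self, store):
        self.store = store
        self.entry = None

    def top_books(self):
        version = self.store.current_version()
        if self.entry is None or self.entry[0] != version:
            self.entry = (version, self.store.top_books())
        return self.entry[1]
```

The trade-off is that every read still touches the database once, but only for the inexpensive version lookup.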
1632 01:11:06,108 --> 01:11:08,607 And there are many other ways that we could potentially deal 1633 01:11:08,607 --> 01:11:10,430 with the problem of cache invalidation. 1634 01:11:10,430 --> 01:11:13,160 One common way is just to effectively ignore the problem. 1635 01:11:13,160 --> 01:11:16,370 Set an expiration time on the cache and say, the 10 most popular books, 1636 01:11:16,370 --> 01:11:19,090 this will expire after 12 hours, for instance. 1637 01:11:19,090 --> 01:11:22,910 And it's probably not a big deal if a new book comes in the top 10, 1638 01:11:22,910 --> 01:11:24,260 and it's not updated right away. 1639 01:11:24,260 --> 01:11:26,870 If you don't care that much and you're OK with the cache being 1640 01:11:26,870 --> 01:11:29,060 a little bit out of date, then that's OK so long 1641 01:11:29,060 --> 01:11:31,880 as you have some sort of expiration time on the cache to say, 1642 01:11:31,880 --> 01:11:34,370 after X number of minutes or hours or days, 1643 01:11:34,370 --> 01:11:37,430 then we should invalidate it and then check the database again 1644 01:11:37,430 --> 01:11:39,260 to see what the latest information is. 1645 01:11:39,260 --> 01:11:42,390 And so the big takeaway from all of this is-- 1646 01:11:42,390 --> 01:11:46,790 whether we're talking about caching or whether we're talking about database 1647 01:11:46,790 --> 01:11:51,350 scalability in terms of partitioning it or replicating it into different places 1648 01:11:51,350 --> 01:11:54,450 or we're thinking about how we're going to load balance our servers, 1649 01:11:54,450 --> 01:11:58,550 whether we're using vertical, whether we're expanding them vertically 1650 01:11:58,550 --> 01:12:01,592 or scaling them horizontally or some combination of all of these things-- 1651 01:12:01,592 --> 01:12:03,883 that there are trade-offs with each of those decisions. 
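The expiration-time approach mentioned above can be sketched as a small time-to-live cache. The class name `TTLCache` and its `get(key, compute)` interface are assumptions for illustration; the `clock` parameter exists only so that the expiry behavior can be demonstrated without waiting.

```python
import time


class TTLCache:
    """Cache whose entries are treated as invalid after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (expires_at, value)

    def get(self, key, compute):
        """Return the cached value, recomputing it once it has expired."""
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # still fresh, possibly slightly out of date
        value = compute()    # expired or missing: go back to the database
        self.entries[key] = (now + self.ttl, value)
        return value
```

With `TTLCache(12 * 3600)`, the popular-books list would be recomputed at most once every 12 hours, however many users ask for it in between.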
1652 01:12:03,883 --> 01:12:07,190 And we have to decide whether, based on the needs of our particular system, 1653 01:12:07,190 --> 01:12:09,740 based on the needs of our particular web application, whether there are more 1654 01:12:09,740 --> 01:12:12,073 writes than there are reads and what sorts of operations 1655 01:12:12,073 --> 01:12:14,920 are commonly happening, that we need to make these design decisions. 1656 01:12:14,920 --> 01:12:18,800 And so one of the goals today was really to get across the idea 1657 01:12:18,800 --> 01:12:21,290 that, with all of these different moving parts, 1658 01:12:21,290 --> 01:12:24,680 we can be thinking critically about what design decisions we make, 1659 01:12:24,680 --> 01:12:26,930 how we choose to design our system in a way such 1660 01:12:26,930 --> 01:12:31,280 that it is scalable based on the specific needs of our system or our web 1661 01:12:31,280 --> 01:12:32,510 application. 1662 01:12:32,510 --> 01:12:35,152 Questions about any of those things so far? 1663 01:12:35,152 --> 01:12:36,015 Yeah. 1664 01:12:36,015 --> 01:12:36,640 AUDIENCE: Yeah. 1665 01:12:36,640 --> 01:12:38,624 I have a question about, basically, the cache. 1666 01:12:38,624 --> 01:12:41,104 Shouldn't the cache be memory on the server? 1667 01:12:41,104 --> 01:12:44,149 Or is it actually its own hardware [INAUDIBLE]? 1668 01:12:44,149 --> 01:12:45,190 BRIAN YU: Great question. 1669 01:12:45,190 --> 01:12:47,190 So what form does the cache actually take? 1670 01:12:47,190 --> 01:12:49,010 So, certainly, there are a bunch of different forms 1671 01:12:49,010 --> 01:12:50,051 that that cache can take. 1672 01:12:50,051 --> 01:12:52,289 And oftentimes we might have a smaller cache 1673 01:12:52,289 --> 01:12:55,080 that's actually physically located on the server, where we wouldn't 1674 01:12:55,080 --> 01:12:57,270 need to talk to something external.
1675 01:12:57,270 --> 01:13:00,850 But there are other cases where you might want an external cache. 1676 01:13:00,850 --> 01:13:04,530 Well, what's one benefit that an external cache does give you instead 1677 01:13:04,530 --> 01:13:07,990 of just storing a cache on one server? 1678 01:13:07,990 --> 01:13:09,607 AUDIENCE: More space [INAUDIBLE]. 1679 01:13:09,607 --> 01:13:10,940 BRIAN YU: More space, certainly. 1680 01:13:10,940 --> 01:13:14,100 So a cache might be able to store large amounts of data. 1681 01:13:14,100 --> 01:13:17,090 And usually a cache is just going to be, basically, hard 1682 01:13:17,090 --> 01:13:21,750 disk storage where you can just easily access it, access it very quickly. 1683 01:13:21,750 --> 01:13:23,810 Amazon Web Services, for instance, offers 1684 01:13:23,810 --> 01:13:25,691 S3, which is effectively a service that's 1685 01:13:25,691 --> 01:13:28,190 just a big hard drive in the cloud where you can store files 1686 01:13:28,190 --> 01:13:30,710 and is often used for caching purposes. 1687 01:13:30,710 --> 01:13:34,970 What's another benefit of just storing, of using an external cache located 1688 01:13:34,970 --> 01:13:37,490 on some separate hard drive somewhere that's not 1689 01:13:37,490 --> 01:13:40,939 within any one of the servers but that all the servers talk to? 1690 01:13:40,939 --> 01:13:42,980 So we talked about the drawback, which is that it 1691 01:13:42,980 --> 01:13:44,390 takes longer to talk to the cache. 1692 01:13:44,390 --> 01:13:45,264 But what's a benefit? 1693 01:13:45,264 --> 01:13:46,975 AUDIENCE: One primary source. 1694 01:13:46,975 --> 01:13:47,600 BRIAN YU: Yeah. 
1695 01:13:47,600 --> 01:13:49,474 It's a primary source that all of the servers 1696 01:13:49,474 --> 01:13:53,000 have access to, that if server number one has cached the 10 most 1697 01:13:53,000 --> 01:13:56,060 popular books, that if someone's on server two and tries to access 10 1698 01:13:56,060 --> 01:13:59,030 most popular books, they can also access that same cache as opposed 1699 01:13:59,030 --> 01:14:02,350 to in the case where if all the caches are only stored on the servers, 1700 01:14:02,350 --> 01:14:05,660 now each of those servers needs to independently generate it and maintain 1701 01:14:05,660 --> 01:14:06,620 its own cache. 1702 01:14:06,620 --> 01:14:08,030 And it has those issues as well. 1703 01:14:08,030 --> 01:14:10,782 In reality, though, most web applications 1704 01:14:10,782 --> 01:14:13,490 that begin to scale larger and larger have many different caches. 1705 01:14:13,490 --> 01:14:16,615 They will have caches on the server for quicker things that need to happen. 1706 01:14:16,615 --> 01:14:20,150 And maybe you're using a separate cache in order 1707 01:14:20,150 --> 01:14:23,030 to deal with larger files that need to be cached, for instance. 1708 01:14:23,030 --> 01:14:25,790 And they'll use some combination of all these caching techniques 1709 01:14:25,790 --> 01:14:28,430 in order to get the best of both worlds, ideally, 1710 01:14:28,430 --> 01:14:30,680 to try and have quick access on the server to things 1711 01:14:30,680 --> 01:14:35,570 that we need access to quickly, to store on the off-site cache information 1712 01:14:35,570 --> 01:14:37,880 that we want all the servers to be able to access. 1713 01:14:37,880 --> 01:14:40,588 And so usually you'll see some combination of all of these things 1714 01:14:40,588 --> 01:14:44,000 in practice as real web applications begin to scale. 1715 01:14:44,000 --> 01:14:44,500 Yep. 1716 01:14:44,500 --> 01:14:45,583 AUDIENCE: Question online.
1717 01:14:45,583 --> 01:14:48,820 What would be a good way for estimating the number of servers, databases, 1718 01:14:48,820 --> 01:14:51,629 load balancers, and caches that you would need for an application? 1719 01:14:51,629 --> 01:14:52,670 BRIAN YU: Great question. 1720 01:14:52,670 --> 01:14:56,030 So what's a good way to estimate the amount that you would actually need? 1721 01:14:56,030 --> 01:14:58,155 And so we talked a little bit at the very beginning 1722 01:14:58,155 --> 01:15:00,260 about benchmarking, about the process of trying 1723 01:15:00,260 --> 01:15:03,410 to test to see how much load the server can actually take. 1724 01:15:03,410 --> 01:15:05,960 And so there are a number of different pieces of software 1725 01:15:05,960 --> 01:15:08,600 that you can use in order to perform that benchmarking. 1726 01:15:08,600 --> 01:15:13,310 I know ApacheBench, I believe, is one common piece of load balancing, 1727 01:15:13,310 --> 01:15:15,890 of benchmarking software that you can use. 1728 01:15:15,890 --> 01:15:18,680 And, also, one good strategy is, if you're using cloud computing, 1729 01:15:18,680 --> 01:15:22,040 look to the documentation of wherever you're getting those servers from. 1730 01:15:22,040 --> 01:15:25,190 And they'll likely include information about the processing power 1731 01:15:25,190 --> 01:15:26,840 and the memory of those computers. 1732 01:15:26,840 --> 01:15:31,880 And so in the AWS case, for instance, one of the more popular server tools 1733 01:15:31,880 --> 01:15:35,090 is EC2, Elastic Compute Cloud, which is just the service 1734 01:15:35,090 --> 01:15:39,050 that AWS offers that lets you effectively rent servers in the cloud 1735 01:15:39,050 --> 01:15:42,170 and run servers like these that might be connected to a load balancer, 1736 01:15:42,170 --> 01:15:43,000 for instance. 1737 01:15:43,000 --> 01:15:45,420 And they come in different sizes.
1738 01:15:45,420 --> 01:15:47,930 They have different names, different letters and numbers, 1739 01:15:47,930 --> 01:15:49,388 like smalls and mediums and larges. 1740 01:15:49,388 --> 01:15:52,880 And you can look on their website as to what each one of those servers means 1741 01:15:52,880 --> 01:15:54,670 and how much computing power it has. 1742 01:15:54,670 --> 01:15:57,432 And using that, you can begin to gauge which one you need 1743 01:15:57,432 --> 01:15:59,015 in order for your particular purposes. 1744 01:15:59,015 --> 01:16:01,810 1745 01:16:01,810 --> 01:16:02,310 All right. 1746 01:16:02,310 --> 01:16:04,190 So those were just-- that was just a brief introduction 1747 01:16:04,190 --> 01:16:06,981 to some of the high-level concepts that come into play as you think 1748 01:16:06,981 --> 01:16:08,617 about how to scale your application. 1749 01:16:08,617 --> 01:16:11,450 When I come back next time, we'll be talking about security, about-- 1750 01:16:11,450 --> 01:16:12,116 in the same way. 1751 01:16:12,116 --> 01:16:15,972 As we take our applications and begin to scale them to deploy them 1752 01:16:15,972 --> 01:16:18,180 to the internet and are used by many different users, 1753 01:16:18,180 --> 01:16:22,186 how do we make sure that our software is secure from adversaries 1754 01:16:22,186 --> 01:16:24,560 that might be trying to attack the website, for instance? 1755 01:16:24,560 --> 01:16:26,390 And what considerations go into there? 1756 01:16:26,390 --> 01:16:29,348 And so that's it for today, for Web Programming with Python and JavaScript. 1757 01:16:29,348 --> 01:16:31,090 Thank you all so much. 1758 01:16:31,090 --> 01:16:32,255