WEBVTT X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000 00:00:10.360 --> 00:00:12.340 Cloud computing-- it's this term that rather 00:00:12.340 --> 00:00:14.330 swept onto the scene in recent years. 00:00:14.330 --> 00:00:17.230 And it sounds like it's some new and trendy technology. 00:00:17.230 --> 00:00:19.769 But in reality, it's really just a very nice packaging 00:00:19.769 --> 00:00:22.060 up of a whole number of technologies that have actually 00:00:22.060 --> 00:00:23.620 been with us for some time. 00:00:23.620 --> 00:00:26.710 In fact, cloud computing, in its simplest form, 00:00:26.710 --> 00:00:29.800 can really be thought of as just outsourcing 00:00:29.800 --> 00:00:33.250 the hosting of your applications and really outsourcing 00:00:33.250 --> 00:00:36.790 the hosting of your physical servers to someone else-- put another way, 00:00:36.790 --> 00:00:41.140 renting space and renting time on someone else's computers. 00:00:41.140 --> 00:00:45.520 But these days, we just have so much computational capabilities-- that is, 00:00:45.520 --> 00:00:50.650 our computers are so fast, our CPUs are so many, and we have so much RAM-- 00:00:50.650 --> 00:00:53.500 that new and fancier technologies have lent themselves 00:00:53.500 --> 00:00:56.590 to this trend of hosting all the more software 00:00:56.590 --> 00:01:00.520 and putting all of the more hardware off-site in the so-called cloud 00:01:00.520 --> 00:01:05.170 so that companies, both big and small, no longer need 00:01:05.170 --> 00:01:09.010 to host their own physical hardware or even a whole number of roles 00:01:09.010 --> 00:01:10.390 in their own local companies. 
00:01:10.390 --> 00:01:13.120 And so what we'll do now is dive into cloud computing, 00:01:13.120 --> 00:01:15.280 look at some of the problems it solves, look 00:01:15.280 --> 00:01:18.190 at some of the opportunities it affords, but ultimately, 00:01:18.190 --> 00:01:20.890 take a look from the ground up at what's underneath the hood 00:01:20.890 --> 00:01:23.080 here so that by the end of this, we have a better 00:01:23.080 --> 00:01:25.900 understanding of what the cloud is, why it is useful, 00:01:25.900 --> 00:01:28.030 and what it actually is not. 00:01:28.030 --> 00:01:31.490 So with that said, let's start with a simple scenario. 00:01:31.490 --> 00:01:34.990 Of course, the cloud perhaps derives its origins 00:01:34.990 --> 00:01:37.630 from how the internet, for some time, was drawn, 00:01:37.630 --> 00:01:40.510 which was just this big, nebulous cloud, in that it doesn't really 00:01:40.510 --> 00:01:41.950 matter what's inside that cloud. 00:01:41.950 --> 00:01:46.330 Although at this point, you most surely appreciate that inside of this cloud 00:01:46.330 --> 00:01:49.450 are things like routers, and running through those routers 00:01:49.450 --> 00:01:51.937 are packets, both TCP/IP and the like. 00:01:51.937 --> 00:01:53.770 And underneath the hood, then, of this cloud 00:01:53.770 --> 00:01:57.520 is some transport mechanism that gets data from point A to point B. 00:01:57.520 --> 00:02:00.550 So what might those point A's and Point B's be? 00:02:00.550 --> 00:02:04.180 Well, if this here is my little, old laptop, connected somehow 00:02:04.180 --> 00:02:07.510 to the internet here, and maybe down here there 00:02:07.510 --> 00:02:11.500 is some web server on which lives a whole bunch of web pages-- 00:02:11.500 --> 00:02:12.550 maybe it's my email. 00:02:12.550 --> 00:02:13.840 Maybe it's the day's news. 00:02:13.840 --> 00:02:16.240 Maybe it's some social media site or the like. 
00:02:16.240 --> 00:02:21.190 I, at point A, want to somehow connect to point B down here. 00:02:21.190 --> 00:02:24.790 Now, it turns out it's not all that hard to get a website up 00:02:24.790 --> 00:02:26.230 and running on the internet. 00:02:26.230 --> 00:02:28.540 You can, of course, use any number of languages. 00:02:28.540 --> 00:02:30.760 You can use any number of databases. 00:02:30.760 --> 00:02:34.270 And you can do it with relatively little experience, 00:02:34.270 --> 00:02:36.320 just getting something on the internet. 00:02:36.320 --> 00:02:39.250 In fact, it's not all that hard, relatively speaking, 00:02:39.250 --> 00:02:41.740 to get a prototype of your application or even 00:02:41.740 --> 00:02:44.590 your first version of your business up and running. 00:02:44.590 --> 00:02:50.200 But things start to get hard quickly, especially if you have some success. 00:02:50.200 --> 00:02:53.320 Indeed, a good problem to have is that you have so many customers and so 00:02:53.320 --> 00:02:57.100 many users hitting your websites that you can't actually 00:02:57.100 --> 00:02:58.450 handle all of the load. 00:02:58.450 --> 00:03:01.365 Now, it's a good problem in the sense that business is booming. 00:03:01.365 --> 00:03:03.490 But it's, of course, an actual problem in the sense 00:03:03.490 --> 00:03:06.160 that your customers aren't going to be able to visit your web site 00:03:06.160 --> 00:03:08.110 and buy whatever it is you're selling or read 00:03:08.110 --> 00:03:12.520 whatever it is you're posting if your servers can't actually handle the load. 00:03:12.520 --> 00:03:16.780 And by load, I simply mean the number of users per minute or per unit of time 00:03:16.780 --> 00:03:19.390 that your website is actually experiencing. 00:03:19.390 --> 00:03:21.670 And its capacity, meanwhile, would be the number 00:03:21.670 --> 00:03:23.500 of users it can actually support. 00:03:23.500 --> 00:03:25.760 Now, why are there these limits in the first place? 
00:03:25.760 --> 00:03:28.030 Well, you may recall that inside of a computer 00:03:28.030 --> 00:03:30.640 is a CPU, the brains of that computer. 00:03:30.640 --> 00:03:33.580 And inside of a computer is some memory, like RAM. 00:03:33.580 --> 00:03:36.940 And there might be some longer-term storage, like hard disk space. 00:03:36.940 --> 00:03:41.320 At the end of the day, all of those resources and more are finite. 00:03:41.320 --> 00:03:44.290 You can only fit so much physical hardware in a computer. 00:03:44.290 --> 00:03:47.500 Humans have only been able to pack so many resources 00:03:47.500 --> 00:03:49.894 into the physical space of a computer. 00:03:49.894 --> 00:03:51.310 And then, of course, there's cost. 00:03:51.310 --> 00:03:54.920 You might only be able to afford so much computing capacity. 00:03:54.920 --> 00:03:58.800 So if a computer can only do some number of things per second, 00:03:58.800 --> 00:04:02.154 there is surely an upper bound on how many people can visit your web 00:04:02.154 --> 00:04:05.320 site, how many people can add things to their shopping cart, how many people 00:04:05.320 --> 00:04:07.330 can check out with their credit card. 00:04:07.330 --> 00:04:10.940 Because you only have, at the end of the day, a finite number of resources. 00:04:10.940 --> 00:04:13.060 Now, what does that mean in real terms? 00:04:13.060 --> 00:04:16.540 Well, maybe your web server can handle 100 users per minute. 00:04:16.540 --> 00:04:18.700 Maybe it can handle 1,000 users per minute. 00:04:18.700 --> 00:04:22.240 Maybe it can handle 1,000 users per second, or even much more than that. 
00:04:22.240 --> 00:04:26.887 It really depends on the specifications of your hardware-- how much RAM, 00:04:26.887 --> 00:04:29.470 how much CPU and so forth that you actually have-- and it also 00:04:29.470 --> 00:04:33.430 depends, to some extent, on how well-written your code is and how fast 00:04:33.430 --> 00:04:37.280 or how slow your code, your software actually runs. 00:04:37.280 --> 00:04:39.700 So these are knobs that can ultimately be turned. 00:04:39.700 --> 00:04:42.310 And through testing, you can figure this out in advance 00:04:42.310 --> 00:04:46.470 by simulating traffic in order to estimate exactly how many users you 00:04:46.470 --> 00:04:49.000 might be able to handle at a time. 00:04:49.000 --> 00:04:53.200 Now, the relevance to today is that the cloud, so to speak, 00:04:53.200 --> 00:04:56.200 allows us to start to solve some of these problems 00:04:56.200 --> 00:04:59.890 and also allows us to start abstracting away the solutions to some 00:04:59.890 --> 00:05:00.640 of these problems. 00:05:00.640 --> 00:05:02.360 Well, let's see what this actually means. 00:05:02.360 --> 00:05:04.449 So at some point or other-- 00:05:04.449 --> 00:05:06.490 especially when it's not just my laptop, but it's 00:05:06.490 --> 00:05:10.160 like 1,000 laptops, or 10,000 laptops and desktops and phones and more 00:05:10.160 --> 00:05:12.860 that are somehow trying to access my server here-- 00:05:12.860 --> 00:05:17.000 at some point, we hit that upper limit whereby no more users can 00:05:17.000 --> 00:05:19.200 fit onto my web site per unit of time. 00:05:19.200 --> 00:05:22.070 So what is the symptom that my users experience at that point 00:05:22.070 --> 00:05:23.660 if I'm over capacity? 00:05:23.660 --> 00:05:26.210 Well, they might see an error message of some sort. 00:05:26.210 --> 00:05:28.550 They might just experience a spinning icon 00:05:28.550 --> 00:05:30.662 because the website is super slow to respond. 
00:05:30.662 --> 00:05:33.120 And maybe it does respond, but maybe it's 10 seconds later. 00:05:33.120 --> 00:05:36.890 So at the end of the day, they either have a bad experience or no experience 00:05:36.890 --> 00:05:42.480 whatsoever, because my server can only handle so many requests at a time. 00:05:42.480 --> 00:05:44.570 So what do you do to solve this problem? 00:05:44.570 --> 00:05:48.500 If one server is not enough, maybe the most intuitive solution is, well, 00:05:48.500 --> 00:05:51.500 if one server is not giving me enough headroom, 00:05:51.500 --> 00:05:53.490 why don't I just have two servers? 00:05:53.490 --> 00:05:54.890 So let's go ahead and do that. 00:05:54.890 --> 00:05:58.370 Instead of having just one server, let's go ahead and have two. 00:05:58.370 --> 00:06:01.790 And let me propose that on the second server, it's the exact same software. 00:06:01.790 --> 00:06:05.150 So whatever code I've written, in whatever language it's written, 00:06:05.150 --> 00:06:08.600 I just have copies of my web site on both the original server 00:06:08.600 --> 00:06:10.520 and the second server. 00:06:10.520 --> 00:06:14.270 Now I've solved the problem in the simple sense 00:06:14.270 --> 00:06:16.010 that I've doubled my capacity. 00:06:16.010 --> 00:06:18.570 If one server can handle 1,000 people per second, 00:06:18.570 --> 00:06:21.714 well, then surely two servers can handle 2,000 people per second, 00:06:21.714 --> 00:06:22.880 so I've doubled my capacity. 00:06:22.880 --> 00:06:23.870 So that's good. 00:06:23.870 --> 00:06:26.000 I've hopefully solved the problem. 00:06:26.000 --> 00:06:27.929 But it's not quite as simple as that. 
00:06:27.929 --> 00:06:30.845 At least pictorially, I'm still pointing at just one of those servers, 00:06:30.845 --> 00:06:33.890 so we're going to have to clean up this picture alone and somehow 00:06:33.890 --> 00:06:36.680 figure out how to get users-- 00:06:36.680 --> 00:06:38.450 or more generally, traffic-- 00:06:38.450 --> 00:06:40.820 to both of these servers. 00:06:40.820 --> 00:06:43.740 I could just naively draw an arrow like this. 00:06:43.740 --> 00:06:45.380 But what does that actually mean? 00:06:45.380 --> 00:06:47.780 We don't want to abstract away so much of the detail 00:06:47.780 --> 00:06:50.460 that we're ignoring this problem. 00:06:50.460 --> 00:06:54.860 How do we implement this notion of choosing between left arrow and right 00:06:54.860 --> 00:06:55.580 arrow? 00:06:55.580 --> 00:06:59.210 Well, let's consider what our solutions might be. 00:06:59.210 --> 00:07:02.930 If a user, like me on my laptop, is trying to visit this web site-- 00:07:02.930 --> 00:07:06.740 and the web site, ideally, is going to live at something like example.com, 00:07:06.740 --> 00:07:10.122 or facebook.com, or gmail.com, or whatever-- 00:07:10.122 --> 00:07:12.830 I don't want to have to broadcast different names for my servers. 00:07:12.830 --> 00:07:14.910 And you might actually notice this on the internet. 00:07:14.910 --> 00:07:18.160 You might notice, if you start noticing the URLs of websites you're visiting-- 00:07:18.160 --> 00:07:21.290 especially for certain older, stodgier companies who haven't necessarily 00:07:21.290 --> 00:07:23.240 implemented this in the most modern way-- 00:07:23.240 --> 00:07:27.239 you might find yourself not just at www.something.com, 00:07:27.239 --> 00:07:29.780 but if you look closely, you might find yourself occasionally 00:07:29.780 --> 00:07:35.310 at www1.something.com, www2.something.com, 00:07:35.310 --> 00:07:38.590 or even www13.something.com. 
00:07:38.590 --> 00:07:43.670 Which is to say that some companies appear to solve this problem by just 00:07:43.670 --> 00:07:45.140 giving different names-- 00:07:45.140 --> 00:07:48.920 similar names, but different names-- to their two servers, three servers, 00:07:48.920 --> 00:07:51.200 13 servers, or however many they have. 00:07:51.200 --> 00:07:54.770 And then they somehow redirect users from their main domain 00:07:54.770 --> 00:08:00.440 name, www.something.com, to any one of those two or three or 13 servers. 00:08:00.440 --> 00:08:01.919 But this isn't very elegant. 00:08:01.919 --> 00:08:03.710 The marketing folks would surely hate this, 00:08:03.710 --> 00:08:06.770 because you're trying to build some brand recognition around your URL. 00:08:06.770 --> 00:08:10.370 Why would you dirty it by just putting these arbitrary numbers in the URLs? 00:08:10.370 --> 00:08:13.940 Plus if you fast forward a bit in this story, 00:08:13.940 --> 00:08:16.610 if, for some reason down the road, you get fancier, 00:08:16.610 --> 00:08:18.382 bigger servers that can handle more users, 00:08:18.382 --> 00:08:20.090 and therefore you don't need 13 of them-- 00:08:20.090 --> 00:08:22.220 you can get away with just six of them-- 00:08:22.220 --> 00:08:24.950 well, what happens if some of your customers have bookmarked, 00:08:24.950 --> 00:08:30.320 very reasonably, one of those older names, like www13.something.com? 00:08:30.320 --> 00:08:33.799 So now when they try to visit that URL, gosh, they might hit a dead end. 00:08:33.799 --> 00:08:35.632 So you could solve that in some other way. 00:08:35.632 --> 00:08:38.090 But the point is it would seem to create a problem quickly, 00:08:38.090 --> 00:08:40.159 and it's just a naming mess. 00:08:40.159 --> 00:08:44.270 Why actually bother having your users see something 00:08:44.270 --> 00:08:46.020 as messy as these numbered servers? 00:08:46.020 --> 00:08:49.400 It would be nice to do this a little more transparently. 
00:08:49.400 --> 00:08:50.930 So how could we do this? 00:08:50.930 --> 00:08:54.140 Well, let me propose that we kind of need some middleman here, 00:08:54.140 --> 00:08:57.740 so to speak, whereby traffic comes from people like me on the internet 00:08:57.740 --> 00:09:00.400 and then either goes to the left or goes to the right, 00:09:00.400 --> 00:09:02.780 or no matter how many servers we have, goes 00:09:02.780 --> 00:09:05.750 to one of those actual web servers. 00:09:05.750 --> 00:09:09.500 So how does this middleman-- and to borrow some past terminology, 00:09:09.500 --> 00:09:12.320 how does this black box potentially work? 00:09:12.320 --> 00:09:14.030 Well, let's consider some of the building 00:09:14.030 --> 00:09:18.530 blocks, some of the puzzle pieces we have technologically at our disposal 00:09:18.530 --> 00:09:19.400 now. 00:09:19.400 --> 00:09:21.770 You may recall that every server on the internet 00:09:21.770 --> 00:09:25.641 has an IP address, an internet protocol address, a unique address for it. 00:09:25.641 --> 00:09:27.890 And that's, again, a bit of a white lie, because there 00:09:27.890 --> 00:09:30.410 are technologies by which you can have private IP 00:09:30.410 --> 00:09:32.690 addresses that the outside world doesn't see. 00:09:32.690 --> 00:09:35.780 But let's stipulate, for today's purposes, 00:09:35.780 --> 00:09:38.810 that every computer on the internet certainly has an IP 00:09:38.810 --> 00:09:41.550 address, whether public or private. 
00:09:41.550 --> 00:09:46.400 So maybe, just maybe, we could leverage an existing technology-- 00:09:46.400 --> 00:09:48.680 DNS, the Domain Name System-- 00:09:48.680 --> 00:09:52.880 so that rather than only return one IP address of a server 00:09:52.880 --> 00:09:57.470 when you look up www.something.com, we return the IP address 00:09:57.470 --> 00:09:59.610 of the server on the left some of the time 00:09:59.610 --> 00:10:02.990 or the IP address of the server on the right some of the time, 00:10:02.990 --> 00:10:07.160 effectively balancing our load, our traffic across the two servers. 00:10:07.160 --> 00:10:10.060 And in fact, if you do this 50-50, you can 00:10:10.060 --> 00:10:12.640 take, really, what's called a round robin approach, 00:10:12.640 --> 00:10:17.270 and ideally uniformly distribute your traffic across multiple servers. 00:10:17.270 --> 00:10:20.140 And what's nice in this model is that because you're using DNS, 00:10:20.140 --> 00:10:22.472 the user doesn't really notice what's going on. 00:10:22.472 --> 00:10:24.430 At the end of the day, none of us humans really 00:10:24.430 --> 00:10:27.130 care what IP address we're actually going to if we visit 00:10:27.130 --> 00:10:29.410 Facebook.com or Gmail.com or the like. 00:10:29.410 --> 00:10:33.730 We just care that our computer can find that server or servers on the internet. 00:10:33.730 --> 00:10:38.020 So via DNS, we could, very cleverly, via this middleman here, 00:10:38.020 --> 00:10:42.010 which is really just going to be some third device, some separate server-- 00:10:42.010 --> 00:10:45.940 it, as a DNS device, could just respond to requests 00:10:45.940 --> 00:10:49.630 from customers with either this IP address or this IP address, 00:10:49.630 --> 00:10:52.840 or any number of different IP addresses. 00:10:52.840 --> 00:10:56.130 So does this solve the problem? 
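The round robin idea just described can be sketched in a few lines of Python. This is only a toy model, not how any real DNS server is implemented: the class name `RoundRobinResolver` and the two addresses (taken from the 192.0.2.0/24 range reserved for documentation) are assumptions made up for illustration.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy stand-in for a DNS server that rotates through a host's addresses."""

    def __init__(self, records):
        # records maps a hostname to the list of server IPs behind it
        self._rotations = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, hostname):
        # Each lookup hands back the next address in the rotation
        return next(self._rotations[hostname])


# Two hypothetical web servers behind one name
resolver = RoundRobinResolver({"www.something.com": ["192.0.2.1", "192.0.2.2"]})
answers = [resolver.resolve("www.something.com") for _ in range(4)]
print(answers)  # ['192.0.2.1', '192.0.2.2', '192.0.2.1', '192.0.2.2']
```

With two servers, successive lookups alternate between them, which is the 50-50 split mentioned above.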
00:10:56.130 --> 00:10:58.020 Again, most everything in computer science 00:10:58.020 --> 00:11:01.200 would seem to be a tradeoff at the end of the day. 00:11:01.200 --> 00:11:03.690 And this seems almost too good to be true, perhaps. 00:11:03.690 --> 00:11:04.560 It's so simple. 00:11:04.560 --> 00:11:06.270 It leverages an existing technology. 00:11:06.270 --> 00:11:07.750 It just works. 00:11:07.750 --> 00:11:10.440 So what prices might we pay? 00:11:10.440 --> 00:11:14.325 Well, DNS, it turns out, gets cached quite a bit. 00:11:14.325 --> 00:11:15.450 And what does caching mean? 00:11:15.450 --> 00:11:18.626 Caching something means keeping some past answer-- 00:11:18.626 --> 00:11:20.750 or more generally, piece of information-- around so 00:11:20.750 --> 00:11:26.350 that you can access it more quickly the second and the third time and beyond. 00:11:26.350 --> 00:11:30.360 And so computers today, Macs and PCs, as well as 00:11:30.360 --> 00:11:33.480 servers on the internet, other DNS servers on the internet, 00:11:33.480 --> 00:11:37.170 for performance reasons, will often remember the responses 00:11:37.170 --> 00:11:38.910 that they get from DNS servers. 00:11:38.910 --> 00:11:42.690 For instance, if, on my Mac, I visit Facebook.com, hypothetically 00:11:42.690 --> 00:11:47.370 a lot of times during the day, it's kind of stupid if my laptop, again and again 00:11:47.370 --> 00:11:49.620 and again and again, asks some DNS server 00:11:49.620 --> 00:11:52.110 for Facebook.com's IP address if it already 00:11:52.110 --> 00:11:55.020 asked that same question an hour ago-- or more realistically, 00:11:55.020 --> 00:11:57.390 two minutes ago, or something like that. 
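Caching a DNS answer can be sketched roughly as follows. This is a simplified model, not what any particular operating system or browser actually does: the class `DnsCache`, the 120-second lifetime, and the fake lookup function are all assumptions chosen for illustration.

```python
import time


class DnsCache:
    """Remembers DNS answers for a limited time (a time-to-live)."""

    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # function that asks a real DNS server
        self._ttl = ttl_seconds    # how long an answer stays valid
        self._clock = clock        # injectable so the demo can fake time
        self._entries = {}         # hostname -> (ip, expires_at)

    def resolve(self, hostname):
        now = self._clock()
        entry = self._entries.get(hostname)
        if entry is not None and now < entry[1]:
            return entry[0]        # cache hit: skip asking the DNS server
        ip = self._lookup(hostname)
        self._entries[hostname] = (ip, now + self._ttl)
        return ip


# Demo with a fake DNS server and a fake clock (both hypothetical)
lookups = []

def fake_lookup(hostname):
    lookups.append(hostname)       # record every real query
    return "192.0.2.1"

fake_now = [0.0]
cache = DnsCache(fake_lookup, ttl_seconds=120, clock=lambda: fake_now[0])

cache.resolve("www.something.com")  # first visit: asks the DNS server
cache.resolve("www.something.com")  # moments later: answered from the cache
fake_now[0] = 121.0                 # more than two minutes pass
cache.resolve("www.something.com")  # entry expired: asks the DNS server again
print(len(lookups))  # 2
```

Three visits cost only two real lookups, which is the step being skipped.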
00:11:57.390 --> 00:12:01.590 It would be smarter if my operating system-- or even my browser, Chrome 00:12:01.590 --> 00:12:03.540 or Firefox or whatever I'm using-- actually 00:12:03.540 --> 00:12:08.210 remembers that answer for me so that my computer can just pull up that web 00:12:08.210 --> 00:12:13.740 site faster by skipping a step, by not wasting time asking a server again 00:12:13.740 --> 00:12:15.750 for the IP address of a server. 00:12:15.750 --> 00:12:19.860 And after all, IP addresses, it turns out, generally don't change that often. 00:12:19.860 --> 00:12:23.880 It's certainly possible for a company or a university or even a home user 00:12:23.880 --> 00:12:25.840 to change their computer's IP addresses. 00:12:25.840 --> 00:12:28.350 But the reality is it doesn't change all that often. 00:12:28.350 --> 00:12:30.490 The common case is to have the same IP address 00:12:30.490 --> 00:12:33.960 now as you might an hour from now, or even a day or a week or a month 00:12:33.960 --> 00:12:34.860 from now. 00:12:34.860 --> 00:12:37.530 But the key thing is that it can change. 00:12:37.530 --> 00:12:41.600 And especially if you're worried about customers-- not just some personal web 00:12:41.600 --> 00:12:43.260 site, but you might lose business. 00:12:43.260 --> 00:12:45.960 You might lose orders if users can't visit your website. 00:12:45.960 --> 00:12:49.470 Anything that puts your server's uptime, so to speak-- 00:12:49.470 --> 00:12:51.480 being accessible on the internet-- at risk 00:12:51.480 --> 00:12:53.790 probably is worthy of some consideration. 00:12:53.790 --> 00:12:57.884 So let me propose, then, that just one of these servers goes offline somehow. 00:12:57.884 --> 00:12:58.800 Maybe it's deliberate. 00:12:58.800 --> 00:13:00.330 You need to do some service for it. 
00:13:00.330 --> 00:13:03.270 Or maybe it crashed in some way, or it got unplugged somehow, 00:13:03.270 --> 00:13:07.530 or something went wrong such that now, one or more of your servers, 00:13:07.530 --> 00:13:11.940 across which you've been load balancing, no longer can talk to the internet. 00:13:11.940 --> 00:13:12.960 What might happen? 00:13:12.960 --> 00:13:15.510 Well, if some customer's Mac, like my own, 00:13:15.510 --> 00:13:20.310 has remembered or cached that particular server's IP address, 00:13:20.310 --> 00:13:21.870 that is not a good situation. 00:13:21.870 --> 00:13:24.150 Because your Mac or PC or whatever is going 00:13:24.150 --> 00:13:27.450 to now try to revisit your web site again and again 00:13:27.450 --> 00:13:33.660 and again at that old cached IP address that apparently can be a dead end. 00:13:33.660 --> 00:13:38.370 And so even though you still have servers that could potentially 00:13:38.370 --> 00:13:41.550 handle that customer's request, that customer's order, 00:13:41.550 --> 00:13:44.520 that customer's desire to check out, he or she 00:13:44.520 --> 00:13:46.650 really is still not going to be able to visit 00:13:46.650 --> 00:13:49.290 the website unless that cache expires. 00:13:49.290 --> 00:13:52.230 Maybe they reboot their computer so that the cache forcibly expires. 00:13:52.230 --> 00:13:55.140 Maybe they just wait some amount of time so that that IP address 00:13:55.140 --> 00:13:57.510 is forgotten by the browser or by the operating system 00:13:57.510 --> 00:14:01.950 or by some other DNS server until the new, available IP 00:14:01.950 --> 00:14:03.690 addresses are picked up instead. 00:14:03.690 --> 00:14:04.830 But there is that risk. 00:14:04.830 --> 00:14:07.470 And I would argue that this risk is even higher especially 00:14:07.470 --> 00:14:11.910 for companies that might be considering moving their infrastructure from one 00:14:11.910 --> 00:14:13.200 service to another. 
00:14:13.200 --> 00:14:16.890 If you're deliberately going to move your servers from one IP address 00:14:16.890 --> 00:14:20.010 to another, as might happen if you change cloud providers, so to speak-- 00:14:20.010 --> 00:14:21.240 more on those in a minute-- 00:14:21.240 --> 00:14:24.790 really, if you change the companies that you're using to host your servers, 00:14:24.790 --> 00:14:26.460 your IP addresses will change. 00:14:26.460 --> 00:14:29.310 And you certainly don't want to incur a huge amount of downtime 00:14:29.310 --> 00:14:30.550 in a situation like that. 00:14:30.550 --> 00:14:32.130 So there are these tradeoffs. 00:14:32.130 --> 00:14:35.040 Easy solution, technologically pretty inexpensive to do. 00:14:35.040 --> 00:14:37.840 It just works using existing technology. 00:14:37.840 --> 00:14:40.960 But you open yourself up to this risk. 00:14:40.960 --> 00:14:42.300 So let's address that. 00:14:42.300 --> 00:14:44.837 Putting back on the old proverbial engineering hat, 00:14:44.837 --> 00:14:46.170 let's try to solve this problem. 00:14:46.170 --> 00:14:48.870 It seems that giving a unique IP address to this server 00:14:48.870 --> 00:14:52.110 and to this server, and any number of other servers that are back there, 00:14:52.110 --> 00:14:56.400 might not be the smartest idea insofar as those IPs can get cached. 00:14:56.400 --> 00:15:01.120 So what if we use DNS as follows? 00:15:01.120 --> 00:15:05.220 When my laptop or anyone else's requests the IP address for www.something.com, 00:15:05.220 --> 00:15:08.860 why don't we return the IP address of this device here-- 00:15:08.860 --> 00:15:11.680 this load balancer, as we'll start calling it, 00:15:11.680 --> 00:15:15.360 where a load balancer is usually just a physical device, 00:15:15.360 --> 00:15:18.840 or multiple physical devices, whose purpose in life is to balance load? 
00:15:18.840 --> 00:15:22.140 Packets come in, and similar in spirit to a router, 00:15:22.140 --> 00:15:25.860 they do route information to the left, to the right, or some other direction. 00:15:25.860 --> 00:15:29.910 But their overarching purpose isn't just to get data from point A to point B, 00:15:29.910 --> 00:15:32.970 but to somehow intelligently balance that traffic 00:15:32.970 --> 00:15:37.860 over multiple possible destinations for point B, identical servers 00:15:37.860 --> 00:15:39.390 in the case of our story here. 00:15:39.390 --> 00:15:42.810 So what if, instead, we addressed this problem of potential downtime 00:15:42.810 --> 00:15:46.830 by returning the IP address of the load balancer, 00:15:46.830 --> 00:15:49.560 and then, by nature of private IP addresses 00:15:49.560 --> 00:15:52.110 or some other mechanism that the end user does not 00:15:52.110 --> 00:15:56.920 need to know or care about, this load balancer somehow routes the traffic 00:15:56.920 --> 00:16:00.110 to either the first device or the second device, 00:16:00.110 --> 00:16:03.640 LB here being our load balancer? 00:16:03.640 --> 00:16:05.860 So we seem to have solved this problem. 00:16:05.860 --> 00:16:08.920 Insofar as now we have configured our DNS servers 00:16:08.920 --> 00:16:12.250 to return the IP address of the load balancer, 00:16:12.250 --> 00:16:15.760 there's no problem of downtime as we described a moment ago. 00:16:15.760 --> 00:16:20.890 Because if Server 1 goes offline for whatever reason, no big deal. 00:16:20.890 --> 00:16:25.510 The load balancer should hopefully just notice that and subsequently start 00:16:25.510 --> 00:16:29.710 proactively routing all incoming data that reaches its IP address to Server 2 00:16:29.710 --> 00:16:30.940 and not Server 1. 00:16:30.940 --> 00:16:32.819 Now, how does the load balancer know? 00:16:32.819 --> 00:16:34.360 Well, either a human could intervene. 
00:16:34.360 --> 00:16:37.240 Maybe someone gets a late night call or text or page saying, 00:16:37.240 --> 00:16:39.490 uh oh, server 1 is down, you better do something. 00:16:39.490 --> 00:16:42.070 And then he or she can manually configure the load balancer 00:16:42.070 --> 00:16:44.590 to no longer send any traffic to Server 1. 00:16:44.590 --> 00:16:47.620 That seems kind of stupid in an age of automation and smart software. 00:16:47.620 --> 00:16:48.730 Maybe we can do better. 00:16:48.730 --> 00:16:49.870 And indeed, we can. 00:16:49.870 --> 00:16:52.630 A technique that's often used by servers is 00:16:52.630 --> 00:16:55.570 something modeled from the human world to use 00:16:55.570 --> 00:16:58.690 what you might describe as heartbeats to actually configure 00:16:58.690 --> 00:17:02.350 the load balancer and Servers 1 and 2 to operate as follows. 00:17:02.350 --> 00:17:05.740 Maybe every second, every half a second, maybe every five seconds 00:17:05.740 --> 00:17:10.359 you configure Server 1 and Server 2 to send some kind of heartbeat message 00:17:10.359 --> 00:17:11.750 to the load balancer. 00:17:11.750 --> 00:17:14.770 This is just a TCP/IP packet, some kind of network packet 00:17:14.770 --> 00:17:17.650 that's the equivalent of saying I'm alive. 00:17:17.650 --> 00:17:18.339 I'm alive. 00:17:18.339 --> 00:17:23.770 Or more goofily, like boom, boom, boom, boom, ergo the heartbeat metaphor. 00:17:23.770 --> 00:17:26.680 But the point is that 1 and 2, and any number of other servers, 00:17:26.680 --> 00:17:29.710 should be configured to just constantly reassure 00:17:29.710 --> 00:17:31.800 the load balancer that they are alive. 00:17:31.800 --> 00:17:32.770 They are accessible. 00:17:32.770 --> 00:17:34.780 They are ready to receive traffic. 
00:17:34.780 --> 00:17:36.809 And the load balancer, similarly-- 00:17:36.809 --> 00:17:39.100 and you might see where this is going-- can very simply 00:17:39.100 --> 00:17:42.100 be configured to listen for that heartbeat. 00:17:42.100 --> 00:17:46.060 And if it ever doesn't hear a heartbeat from Server 1 or Server 2, 00:17:46.060 --> 00:17:48.790 it should just assume that something is wrong. 00:17:48.790 --> 00:17:50.030 The server has died. 00:17:50.030 --> 00:17:50.830 It's gone offline. 00:17:50.830 --> 00:17:52.390 Something bad has happened. 00:17:52.390 --> 00:17:55.090 So the load balancer subsequently should simply not 00:17:55.090 --> 00:17:58.030 route any traffic to that particular server 00:17:58.030 --> 00:18:00.490 until some human or some automated process 00:18:00.490 --> 00:18:04.300 brings the server back alive, so to speak, and the heartbeat resumes. 00:18:04.300 --> 00:18:06.880 Now, of course, this problem doesn't go away permanently. 00:18:06.880 --> 00:18:09.630 If servers 1 and 2 stop emitting a heartbeat, 00:18:09.630 --> 00:18:11.447 we really have no capacity for users. 00:18:11.447 --> 00:18:13.030 But that would be an extreme scenario. 00:18:13.030 --> 00:18:17.714 Hopefully it's just one or a few of our servers go offline in that way. 00:18:17.714 --> 00:18:20.380 So we can configure our servers for these heartbeats, which is-- 00:18:20.380 --> 00:18:21.190 think about it-- 00:18:21.190 --> 00:18:25.100 a very simple physiologically-inspired solution to a problem. 00:18:25.100 --> 00:18:27.730 And even if it's not obvious how you implemented it in code, 00:18:27.730 --> 00:18:30.370 it really is just an algorithm, a simple set of instructions 00:18:30.370 --> 00:18:33.700 with which we can solve this problem. 00:18:33.700 --> 00:18:36.690 And yet, damnit, we've introduced a new problem. 
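The heartbeat scheme described above can be sketched as a small Python simulation. It is a minimal sketch under stated assumptions, not how any particular load balancer product works: the class `HeartbeatLoadBalancer`, the five-second timeout, and the server names are hypothetical, and real heartbeats would arrive as network packets rather than method calls.

```python
import time


class HeartbeatLoadBalancer:
    """Routes requests only to servers heard from within the timeout."""

    def __init__(self, servers, timeout_seconds, clock=time.monotonic):
        self._servers = list(servers)
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the demo can fake time
        # Assume every server was just heard from at startup
        self._last_beat = {s: clock() for s in self._servers}
        self._next = 0

    def heartbeat(self, server):
        # Called whenever a server's "I'm alive" message arrives
        self._last_beat[server] = self._clock()

    def healthy(self):
        now = self._clock()
        return [s for s in self._servers
                if now - self._last_beat[s] <= self._timeout]

    def route(self):
        alive = self.healthy()
        if not alive:
            raise RuntimeError("no servers are emitting a heartbeat")
        # Round robin, but only across the servers still alive
        server = alive[self._next % len(alive)]
        self._next += 1
        return server


now = [0.0]
lb = HeartbeatLoadBalancer(["server1", "server2"], timeout_seconds=5,
                           clock=lambda: now[0])
before = [lb.route(), lb.route()]
now[0] = 4.0
lb.heartbeat("server2")  # server2 checks in; server1 has gone silent
now[0] = 8.0             # server1 has now been quiet for 8 seconds
after = [lb.route(), lb.route()]
print(before, after)  # ['server1', 'server2'] ['server2', 'server2']
```

Once Server 1 misses its heartbeat window, all traffic goes to Server 2, and if both fall silent, `route` raises an error because there is no capacity left.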
00:18:36.690 --> 00:18:40.050 And so this really is the old leaky hose, 00:18:40.050 --> 00:18:43.150 where just as we've plugged one leak or solved one problem, 00:18:43.150 --> 00:18:46.420 another one has sprung up somewhere else along the line. 00:18:46.420 --> 00:18:48.850 So what's the problem now? 00:18:48.850 --> 00:18:49.870 What's the problem now? 00:18:49.870 --> 00:18:54.520 The whole motivation of introducing Server Number 2, in addition 00:18:54.520 --> 00:18:58.930 to Server Number 1, was to make sure that we have enough capacity, 00:18:58.930 --> 00:19:03.070 and better yet, to make sure that if Server 1 or Server 2 goes offline, 00:19:03.070 --> 00:19:05.650 the other one can hopefully pick up the load unless it's 00:19:05.650 --> 00:19:09.320 a super busy time with lots and lots of users visiting all at once. 00:19:09.320 --> 00:19:12.070 So in fact, the general idea at play here 00:19:12.070 --> 00:19:20.770 is high availability: ensuring that if one server goes down, 00:19:20.770 --> 00:19:22.930 you have other servers that can pick up the load. 00:19:22.930 --> 00:19:26.419 Being highly available means you can be tolerant to issues like that. 00:19:26.419 --> 00:19:28.210 And then load balancing, of course, is just 00:19:28.210 --> 00:19:31.390 the mere process of splitting the load across those two endpoints. 00:19:31.390 --> 00:19:34.390 But we have introduced another problem. 00:19:34.390 --> 00:19:44.710 This might be abbreviated SPOF, or more explicitly, Single Point Of Failure. 00:19:44.710 --> 00:19:48.040 Just as I've solved one problem by introducing this load balancer, 00:19:48.040 --> 00:19:50.800 so have I introduced a new problem, which is this. 00:19:50.800 --> 00:19:52.870 There is now, as you might infer from the name 00:19:52.870 --> 00:19:54.730 alone, a single point of failure. 
00:19:54.730 --> 00:19:59.050 It's fine that I can now tolerate Server 1 or Server 2 going down, 00:19:59.050 --> 00:20:01.920 but what can I not tolerate, clearly? 00:20:01.920 --> 00:20:04.430 What if the load balancer goes down? 00:20:04.430 --> 00:20:06.160 So this is a very real concern. 00:20:06.160 --> 00:20:09.071 Maybe the load balancer itself gets overloaded. 00:20:09.071 --> 00:20:11.320 Maybe the load balancer itself has some kind of issue. 00:20:11.320 --> 00:20:13.270 And if the load balancer goes down, it doesn't 00:20:13.270 --> 00:20:16.120 matter how many web servers I have down here, 00:20:16.120 --> 00:20:19.900 or how much money I've spent down here to ensure my high availability. 00:20:19.900 --> 00:20:25.100 My server is offline if this single point of failure indeed fails. 00:20:25.100 --> 00:20:27.640 Now, you'd like to think that the load balancer-- 00:20:27.640 --> 00:20:30.070 especially since it only has one job in life-- 00:20:30.070 --> 00:20:33.270 can at least handle more traffic than any individual server. 00:20:33.270 --> 00:20:36.490 Indeed, clearly, it must be the case that the load balancer 00:20:36.490 --> 00:20:39.490 is fast enough and capable enough to handle twice as 00:20:39.490 --> 00:20:41.840 much traffic as any individual server. 00:20:41.840 --> 00:20:46.999 But that's generally accepted as feasible insofar as your website, 00:20:46.999 --> 00:20:48.790 your real intellectual property, is probably 00:20:48.790 --> 00:20:50.590 doing a lot of work-- talking to a database, 00:20:50.590 --> 00:20:53.506 writing out files, downloading things, or any number of other features 00:20:53.506 --> 00:20:56.800 that just take more effort than just routing data from one server 00:20:56.800 --> 00:20:58.990 to another as a load balancer does. 00:20:58.990 --> 00:21:00.970 But it doesn't matter how performant it is.
00:21:00.970 --> 00:21:04.240 If the load balancer breaks, goes offline for some reason, 00:21:04.240 --> 00:21:08.120 your entire infrastructure is inaccessible. 00:21:08.120 --> 00:21:09.980 So how do we solve this? 00:21:09.980 --> 00:21:13.510 How do we go about and architect a solution to this? 00:21:13.510 --> 00:21:15.910 Well, how did we address this issue earlier? 00:21:15.910 --> 00:21:20.260 We addressed the issue of insufficient capacity or potential downtime 00:21:20.260 --> 00:21:22.960 by just throwing hardware at the problem. 00:21:22.960 --> 00:21:25.940 And so maybe we could do that same thing here. 00:21:25.940 --> 00:21:29.260 Maybe we could just introduce a second load balancer. 00:21:29.260 --> 00:21:31.540 I'll call this LB as well. 00:21:31.540 --> 00:21:33.940 And now we somehow have to-- 00:21:33.940 --> 00:21:39.640 I feel like we're just endlessly going to be adding more and more rectangles 00:21:39.640 --> 00:21:40.540 to the picture. 00:21:40.540 --> 00:21:46.480 But somehow, we need to be able to load balance across now two servers and two 00:21:46.480 --> 00:21:47.980 load balancers. 00:21:47.980 --> 00:21:48.860 So how do we do this? 00:21:48.860 --> 00:21:52.660 Well, let me clean this up so that we have a bit more room to play with here 00:21:52.660 --> 00:21:57.260 and consider how a pair of load balancers might actually work. 00:21:57.260 --> 00:22:01.510 So if my first server is here and my second server is here, 00:22:01.510 --> 00:22:07.720 and I'm proposing now to have two load balancers-- one here and one here-- 00:22:07.720 --> 00:22:12.460 surely, both of these have to be able to talk to both servers. 00:22:12.460 --> 00:22:15.100 So we already have this necessity. 
00:22:15.100 --> 00:22:18.820 And somehow, traffic has to come from the internet 00:22:18.820 --> 00:22:23.497 into this set of load balancers, but probably only to one, 00:22:23.497 --> 00:22:25.330 because we don't want to solve this with DNS 00:22:25.330 --> 00:22:27.370 and just have two IP addresses out there. 00:22:27.370 --> 00:22:30.160 Because if one breaks, we can recreate the same problem 00:22:30.160 --> 00:22:32.090 as before if we're not careful. 00:22:32.090 --> 00:22:33.140 So what if we do this? 00:22:33.140 --> 00:22:37.390 What if we use this building block of heartbeats in another way as well? 00:22:37.390 --> 00:22:40.600 What if we ensure that our load balancers-- 00:22:40.600 --> 00:22:45.740 plural-- have just one IP address, which a moment ago seemed 00:22:45.740 --> 00:22:47.240 to create a single point of failure? 00:22:47.240 --> 00:22:48.590 But what if we do this? 00:22:48.590 --> 00:22:52.330 What if we also allow the load balancers to talk to, 00:22:52.330 --> 00:22:57.940 to communicate over a network with each other so that one of the load balancers 00:22:57.940 --> 00:23:00.940 is constantly saying to the other, I'm alive. 00:23:00.940 --> 00:23:02.020 I'm alive. 00:23:02.020 --> 00:23:03.290 I'm alive. 00:23:03.290 --> 00:23:06.310 And so what the load balancers could be configured to do 00:23:06.310 --> 00:23:10.400 is that only one of them operates at any given point in time. 
00:23:10.400 --> 00:23:14.830 But if the other server, the other load balancer, 00:23:14.830 --> 00:23:19.330 no longer hears from that primary load balancer because of the heartbeats 00:23:19.330 --> 00:23:21.790 that are ideally both being emitted in both directions 00:23:21.790 --> 00:23:25.150 so that they can both be assured of the other's up time-- 00:23:25.150 --> 00:23:29.110 if the secondary load balancer stops hearing the primary load balancer, 00:23:29.110 --> 00:23:32.560 the secondary load balancer can just presumptuously 00:23:32.560 --> 00:23:37.050 reconfigure itself to take on that one and only IP address, 00:23:37.050 --> 00:23:39.760 effectively assuming that the first load balancer is not going 00:23:39.760 --> 00:23:41.740 to be responding to any traffic anyway. 00:23:41.740 --> 00:23:46.120 And the second load balancer can simply take on the entire load itself. 00:23:46.120 --> 00:23:49.750 But the key difference now in this particular solution 00:23:49.750 --> 00:23:53.920 is that there's only one IP address that describes this whole architecture, only 00:23:53.920 --> 00:23:56.740 one IP address between the two load balancers 00:23:56.740 --> 00:24:01.210 so we don't risk those potential dead ends that we had a little bit ago 00:24:01.210 --> 00:24:03.710 with our back end servers. 00:24:03.710 --> 00:24:08.884 So now it's starting to get more robust, more highly available. 00:24:08.884 --> 00:24:09.800 So that's pretty good. 00:24:09.800 --> 00:24:11.800 We've solved most of these problems. 00:24:11.800 --> 00:24:17.590 We've generously, though, swept one problem underneath the rug, whereby 00:24:17.590 --> 00:24:20.217 every time I draw another rectangle-- 00:24:20.217 --> 00:24:22.300 not just the first time, but now the second time-- 00:24:22.300 --> 00:24:26.200 and add some interconnectivity, somehow, among them someone 00:24:26.200 --> 00:24:27.850 somewhere is spending some money. 
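The failover idea described above, where the secondary load balancer claims the one shared IP address once the primary goes quiet, might look roughly like this. It is a sketch under assumptions: the class name `StandbyLoadBalancer`, the timeout, and the example address are hypothetical, the actual network reconfiguration is reduced to a flag, and handing the address back to a recovered primary is deliberately left out because the lecture does not cover it.

```python
class StandbyLoadBalancer:
    """Secondary load balancer that claims the shared IP if the primary goes quiet."""

    def __init__(self, shared_ip, timeout_seconds=3.0):
        self.shared_ip = shared_ip
        # Assumed threshold; the lecture does not name a value.
        self.timeout_seconds = timeout_seconds
        self.last_primary_heartbeat = 0.0
        self.owns_shared_ip = False

    def on_primary_heartbeat(self, now):
        """Record the time at which the primary last said 'I'm alive'."""
        self.last_primary_heartbeat = now

    def tick(self, now):
        """Called periodically; returns True if this balancer is now active."""
        silent_for = now - self.last_primary_heartbeat
        if not self.owns_shared_ip and silent_for > self.timeout_seconds:
            # Stand-in for the real step of reconfiguring the network so that
            # traffic for shared_ip reaches this machine instead.
            self.owns_shared_ip = True
        return self.owns_shared_ip
```

Because only one machine answers on the shared address at any moment, clients never need to know that a second load balancer exists.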
00:24:27.850 --> 00:24:30.340 And indeed, I am solving these problems thus far 00:24:30.340 --> 00:24:33.800 by throwing money at the problem, and frankly introducing complexity. 00:24:33.800 --> 00:24:36.400 Already look at how many arrows or edges there 00:24:36.400 --> 00:24:40.060 are now, which might simply refer to physical wires, which is fine. 00:24:40.060 --> 00:24:43.990 But there's also a logical configuration that's now necessary. 00:24:43.990 --> 00:24:47.530 And God forbid we have a third load balancer for extra high availability 00:24:47.530 --> 00:24:49.420 or any number of servers here-- 00:24:49.420 --> 00:24:52.510 13 or 20 or 100 or 1,000 servers. 00:24:52.510 --> 00:24:54.910 It's a lot of cross-connections-- not just physically, 00:24:54.910 --> 00:24:58.120 but logically in terms of the requisite configuration. 00:24:58.120 --> 00:25:01.540 So this complexity does add up. 00:25:01.540 --> 00:25:04.690 And the cost certainly adds up. 00:25:04.690 --> 00:25:07.240 And now, once upon a time-- and not all that 00:25:07.240 --> 00:25:11.750 long ago-- if a company wanted to architect this kind of solution, 00:25:11.750 --> 00:25:14.590 you would literally buy two load balancers, 00:25:14.590 --> 00:25:17.290 and you would buy two or more web servers, 00:25:17.290 --> 00:25:19.690 and you would buy the requisite physical ethernet 00:25:19.690 --> 00:25:21.070 cables to interconnect the two. 00:25:21.070 --> 00:25:23.320 And you'd probably buy a whole bunch of other hardware 00:25:23.320 --> 00:25:26.279 that we've not even talked about, like firewalls and switches and more. 00:25:26.279 --> 00:25:28.361 But you would physically buy all of this hardware. 00:25:28.361 --> 00:25:30.520 You would physically connect all of this hardware 00:25:30.520 --> 00:25:35.710 and configure it to implement these several kinds of features. 
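To put a rough number on the cross-connections mentioned above, here is a small counting sketch. It assumes the topology drawn in the lecture, where every load balancer talks to every server and every pair of load balancers exchanges heartbeats; links to the internet, switches, and firewalls are not counted, and the function name `count_links` is made up for illustration.

```python
def count_links(load_balancers, servers):
    """Count logical connections in the assumed topology."""
    # Every load balancer must be able to reach every server.
    lb_to_server = load_balancers * servers
    # Every pair of load balancers exchanges heartbeats with each other.
    lb_to_lb = load_balancers * (load_balancers - 1) // 2
    return lb_to_server + lb_to_lb


print(count_links(2, 2))     # the picture drawn so far
print(count_links(3, 1000))  # three load balancers, 1,000 servers
```

Under these assumptions the two-by-two picture has 5 connections, while three load balancers in front of 1,000 servers need 3,003, each of which has to be configured as well as wired.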
00:25:35.710 --> 00:25:38.410 But the catch is that the more and more hardware 00:25:38.410 --> 00:25:42.310 you buy, just probabilistically, the more and more you 00:25:42.310 --> 00:25:44.320 invite some kind of failure. 00:25:44.320 --> 00:25:46.000 Maybe it's some stupid human error. 00:25:46.000 --> 00:25:49.310 But more realistically, one of your hard drives is going to fail. 00:25:49.310 --> 00:25:52.960 And hard drives are typically rated for the enterprise in terms of Mean Time 00:25:52.960 --> 00:25:57.230 Between Failure, MTBF, which generally means 00:25:57.230 --> 00:26:01.100 how long should you expect a hard drive to work on average before it fails. 00:26:01.100 --> 00:26:01.600 It breaks. 00:26:01.600 --> 00:26:02.900 It just stops working. 00:26:02.900 --> 00:26:05.500 So if you have a whole bunch of servers, each of which 00:26:05.500 --> 00:26:08.390 has a whole bunch of hard drives, at some point, 00:26:08.390 --> 00:26:11.862 combinatorially, one or more of those drives is just going to fail, 00:26:11.862 --> 00:26:13.820 which is to say you're going to have a problem, 00:26:13.820 --> 00:26:15.850 and you're going to have to fix it yourself. 00:26:15.850 --> 00:26:19.940 At some point, too, you're going to run out of physical space. 00:26:19.940 --> 00:26:22.850 In fact, perhaps one of the most constraining resources, 00:26:22.850 --> 00:26:25.530 especially for startups, is the physical space itself. 00:26:25.530 --> 00:26:28.780 You probably don't want to start housing your servers in your physical office, 00:26:28.780 --> 00:26:32.530 because you need a special room for it, typically, with enough cooling, 00:26:32.530 --> 00:26:36.750 with enough access, with enough electricity, and enough humans 00:26:36.750 --> 00:26:37.750 to actually maintain it. 
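The earlier point about Mean Time Between Failure can be made concrete with a little arithmetic. The sketch below assumes a simplified model that the lecture does not state: drives fail independently at a constant rate of 1/MTBF, so the chance that at least one of N drives fails within t hours is 1 - exp(-N * t / MTBF). The one-million-hour MTBF is an assumed figure for illustration only.

```python
import math


def prob_any_drive_fails(num_drives, hours, mtbf_hours):
    """Chance that at least one drive fails within the given number of hours.

    Assumes independent drives with a constant failure rate of 1 / mtbf_hours.
    """
    return 1.0 - math.exp(-num_drives * hours / mtbf_hours)


HOURS_PER_YEAR = 24 * 365
ASSUMED_MTBF = 1_000_000  # hypothetical rating, not taken from the lecture

print(prob_any_drive_fails(1, HOURS_PER_YEAR, ASSUMED_MTBF))
print(prob_any_drive_fails(1000, HOURS_PER_YEAR, ASSUMED_MTBF))
```

With these assumed numbers, a single drive has under a 1 percent chance of failing in a year, but across 1,000 drives the chance that at least one fails is above 99.9 percent, which is why more hardware invites more failures.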
00:26:37.750 --> 00:26:41.345 Or you graduate from your own office space and go to a data center, 00:26:41.345 --> 00:26:44.230 a co-location facility, whereby you maybe 00:26:44.230 --> 00:26:47.500 rent space in a physical cage with a locking door, 00:26:47.500 --> 00:26:49.690 inside of which you put racks of servers, 00:26:49.690 --> 00:26:54.100 just racked up on big metal poles, and you pack as many servers in there 00:26:54.100 --> 00:26:54.970 as you can. 00:26:54.970 --> 00:26:57.610 But at some point, you're going to be bumping up 00:26:57.610 --> 00:27:03.340 against other constrained resources-- physical space, actual power capacity, 00:27:03.340 --> 00:27:07.220 cooling, as well as the humans to actually run this. 00:27:07.220 --> 00:27:10.720 And so very quickly does operations, ops, 00:27:10.720 --> 00:27:14.890 so to speak, become an increasing cost and an increasing challenge. 00:27:14.890 --> 00:27:19.130 And one of the most alluring features of the cloud, so to speak, 00:27:19.130 --> 00:27:23.350 is that you can move all of these details off-site. 00:27:23.350 --> 00:27:28.710 And you can abstract many of these, let's say, implementation details 00:27:28.710 --> 00:27:32.770 away whereby you yourself don't have to worry about the physical wires. 00:27:32.770 --> 00:27:35.260 You don't have to worry about the make and model of servers 00:27:35.260 --> 00:27:36.051 that you're buying. 00:27:36.051 --> 00:27:39.700 You don't have to worry about things actually breaking, 00:27:39.700 --> 00:27:42.790 because someone else will deal with that for you. 00:27:42.790 --> 00:27:46.000 But you have to still understand the topology and the architecture 00:27:46.000 --> 00:27:51.080 and the features that you want to implement so that you can actually 00:27:51.080 --> 00:27:53.130 configure them in the cloud. 00:27:53.130 --> 00:27:55.830 So what do you actually get from cloud providers? 
00:27:55.830 --> 00:27:57.830 There's any number of them out there these days. 00:27:57.830 --> 00:28:01.760 But perhaps three of the biggest are Amazon, Google, and Microsoft, 00:28:01.760 --> 00:28:05.087 all of whom offer, these days, very similar palettes of options. 00:28:05.087 --> 00:28:06.920 And it's outright overwhelming, if you visit 00:28:06.920 --> 00:28:10.290 each of their web sites, just how many cloud products they offer. 00:28:10.290 --> 00:28:13.610 But they would generally offer a number of standard products 00:28:13.610 --> 00:28:16.740 in the cloud-- for instance, a virtualized server. 00:28:16.740 --> 00:28:19.430 So you don't have to physically buy a server these days 00:28:19.430 --> 00:28:22.970 and plug it into your own ethernet connection, your own internet 00:28:22.970 --> 00:28:24.470 connection in your own office. 00:28:24.470 --> 00:28:27.320 You can instead essentially rent a server 00:28:27.320 --> 00:28:29.550 in the cloud, which is to say that Amazon, Google, 00:28:29.550 --> 00:28:31.520 Microsoft, or any number of other companies 00:28:31.520 --> 00:28:34.850 will host that server physically for you, 00:28:34.850 --> 00:28:37.520 and they will take care of the issues of power and cooling. 00:28:37.520 --> 00:28:40.061 And if a hard drive fails, they will go remove the old one 00:28:40.061 --> 00:28:41.060 and plug in the new one. 00:28:41.060 --> 00:28:43.610 And ideally, they will provide you with backup services. 00:28:43.610 --> 00:28:46.970 But more sophisticated than that, they can also 00:28:46.970 --> 00:28:52.280 help us recreate, in software, this kind of topology.
00:28:52.280 --> 00:28:56.750 In other words, even without having a human physically wire together 00:28:56.750 --> 00:28:59.390 this kind of graph, so to speak, that we've been building up 00:28:59.390 --> 00:29:02.600 here logically, thanks to software these days, 00:29:02.600 --> 00:29:06.260 you can implement this whole paradigm-- 00:29:06.260 --> 00:29:08.720 not with physical cables, not with physical devices, 00:29:08.720 --> 00:29:10.610 but with software virtually. 00:29:10.610 --> 00:29:11.460 What does that mean? 00:29:11.460 --> 00:29:13.640 It means that humans, over the past several years, 00:29:13.640 --> 00:29:17.810 have been writing software that mimics the behavior of physical servers. 00:29:17.810 --> 00:29:21.290 Humans have been writing software that mimics the behavior of a router. 00:29:21.290 --> 00:29:26.060 Humans have been writing software that mimics the behavior of a load balancer. 00:29:26.060 --> 00:29:30.470 And implementing mimics the behavior of-- really, we're just building, 00:29:30.470 --> 00:29:34.940 in software, what historically might have been implemented entirely 00:29:34.940 --> 00:29:35.682 in hardware. 00:29:35.682 --> 00:29:37.640 And even that's a bit of an oversimplification. 00:29:37.640 --> 00:29:40.077 Because even when something is bought as hardware, 00:29:40.077 --> 00:29:43.160 there is, of course, software running on that hardware that actually makes 00:29:43.160 --> 00:29:44.360 it do something. 00:29:44.360 --> 00:29:46.730 But they're no longer dedicated devices. 00:29:46.730 --> 00:29:50.750 You can use generic commodity PC server hardware, really, 00:29:50.750 --> 00:29:54.740 and transform that hardware into a certain role, a back end web 00:29:54.740 --> 00:29:58.550 server, a back end database, a load balancer, a router, a switch, 00:29:58.550 --> 00:30:00.180 any number of other things. 
00:30:00.180 --> 00:30:02.930 And so what you're getting from companies like Amazon and Google 00:30:02.930 --> 00:30:06.230 and Microsoft and more is the ability to build up 00:30:06.230 --> 00:30:09.050 your infrastructure in software. 00:30:09.050 --> 00:30:16.190 In fact, the buzzword here, the acronym, is IaaS, Infrastructure as a Service. 00:30:16.190 --> 00:30:19.910 So you sign up for an account on any of those companies' cloud services web 00:30:19.910 --> 00:30:23.030 sites, and you put in your credit card information or your invoicing 00:30:23.030 --> 00:30:26.780 information, and you literally, via a command line tool-- so a keyboard, 00:30:26.780 --> 00:30:30.380 or via a nice, web-based graphical user interface, GUI-- 00:30:30.380 --> 00:30:34.907 point and click and say, give me two servers and one load balancer. 00:30:34.907 --> 00:30:36.740 Or if you have enough money in the bank, you 00:30:36.740 --> 00:30:40.040 say give me two servers and two load balancers 00:30:40.040 --> 00:30:41.990 configured for high availability. 00:30:41.990 --> 00:30:44.360 Or better yet, you don't say any of that. 00:30:44.360 --> 00:30:48.410 You just tell the provider, give me a web server and give me a load balancer, 00:30:48.410 --> 00:30:52.710 and you deal with the process of scaling those things as needed.
00:30:52.710 --> 00:30:56.360 In fact, a buzzword du jour is auto scaling, which refers to a feature, 00:30:56.360 --> 00:30:59.720 implemented in software, whereby if a cloud 00:30:59.720 --> 00:31:03.740 provider notices that your servers are getting a lot of traffic-- 00:31:03.740 --> 00:31:05.880 business is good, or it's the holiday season, 00:31:05.880 --> 00:31:10.220 and you are bumping up against just how many users your one or two 00:31:10.220 --> 00:31:12.230 or three or more servers can handle-- 00:31:12.230 --> 00:31:17.030 auto scaling is a feature that will enable the cloud provider to just turn 00:31:17.030 --> 00:31:21.320 on, virtually, more servers for you so that you go from two to three 00:31:21.320 --> 00:31:21.980 automatically. 00:31:21.980 --> 00:31:25.770 You can be happily asleep in the middle of the night, 00:31:25.770 --> 00:31:28.780 and even though your traffic is peaking, it doesn't matter. 00:31:28.780 --> 00:31:30.830 Your architecture is going to auto scale. 00:31:30.830 --> 00:31:33.320 And better yet-- especially financially-- 00:31:33.320 --> 00:31:37.820 if the cloud provider notices, maybe 12 hours later-- oh, all of your customers 00:31:37.820 --> 00:31:41.270 have gone to sleep, we don't really need all of this excess capacity. 00:31:41.270 --> 00:31:43.400 Or maybe the holidays are now in the past. 00:31:43.400 --> 00:31:45.200 You really don't need this excess capacity. 00:31:45.200 --> 00:31:50.090 Auto scaling also dictates that those servers can be virtually turned off. 00:31:50.090 --> 00:31:51.435 So you're no longer using them. 00:31:51.435 --> 00:31:53.060 You're no longer load balancing to them. 00:31:53.060 --> 00:31:56.310 And most importantly, you're no longer paying for them. 00:31:56.310 --> 00:32:00.010 So this is a really, really nice value add at this point. 00:32:00.010 --> 00:32:02.780 There's no human crawling around on the floor rewiring things 00:32:02.780 --> 00:32:04.130 and plugging in new servers.
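The decision at the heart of auto scaling can be sketched as a single function. This is a simplified illustration, not any cloud provider's actual policy: the function name `desired_server_count`, the idea of a fixed per-server capacity, and the minimum and maximum bounds are all assumptions.

```python
import math


def desired_server_count(requests_per_second, capacity_per_server,
                         min_servers=2, max_servers=10):
    """Pick how many servers should be running for the current traffic."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Enough servers to cover the load, rounded up to a whole server.
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Never drop below the floor kept for high availability,
    # and never exceed the ceiling set to cap spending.
    return max(min_servers, min(max_servers, needed))
```

Called repeatedly with fresh traffic measurements, the same function covers both directions: it asks for a third server when traffic peaks and falls back to two once the customers have gone to sleep.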
00:32:04.130 --> 00:32:07.460 There's no finance person having to approve the PO to actually order more 00:32:07.460 --> 00:32:09.220 servers just to increase your capacity. 00:32:09.220 --> 00:32:13.310 And most importantly, there is no latency between the time when 00:32:13.310 --> 00:32:16.580 you notice, oh, my god, we're getting really successful 00:32:16.580 --> 00:32:18.527 and can't handle our load-- uh oh. 00:32:18.527 --> 00:32:20.360 It's going to be a two, three-week lead time 00:32:20.360 --> 00:32:22.460 before we can even get in the more servers. 00:32:22.460 --> 00:32:25.910 Thanks to cloud computing, you can literally log in to Amazon's, Google's, 00:32:25.910 --> 00:32:28.210 Microsoft's web site and, click, click, click, 00:32:28.210 --> 00:32:32.360 have more server capacity within seconds, within minutes, 00:32:32.360 --> 00:32:37.790 far faster than the physical world traditionally allowed. 00:32:37.790 --> 00:32:40.460 So those are just some of the features now 00:32:40.460 --> 00:32:45.320 that we gain from outsourcing to the so-called cloud. 00:32:45.320 --> 00:32:48.510 So where does some of this capability come from? 00:32:48.510 --> 00:32:51.960 Well, it turns out that over the past many years, 00:32:51.960 --> 00:32:54.230 humans have been getting better and better and better 00:32:54.230 --> 00:32:58.460 at packing more physical hardware into the same form factor, 00:32:58.460 --> 00:32:59.720 into the same physical space. 00:32:59.720 --> 00:33:02.420 So at the level of CPUs, the brains of a computer, 00:33:02.420 --> 00:33:05.810 we humans have gotten much better at packing more and more transistors, 00:33:05.810 --> 00:33:07.520 for instance, onto a CPU. 00:33:07.520 --> 00:33:11.070 And transistors are the little switches that can turn things on and off-- 00:33:11.070 --> 00:33:12.670 0 and 1, 1 and 0. 
00:33:12.670 --> 00:33:14.570 So you can store more information and you 00:33:14.570 --> 00:33:17.260 can do more with that information more quickly. 00:33:17.260 --> 00:33:19.820 CPUs today also have more cores, which you 00:33:19.820 --> 00:33:23.180 can think of as mini CPUs inside of the main CPU, 00:33:23.180 --> 00:33:25.550 so that a computer with multiple cores can literally 00:33:25.550 --> 00:33:28.280 do multiple things at a time. 00:33:28.280 --> 00:33:32.060 But the funny thing is that we humans, over the past decade or two, 00:33:32.060 --> 00:33:35.150 really haven't been getting fundamentally faster at life. 00:33:35.150 --> 00:33:38.090 At the end of the day, I can only check my email so quickly. 00:33:38.090 --> 00:33:39.990 I can only post on Facebook so quickly. 00:33:39.990 --> 00:33:43.670 I can only check out from Amazon so quickly. 00:33:43.670 --> 00:33:47.784 Because we humans have, of course, a finite speed to ourselves. 00:33:47.784 --> 00:33:49.950 We're not just getting-- we're not doubling in speed 00:33:49.950 --> 00:33:52.790 a la Moore's law every year or two. 00:33:52.790 --> 00:33:57.140 So we have, it would seem, a lot of excess computing capacity these days. 00:33:57.140 --> 00:34:00.080 Computers are getting so darn fast, we don't necessarily 00:34:00.080 --> 00:34:03.830 know what to do with all of these CPU cycles and with all of the RAM 00:34:03.830 --> 00:34:06.860 that we can fit into the same physical box at half the price 00:34:06.860 --> 00:34:08.989 that it cost us last year. 00:34:08.989 --> 00:34:12.469 And so manufacturers and companies realize 00:34:12.469 --> 00:34:17.179 that we could actually build a business on this increased capacity. 00:34:17.179 --> 00:34:23.277 We can implement the computer equivalent of timesharing, so to speak, 00:34:23.277 --> 00:34:25.610 which has long been with us in the history of computing. 
00:34:25.610 --> 00:34:27.620 But we can do this on a much more massive scale 00:34:27.620 --> 00:34:33.679 now by taking one physical server that has maybe two CPUs, or 16 CPUs, 00:34:33.679 --> 00:34:38.570 or 64 CPUs, and maybe gigabytes-- 00:34:38.570 --> 00:34:41.090 tens of gigabytes or hundreds of gigabytes of RAM-- 00:34:41.090 --> 00:34:45.110 all inside of the same physical device, plug it in to an internet connection, 00:34:45.110 --> 00:34:50.810 and then run special software on that one server that creates the illusion 00:34:50.810 --> 00:34:54.920 that there's multiple servers living inside of that box. 00:34:54.920 --> 00:34:58.850 And this virtualization software is implemented 00:34:58.850 --> 00:35:02.480 by way of software called a virtual machine, or virtual machine monitor, 00:35:02.480 --> 00:35:04.430 or another word might be hypervisor. 00:35:04.430 --> 00:35:07.190 There's different ways to describe essentially the same thing. 00:35:07.190 --> 00:35:11.390 But a virtual machine is a piece of software 00:35:11.390 --> 00:35:15.590 running on a computer inside of which is running some other operating system, 00:35:15.590 --> 00:35:16.370 typically. 00:35:16.370 --> 00:35:19.070 So you might have one server running Windows. 00:35:19.070 --> 00:35:24.620 But inside of that server are multiple virtual machines, each of which 00:35:24.620 --> 00:35:25.880 itself is running Windows. 00:35:25.880 --> 00:35:29.509 So you might be able to chop up one computer into 10, or even into 100. 00:35:29.509 --> 00:35:31.550 Or perhaps more commonly, you might have a server 00:35:31.550 --> 00:35:34.340 running Linux or some Unix-based operating system, 00:35:34.340 --> 00:35:35.930 also with virtual machines on it. 00:35:35.930 --> 00:35:37.721 But those virtual machines might be running 00:35:37.721 --> 00:35:42.797 Linux themselves, or Unix, or Windows, or any number of versions of Windows. 00:35:42.797 --> 00:35:43.880 And so this is the beauty. 
00:35:43.880 --> 00:35:48.080 When you have so much excess capacity and so many 00:35:48.080 --> 00:35:50.150 available CPU cycles and so much RAM, you 00:35:50.150 --> 00:35:56.490 can slice that up and then sell portions of the server's capacity to customers. 00:35:56.490 --> 00:36:01.310 And if you're really clever, you might look at your customers' usage patterns 00:36:01.310 --> 00:36:05.810 and realize that, you know what, it's not necessarily 00:36:05.810 --> 00:36:11.360 as simple as just taking my server and dividing it up into n different slices, 00:36:11.360 --> 00:36:13.700 where n is a generic variable for number, 00:36:13.700 --> 00:36:17.824 and then selling it or renting that space, really, to n customers. 00:36:17.824 --> 00:36:18.740 Because you know what? 00:36:18.740 --> 00:36:21.800 Some of those customers might have some booming businesses, which is great. 00:36:21.800 --> 00:36:24.110 But some of those customers might not have many users. 00:36:24.110 --> 00:36:26.129 Maybe it's a few dozen. 00:36:26.129 --> 00:36:27.170 Maybe it's a few hundred. 00:36:27.170 --> 00:36:29.280 But it's really a drop in the bucket. 00:36:29.280 --> 00:36:34.580 So instead of selling my computing resources to just n customers, 00:36:34.580 --> 00:36:37.730 maybe I'll sell it to twice as many customers or three times 00:36:37.730 --> 00:36:41.690 as many customers, and essentially over-sell my server's capacity, 00:36:41.690 --> 00:36:44.480 but expect that on average, this is just going 00:36:44.480 --> 00:36:47.570 to work out because some customers will be using a lot of those cycles 00:36:47.570 --> 00:36:49.460 because business is good, and some won't be, 00:36:49.460 --> 00:36:51.501 because it's just they don't have many customers, 00:36:51.501 --> 00:36:55.580 or really, it's a personal website that doesn't get much usage anyway.
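The overselling bet described above comes down to comparing what was sold with what customers actually use. Here is a minimal sketch with invented numbers: the function name `oversell_report`, the capacity units, and the usage figures are all hypothetical.

```python
def oversell_report(server_capacity, slice_size, customer_usage):
    """Compare capacity sold against capacity actually used.

    customer_usage lists each customer's real usage, in the same
    (arbitrary) units as server_capacity and slice_size.
    """
    # How many customers fit if every slice were fully reserved.
    honest_slots = server_capacity // slice_size
    total_demand = sum(customer_usage)
    return {
        "oversell_ratio": len(customer_usage) / honest_slots,
        "utilization": total_demand / server_capacity,
        "overloaded": total_demand > server_capacity,
    }


# 20 customers sold onto a server sized for 10 slices:
# 4 booming businesses use their full slice, 16 quiet sites use very little.
quiet_mix = [10] * 4 + [2] * 16
print(oversell_report(100, 10, quiet_mix))

# The same 20 customers, but 12 of them are busy at the same time.
busy_mix = [10] * 12 + [2] * 8
print(oversell_report(100, 10, busy_mix))
```

In the first case the server is sold twice over yet runs comfortably below capacity. In the second, demand exceeds what the box can deliver, which is the slowdown risk that comes with an oversold host.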
00:36:55.580 --> 00:36:58.490 And so for some time, there has, of course, 00:36:58.490 --> 00:37:01.700 been this risk, when you sign up for a web hosting company or a cloud 00:37:01.700 --> 00:37:05.450 provider, that your web site actually might get really slow for reasons 00:37:05.450 --> 00:37:07.020 outside of your control. 00:37:07.020 --> 00:37:11.780 If you are co-located on a server that some other booming business is on, 00:37:11.780 --> 00:37:17.120 your users might actually suffer if your web host has oversold itself. 00:37:17.120 --> 00:37:19.249 And so in fact, this is one of those situations 00:37:19.249 --> 00:37:20.540 where you get what you pay for. 00:37:20.540 --> 00:37:23.990 If you're googling around and finding various cloud providers, 00:37:23.990 --> 00:37:26.360 or web hosting companies more specifically, 00:37:26.360 --> 00:37:30.050 you might be able to find a deal, like $10 per month or $50 per month, 00:37:30.050 --> 00:37:33.590 as opposed to $100 or $200 or more per month. 00:37:33.590 --> 00:37:37.070 And you do get what you pay for, because those fly-by-night operations that 00:37:37.070 --> 00:37:41.960 are selling you space and capacity super cheaply probably 00:37:41.960 --> 00:37:44.000 are overselling and over-committing. 00:37:44.000 --> 00:37:45.740 So these are the trade-offs, too-- 00:37:45.740 --> 00:37:48.590 how much money do you want to save versus how much risk 00:37:48.590 --> 00:37:50.300 do you actually want to take on? 00:37:50.300 --> 00:37:53.549 Generally, it's safer to go with some of the bigger fish these days, certainly 00:37:53.549 --> 00:37:57.500 when building a business, as you might on a company like Amazon or Google 00:37:57.500 --> 00:38:00.300 or Microsoft or derivatives thereof. 00:38:00.300 --> 00:38:02.540 So just to paint a more concrete technical picture 00:38:02.540 --> 00:38:06.534 of what virtualization is, here's a picture, as you might think of it. 
00:38:06.534 --> 00:38:08.450 So you have your physical infrastructure here. 00:38:08.450 --> 00:38:12.080 So that's the actual server from Dell or IBM or whoever. 00:38:12.080 --> 00:38:14.870 Then you have the host operating system, which might be Windows, 00:38:14.870 --> 00:38:18.896 but is often Linux or some variant of Unix instead. 00:38:18.896 --> 00:38:20.270 And then you have the hypervisor. 00:38:20.270 --> 00:38:22.940 This is the piece of software that you install 00:38:22.940 --> 00:38:27.800 on your server that allows you to run multiple virtual machines on top of it. 00:38:27.800 --> 00:38:29.995 And those virtual machines can each run any number 00:38:29.995 --> 00:38:32.870 of different operating systems themselves, or even different versions 00:38:32.870 --> 00:38:34.130 of operating systems. 00:38:34.130 --> 00:38:38.359 And so depicted here up top are the disparate guest OS operating 00:38:38.359 --> 00:38:39.650 systems that might be on there. 00:38:39.650 --> 00:38:42.670 Maybe this is Linux and Solaris, and this is Windows itself, 00:38:42.670 --> 00:38:44.170 or any number of other combinations. 00:38:44.170 --> 00:38:47.870 Whatever your customers want or whatever you want to provide or essentially 00:38:47.870 --> 00:38:50.810 rent to customers, you can install. 00:38:50.810 --> 00:38:52.470 But you do pay a price. 00:38:52.470 --> 00:38:54.920 So as beautiful as this situation is, and as clever 00:38:54.920 --> 00:38:59.180 as it is that we're leveraging these excess resources by slicing up 00:38:59.180 --> 00:39:04.190 one server into the illusion of, in this case, three, or more generally more, 00:39:04.190 --> 00:39:05.780 there is some overhead. 00:39:05.780 --> 00:39:10.670 Because this hypervisor has to be a middleman between your guest operating 00:39:10.670 --> 00:39:13.562 systems and your host operating system, the one actually 00:39:13.562 --> 00:39:15.020 physically installed on the server. 
00:39:15.020 --> 00:39:17.600 And any layers of indirection like this, so to speak, 00:39:17.600 --> 00:39:19.380 have got to cost you some amount of time. 00:39:19.380 --> 00:39:21.500 If there's some work being done here and you only 00:39:21.500 --> 00:39:23.780 have a finite number of resources, the hypervisor 00:39:23.780 --> 00:39:27.050 itself is surely consuming some of your resources. 00:39:27.050 --> 00:39:28.970 And gosh, this just seems really inefficient, 00:39:28.970 --> 00:39:32.900 especially if all of your customers are using the same operating system. 00:39:32.900 --> 00:39:38.300 My god, why do you have to have copies of the same OS multiply installed? 00:39:38.300 --> 00:39:42.920 This just doesn't feel like it's leveraging much economy of scale. 00:39:42.920 --> 00:39:46.940 And so it turns out there's a newer technology that's gaining steam, 00:39:46.940 --> 00:39:51.260 and this is known not as virtualization, per se, but containerization, 00:39:51.260 --> 00:39:54.980 the most popular instance of which is perhaps a company called Docker. 00:39:54.980 --> 00:39:57.950 And the world of Docker is a little shorter. 00:39:57.950 --> 00:40:00.585 It's a little smarter about how resources are shared. 00:40:00.585 --> 00:40:02.960 You still have your infrastructure, your physical server, 00:40:02.960 --> 00:40:04.876 and you still have your host operating system, 00:40:04.876 --> 00:40:08.100 whether it's Linux or Unix or something like that. 00:40:08.100 --> 00:40:11.810 But then instead of a hypervisor, you have the Docker engine, 00:40:11.810 --> 00:40:15.680 which is really just an equivalent of that base layer of software. 00:40:15.680 --> 00:40:17.930 But notice what's different. 00:40:17.930 --> 00:40:22.387 In this case here, we've collapsed the previous picture. 00:40:22.387 --> 00:40:25.220 In fact, thanks to our friends at Docker who put this together here, 00:40:25.220 --> 00:40:27.800 the guest OS has disappeared. 
00:40:27.800 --> 00:40:29.940 And you instead have your different applications 00:40:29.940 --> 00:40:31.730 and your different binaries and libraries, 00:40:31.730 --> 00:40:34.820 as this abbreviation means, all running on the Docker engine. 00:40:34.820 --> 00:40:36.180 Now, what does this mean? 00:40:36.180 --> 00:40:38.660 This means when running Docker, you typically 00:40:38.660 --> 00:40:40.370 choose your operating system-- 00:40:40.370 --> 00:40:44.920 for instance, Ubuntu Linux or Debian Linux or something else altogether-- 00:40:44.920 --> 00:40:49.400 and then you essentially share that one operating system 00:40:49.400 --> 00:40:52.610 across multiple containers. 00:40:52.610 --> 00:40:55.100 Instead of virtual machines, we now have containers. 00:40:55.100 --> 00:40:58.460 So in other words, you ensure that your different slices all 00:40:58.460 --> 00:41:01.850 share some common software-- the kernel, so to speak, 00:41:01.850 --> 00:41:05.060 the base core of the operating system. 00:41:05.060 --> 00:41:09.080 But then you uniquely layer on top of that base system, 00:41:09.080 --> 00:41:13.520 that base set of default files, whatever customizations your customers or you 00:41:13.520 --> 00:41:16.880 yourself want, but you share some of the resources. 00:41:16.880 --> 00:41:19.700 And long story short, what this means is that containers 00:41:19.700 --> 00:41:21.740 tend to be a little lighter weight. 00:41:21.740 --> 00:41:25.490 There's less waste of resources because there's less overhead of running them, 00:41:25.490 --> 00:41:29.720 which is to say that you can generally start them even more quickly. 
00:41:29.720 --> 00:41:34.070 And better yet, you can still isolate your different products 00:41:34.070 --> 00:41:37.190 and your different services-- database and web server and email 00:41:37.190 --> 00:41:38.900 server and any number of other features-- 00:41:38.900 --> 00:41:43.340 all within the illusion of their own installation, their own operating 00:41:43.340 --> 00:41:47.030 system, even though there are some shared resources here. 00:41:47.030 --> 00:41:51.950 So this, too, has been made possible by the capabilities of modern hardware 00:41:51.950 --> 00:41:54.590 and the cleverness, frankly, of humans in actually 00:41:54.590 --> 00:42:01.470 finding solutions or creative uses for those available resources. 00:42:01.470 --> 00:42:05.190 But what other features or topics come into play 00:42:05.190 --> 00:42:07.400 in this world of cloud computing? 00:42:07.400 --> 00:42:11.051 We've talked about availability and caching and costing, really 00:42:11.051 --> 00:42:13.050 figuring out where we're going to actually spend 00:42:13.050 --> 00:42:17.430 our money by throwing hardware at problems and scaling more generally. 00:42:17.430 --> 00:42:19.830 But there's also issues of replication, which 00:42:19.830 --> 00:42:22.890 actually do relate to high availability, so to speak. 00:42:22.890 --> 00:42:24.900 But replication refers to duplication of data, 00:42:24.900 --> 00:42:27.360 and really backups more generally as a topic. 00:42:27.360 --> 00:42:30.570 And then there's also some other funky acronyms that are very much in vogue 00:42:30.570 --> 00:42:31.230 these days. 00:42:31.230 --> 00:42:33.390 Besides Infrastructure as a Service, there's 00:42:33.390 --> 00:42:40.280 also Platform as a Service, PaaS, or Software as a Service, SaaS. 00:42:40.280 --> 00:42:43.920 Now, SaaS, even if you've not used it under this name, 00:42:43.920 --> 00:42:45.360 odds are you have been using it. 
00:42:45.360 --> 00:42:50.370 If you do use Gmail or Outlook.com or any web-based email service, 00:42:50.370 --> 00:42:52.740 you are using software as a service. 00:42:52.740 --> 00:42:55.572 You don't really know, or need to care, where in the world 00:42:55.572 --> 00:42:58.530 your emails physically live, or how many servers they're spread across, 00:42:58.530 --> 00:43:00.821 or how your data is backed up, or for that matter, when 00:43:00.821 --> 00:43:03.960 you click Send, how the email even gets from point A to point B. 00:43:03.960 --> 00:43:07.710 You are treating Gmail and Outlook as a software 00:43:07.710 --> 00:43:12.690 as a service with all of the underlying implementation details abstracted away. 00:43:12.690 --> 00:43:15.870 You just don't know or care how it's implemented-- well, 00:43:15.870 --> 00:43:17.760 at least if everything is working. 00:43:17.760 --> 00:43:21.090 You probably do care if something goes down. 00:43:21.090 --> 00:43:24.870 But there's this intermediate step between this extreme form 00:43:24.870 --> 00:43:29.010 of abstraction where all you see is just the top-level service. 00:43:29.010 --> 00:43:32.100 And the lowest level implementation that we've 00:43:32.100 --> 00:43:34.110 discussed, which is infrastructure as a service, 00:43:34.110 --> 00:43:36.690 whereby when using something like Amazon, 00:43:36.690 --> 00:43:39.720 you literally click the button that says give me a load balancer. 00:43:39.720 --> 00:43:42.214 You literally click a button that says give me two servers. 00:43:42.214 --> 00:43:44.130 You literally click a button that says give me 00:43:44.130 --> 00:43:46.560 a firewall or any number of other features. 
00:43:46.560 --> 00:43:49.560 So Amazon and Microsoft and Google, to some extent, 00:43:49.560 --> 00:43:52.590 have all implemented these low-level services 00:43:52.590 --> 00:43:55.050 that still require that you understand the technology, 00:43:55.050 --> 00:43:59.250 and you understand networking, and you understand scaling and availability. 00:43:59.250 --> 00:44:03.930 But you can so much more easily and inexpensively and efficiently-- 00:44:03.930 --> 00:44:08.960 literally with just a laptop or desktop, without any data center of your own-- 00:44:08.960 --> 00:44:12.450 stitch together the topology or the architecture that you actually 00:44:12.450 --> 00:44:14.640 want, albeit in the cloud. 00:44:14.640 --> 00:44:17.820 Platform as a service, though, has arisen as a middle ground 00:44:17.820 --> 00:44:20.557 here, whereby you might have services like Heroku, 00:44:20.557 --> 00:44:22.890 which you might have heard of, which themselves actually 00:44:22.890 --> 00:44:28.950 run on infrastructures like Amazon or Google or Microsoft or the like. 00:44:28.950 --> 00:44:32.100 But they provide themselves a layer of abstraction 00:44:32.100 --> 00:44:34.650 that isn't quite as high level, so to speak, as what 00:44:34.650 --> 00:44:36.510 you get from software as a service. 00:44:36.510 --> 00:44:40.500 In fact, these platforms as a service don't provide you with applications. 00:44:40.500 --> 00:44:44.350 They just make it easier for you to run your applications in the cloud. 00:44:44.350 --> 00:44:45.550 Now, what does that mean? 00:44:45.550 --> 00:44:49.740 Well, it's all fun and exciting to understand load balancing 00:44:49.740 --> 00:44:52.020 and understand networking and understand the need 00:44:52.020 --> 00:44:56.580 for multiple servers and the entire conversation that we've had thus far.
00:44:56.580 --> 00:44:59.390 But at the end of the day, if I'm a software developer 00:44:59.390 --> 00:45:01.380 or I'm trying to build a business, all I care 00:45:01.380 --> 00:45:05.940 about is making my internet application available to real users. 00:45:05.940 --> 00:45:09.390 I really don't care about how many servers I have, 00:45:09.390 --> 00:45:12.802 how many databases I have, how the load balancers talk to one another. 00:45:12.802 --> 00:45:14.760 That's all fine and intellectually interesting. 00:45:14.760 --> 00:45:17.230 But I just want to get real work done. 00:45:17.230 --> 00:45:19.140 So I'm willing to pay a bit more for this. 00:45:19.140 --> 00:45:22.590 I'm willing to pay some middleman, like a Heroku, or any number 00:45:22.590 --> 00:45:24.870 of other services, a platform as a service, 00:45:24.870 --> 00:45:27.450 to abstract away those kinds of details. 00:45:27.450 --> 00:45:30.180 So I have the wherewithal, and I have the willingness 00:45:30.180 --> 00:45:33.740 to actually say host this as a web server. 00:45:33.740 --> 00:45:34.890 So give me a web server. 00:45:34.890 --> 00:45:37.920 I will pay you some number of dollars per month to give me a web server. 00:45:37.920 --> 00:45:41.310 But I want you, Heroku, to deal with the auto scaling of it. 00:45:41.310 --> 00:45:43.330 I don't care how many servers it is. 00:45:43.330 --> 00:45:44.910 I don't care how they are connected. 00:45:44.910 --> 00:45:46.784 I don't care anything about these heartbeats. 00:45:46.784 --> 00:45:49.880 I just want to have the illusion, for my own sake, 00:45:49.880 --> 00:45:53.880 of just one server that somehow grows or shrinks 00:45:53.880 --> 00:45:56.400 dynamically to handle my customer base. 00:45:56.400 --> 00:45:59.381 Meanwhile, things like load balancing, I just want my customers 00:45:59.381 --> 00:46:00.630 to be able to reach my server. 00:46:00.630 --> 00:46:02.046 I don't care how it's implemented.
00:46:02.046 --> 00:46:04.470 I don't care how it's made to be highly available. 00:46:04.470 --> 00:46:06.120 I just want that to work. 00:46:06.120 --> 00:46:10.050 And so companies like Heroku provide these platforms 00:46:10.050 --> 00:46:12.924 as a service that just make your life a little bit easier. 00:46:12.924 --> 00:46:15.840 And you don't have to think about or know about or worry about as many 00:46:15.840 --> 00:46:16.560 of these details. 00:46:16.560 --> 00:46:18.140 Now, to be fair, if something breaks, you 00:46:18.140 --> 00:46:20.140 might not understand exactly what's going wrong, 00:46:20.140 --> 00:46:22.200 and you yourself might not be able to solve it. 00:46:22.200 --> 00:46:27.810 Indeed, you might be entirely at the mercy of the cloud provider, or the PaaS 00:46:27.810 --> 00:46:30.150 provider, to solve the problem for you. 00:46:30.150 --> 00:46:32.280 But you're saving time. 00:46:32.280 --> 00:46:34.132 You're saving energy elsewhere by not having 00:46:34.132 --> 00:46:36.840 to worry about those lower-level implementation details, at least 00:46:36.840 --> 00:46:37.631 in the common case. 00:46:37.631 --> 00:46:42.030 But odds are you're paying a little more to Heroku than you would to an Amazon 00:46:42.030 --> 00:46:46.480 directly because they're providing you with this value-added service. 00:46:46.480 --> 00:46:48.900 So as cryptic as these acronyms really are, 00:46:48.900 --> 00:46:51.210 they're really just referring to disparate levels 00:46:51.210 --> 00:46:54.300 of abstraction, all of which somehow relate to the cloud. 00:46:54.300 --> 00:46:56.790 But infrastructure as a service is a virtualization 00:46:56.790 --> 00:46:59.880 of these hardware ideas, the physical cabling 00:46:59.880 --> 00:47:01.770 that we drew here on the screen. 00:47:01.770 --> 00:47:04.080 Software as a service really is just that application 00:47:04.080 --> 00:47:05.288 that the user interacts with.
00:47:05.288 --> 00:47:09.090 And platform as a service is an intermediate step, 00:47:09.090 --> 00:47:12.180 whereby you, in building your software in the cloud, 00:47:12.180 --> 00:47:16.710 can worry a little bit about how to actually make it available to users. 00:47:16.710 --> 00:47:19.320 But let's consider one other challenge now-- 00:47:19.320 --> 00:47:23.310 that of database replication since, of course, thus far, 00:47:23.310 --> 00:47:26.340 we've been talking about a web server as though it's the entire picture. 00:47:26.340 --> 00:47:28.560 But the reality is most any business that 00:47:28.560 --> 00:47:31.002 has a web-based presence or a mobile presence 00:47:31.002 --> 00:47:32.460 is going to be storing information. 00:47:32.460 --> 00:47:35.249 When users register, when users check something out, 00:47:35.249 --> 00:47:38.040 add something to their shopping cart, so to speak, all of that data 00:47:38.040 --> 00:47:40.560 needs to somehow be stored. 00:47:40.560 --> 00:47:44.830 So let's consider now what the world really likely looks like. 00:47:44.830 --> 00:47:47.070 So here is my laptop again. 00:47:47.070 --> 00:47:51.480 And here is the cloud that's between me and some service 00:47:51.480 --> 00:47:52.860 that I'm interested in. 00:47:52.860 --> 00:47:55.695 We'll assume for now that there is some kind of load balancing. 00:47:55.695 --> 00:47:57.570 And I'm just going to draw it a little bigger 00:47:57.570 --> 00:48:00.957 this time to suggest that-- let's just think of it now as a black box. 00:48:00.957 --> 00:48:02.040 And maybe it's one server. 00:48:02.040 --> 00:48:02.970 Maybe it's two. 00:48:02.970 --> 00:48:03.930 Maybe it's more. 00:48:03.930 --> 00:48:07.200 But somehow or other, load balancing is implemented. 
00:48:07.200 --> 00:48:09.990 Then I'm going to have all of my servers here, 00:48:09.990 --> 00:48:14.790 which we'll abstract away as maybe three or more at this point-- one, two, 00:48:14.790 --> 00:48:16.770 and then we'll call this n. 00:48:16.770 --> 00:48:20.200 But a web server typically does not do everything these days. 00:48:20.200 --> 00:48:22.530 In fact, it's been trending for some time 00:48:22.530 --> 00:48:25.440 to actually have different servers or different virtual machines, 00:48:25.440 --> 00:48:27.960 or even more recently, different containers 00:48:27.960 --> 00:48:30.180 each provide individual services. 00:48:30.180 --> 00:48:32.440 Sometimes people call these microservices 00:48:32.440 --> 00:48:36.030 if a container only does one, and one very narrowly defined thing, 00:48:36.030 --> 00:48:39.540 like send emails, or save information to a database, 00:48:39.540 --> 00:48:42.610 or respond to HTTP requests. 00:48:42.610 --> 00:48:46.830 So these back end web servers are not the only types of servers we have. 00:48:46.830 --> 00:48:49.780 Odds are we at least have one database. 00:48:49.780 --> 00:48:53.160 So let's consider now the implication of all 00:48:53.160 --> 00:48:56.400 of these architectural decisions we've made thus far 00:48:56.400 --> 00:48:59.400 on how we actually store our data. 00:48:59.400 --> 00:49:03.780 So in simplest form, our database might look like this. 00:49:03.780 --> 00:49:06.510 And for historical reasons, it's generally drawn as a cylinder. 00:49:06.510 --> 00:49:08.580 And this is our database. 00:49:08.580 --> 00:49:12.210 Now, it's immediately obvious that if all servers-- 00:49:12.210 --> 00:49:16.620 1, 2, dot, dot, dot, n-- need to save information or read information 00:49:16.620 --> 00:49:20.800 from a database, they've all got to somehow communicate with that database 00:49:20.800 --> 00:49:25.300 so they all have some kind of connectivity, physically or otherwise.
00:49:25.300 --> 00:49:29.850 So this seems fine so long as the software that's running on servers 1, 00:49:29.850 --> 00:49:32.680 2, dot, dot, dot, n-- no matter what language we're using, 00:49:32.680 --> 00:49:36.330 whether it's Java or Python or PHP or C# or something else-- 00:49:36.330 --> 00:49:40.740 so long as those servers can talk to, via the network, this database, 00:49:40.740 --> 00:49:41.880 that's great. 00:49:41.880 --> 00:49:44.062 They can all save their data to the same place, 00:49:44.062 --> 00:49:46.270 and they can all read their data from the same place. 00:49:46.270 --> 00:49:48.360 So everything stays nicely in sync. 00:49:48.360 --> 00:49:51.030 But what's the first problem that motivated the entirety 00:49:51.030 --> 00:49:54.680 of this discussion from the outset? 00:49:54.680 --> 00:49:59.440 Well, what if one database isn't really enough? 00:49:59.440 --> 00:50:03.310 Well, we could take the approach of vertically scaling 00:50:03.310 --> 00:50:07.210 our architecture, which is another piece of jargon in this space. 00:50:07.210 --> 00:50:14.530 So vertical scaling means if your one database isn't quite up to snuff, 00:50:14.530 --> 00:50:18.670 and you're running low on disk space or capacity because the number 00:50:18.670 --> 00:50:22.360 of requests per second is, of course, limited, you know what you can do? 00:50:22.360 --> 00:50:30.310 You can go ahead and disconnect this one and go ahead and put in a bigger one, 00:50:30.310 --> 00:50:33.130 and therefore increase your capacity. 00:50:33.130 --> 00:50:36.640 And vertical scaling means to really pay more money 00:50:36.640 --> 00:50:39.640 or get something higher end, a higher, more premium 00:50:39.640 --> 00:50:42.730 model, a more expensive model that's got more disk space and more RAM 00:50:42.730 --> 00:50:44.750 and a faster CPU or more CPUs.
00:50:44.750 --> 00:50:46.720 So you just throw hardware at the problem-- 00:50:46.720 --> 00:50:50.680 not in the sense of multiple servers, but just one bigger and better server. 00:50:50.680 --> 00:50:52.117 But what are the challenges here? 00:50:52.117 --> 00:50:53.950 Well, if you've ever bought a home computer, 00:50:53.950 --> 00:50:57.220 odds are whether it's been on Dell's site or Microsoft's or Apple's 00:50:57.220 --> 00:51:00.310 or the like, you often have this good, better, best thing 00:51:00.310 --> 00:51:04.090 where, for the top of the line laptop or desktop, 00:51:04.090 --> 00:51:06.640 you're going to be paying through the roof-- 00:51:06.640 --> 00:51:08.620 through the nose, so to speak. 00:51:08.620 --> 00:51:11.470 You're going to be paying a premium for that top of the line model. 00:51:11.470 --> 00:51:14.178 But you might actually be able to save a decent number of dollars 00:51:14.178 --> 00:51:17.470 by going for the second best or the third best, 00:51:17.470 --> 00:51:21.340 because the marginal gains of each additional dollar 00:51:21.340 --> 00:51:22.695 really aren't all that much. 00:51:22.695 --> 00:51:24.820 Because for marketing reasons, they know that there 00:51:24.820 --> 00:51:26.778 might be some people out there that will always 00:51:26.778 --> 00:51:28.330 pay top dollar for the fastest one. 00:51:28.330 --> 00:51:30.163 But just because you're paying twice as much 00:51:30.163 --> 00:51:33.650 doesn't mean the laptop is going to be twice as good, for instance. 00:51:33.650 --> 00:51:37.120 So this is to say to vertically scale your database, you might end up 00:51:37.120 --> 00:51:40.810 paying, through the nose, for some very expensive hardware just 00:51:40.810 --> 00:51:43.820 to eke out some more performance. 00:51:43.820 --> 00:51:45.610 But that's not even the biggest problem.
00:51:45.610 --> 00:51:48.350 The most fundamental problem is at the end of the day, 00:51:48.350 --> 00:51:53.050 there is a top-of-the-line server for your database that only can support 00:51:53.050 --> 00:51:56.020 a finite number of database connections at a time, 00:51:56.020 --> 00:51:58.480 or a finite number of reads or writes, so to speak, 00:51:58.480 --> 00:52:00.357 saving and reading from the database. 00:52:00.357 --> 00:52:03.190 So at some point or other, it doesn't matter how much money you have 00:52:03.190 --> 00:52:05.530 or how willing you are to throw hardware at the problem. 00:52:05.530 --> 00:52:10.070 There exists no server that can handle more users than you currently have. 00:52:10.070 --> 00:52:14.320 So at some point, you actually have to put away your wallet 00:52:14.320 --> 00:52:17.680 and put back on the engineering hat alone and figure out 00:52:17.680 --> 00:52:24.220 how to not vertically scale, but horizontally scale your architecture. 00:52:24.220 --> 00:52:29.860 And by this, I mean actually introducing not just one big, fancy server, 00:52:29.860 --> 00:52:33.112 but two or more maybe smaller, cheaper servers. 00:52:33.112 --> 00:52:35.320 In fact, one of the things that companies like Google 00:52:35.320 --> 00:52:37.810 were especially good at early on was using 00:52:37.810 --> 00:52:42.910 off-the-shelf, inexpensive hardware and building supercomputers out of them, 00:52:42.910 --> 00:52:44.770 but much more economically than they might 00:52:44.770 --> 00:52:46.450 have had they gone top of the line everywhere, 00:52:46.450 --> 00:52:48.200 even though that would mean fewer servers. 
00:52:48.200 --> 00:52:50.696 Better to get more cheaper servers and somehow 00:52:50.696 --> 00:52:53.320 figure out how to interconnect them and write the software that 00:52:53.320 --> 00:52:56.410 lets them all be useful simultaneously so that we can instead 00:52:56.410 --> 00:52:59.620 have a picture that looks a bit more like this, with maybe 00:52:59.620 --> 00:53:03.310 a pair of databases in the picture now. 00:53:03.310 --> 00:53:05.650 Of course, we've now created that same problem 00:53:05.650 --> 00:53:09.130 that we had earlier about where does the data go. 00:53:09.130 --> 00:53:10.990 Where does the traffic or the users flow, 00:53:10.990 --> 00:53:14.960 especially now where we have one on the left and one on the right? 00:53:14.960 --> 00:53:18.460 So there's a couple of solutions here, but there are some different problems 00:53:18.460 --> 00:53:19.900 that arise with databases. 00:53:19.900 --> 00:53:27.600 If we very simply put a load balancer in here, LB, and route traffic uniformly-- 00:53:27.600 --> 00:53:30.010 say, to the left or to the right-- 00:53:30.010 --> 00:53:32.180 that's probably not the best thing. 00:53:32.180 --> 00:53:34.960 Because then you're going to end up with a world where you're 00:53:34.960 --> 00:53:39.460 saving some data for a user here and some data for a user 00:53:39.460 --> 00:53:42.757 here just by chance, because you're using round robin, so to speak, 00:53:42.757 --> 00:53:45.340 or just some probabilistic heuristic where some of the traffic 00:53:45.340 --> 00:53:47.140 goes this way, some of the traffic goes that way. 00:53:47.140 --> 00:53:48.160 And that's not so good. 00:53:48.160 --> 00:53:48.670 OK. 00:53:48.670 --> 00:53:54.820 But we could solve that by somehow making sure that if this user, User A, 00:53:54.820 --> 00:54:00.400 visits my web site, I should always send him or her to the same database. 00:54:00.400 --> 00:54:02.440 And you can do this in a couple of ways. 
00:54:02.440 --> 00:54:04.510 You can enforce some notion of stickiness, 00:54:04.510 --> 00:54:07.180 so to speak, whereby you somehow notice that, oh, this is 00:54:07.180 --> 00:54:09.010 User A. We've seen him or her before. 00:54:09.010 --> 00:54:12.130 Let's make sure we send him to this database on the left 00:54:12.130 --> 00:54:14.020 and not the one on the right. 00:54:14.020 --> 00:54:18.070 Or you can more formally use a process known as sharding. 00:54:18.070 --> 00:54:20.380 In fact, this is very common early on in databases, 00:54:20.380 --> 00:54:24.010 and even in websites like Facebook, where you have so many users 00:54:24.010 --> 00:54:26.830 that you need to start splitting them across multiple databases. 00:54:26.830 --> 00:54:28.360 But gosh, how to do that? 00:54:28.360 --> 00:54:31.300 Back in the earliest days of Facebook, what they might have done 00:54:31.300 --> 00:54:35.275 was put all Harvard users on one database, all MIT users on another, 00:54:35.275 --> 00:54:37.600 all BU users on another, and so forth. 00:54:37.600 --> 00:54:40.210 Because Facebook, as you may recall, started scaling out 00:54:40.210 --> 00:54:41.620 initially to disparate schools. 00:54:41.620 --> 00:54:44.410 That was a wonderful opportunity to shard 00:54:44.410 --> 00:54:49.930 their data by putting similar users in their respective databases. 00:54:49.930 --> 00:54:51.700 And at the time, I think you couldn't even 00:54:51.700 --> 00:54:54.240 be friends with people in other schools, at least very early 00:54:54.240 --> 00:54:58.050 on, because those databases, presumably, were independent, 00:54:58.050 --> 00:55:01.440 or certainly could have been topologically. 00:55:01.440 --> 00:55:04.530 Or you might do something more simple that doesn't create 00:55:04.530 --> 00:55:06.420 some problems like isolation there.
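To make stickiness and sharding a bit more concrete, here is a minimal sketch in Python of the one idea they share: given a user, always pick the same database. The database names, the school table, and both function names are hypothetical, made up purely for illustration, and a real deployment would likely be more involved.

```python
import hashlib

# Hypothetical database names, invented for this sketch.
DATABASES = ["db-left", "db-right"]

# Hypothetical directory for sharding by school.
SCHOOL_TO_DATABASE = {
    "harvard": "db-harvard",
    "mit": "db-mit",
    "bu": "db-bu",
}


def shard_by_school(school):
    """Return the database that holds every user from the given school."""
    return SCHOOL_TO_DATABASE[school.lower()]


def shard_by_user_id(user_id, databases=DATABASES):
    """Map a user ID to one database, the same one on every call.

    A cryptographic digest is used instead of Python's built-in hash(),
    because hash() of a string can differ from one process to the next.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(databases)
    return databases[index]


if __name__ == "__main__":
    print(shard_by_school("MIT"))
    # The same user lands on the same database every time.
    print(shard_by_user_id("user-a"), shard_by_user_id("user-a"))
```

One thing this sketch glosses over: if the number of databases changes, the modulo sends most users to a different database, so their data would have to be moved.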
00:55:06.420 --> 00:55:10.320 Maybe all of your users whose last names start with A go on one server, 00:55:10.320 --> 00:55:12.400 and all of your users whose names start with B 00:55:12.400 --> 00:55:14.170 go on another server, and so forth. 00:55:14.170 --> 00:55:18.330 So you can almost hash your users, to borrow terminology from hash tables, 00:55:18.330 --> 00:55:20.970 and decide where to put that data. 00:55:20.970 --> 00:55:24.690 Of course, that does not help with backups or redundancy. 00:55:24.690 --> 00:55:28.470 Because if you're putting all of your A names here and all of your B names 00:55:28.470 --> 00:55:31.230 here, what happens, god forbid, if one of the servers goes down? 00:55:31.230 --> 00:55:33.600 You've lost half of your customers. 00:55:33.600 --> 00:55:36.390 So it would seem that no matter how you balance the load, 00:55:36.390 --> 00:55:39.850 you really want to maintain duplicates of data. 00:55:39.850 --> 00:55:42.870 And so there's a few different ways people solve this. 00:55:42.870 --> 00:55:45.510 In fact, let me go ahead and temporarily go 00:55:45.510 --> 00:55:50.940 back to that first model, where we had a really fancy, bigger database 00:55:50.940 --> 00:55:53.670 that I'll deliberately draw as pretty big. 00:55:53.670 --> 00:55:57.450 And this is big in the sense that it can respond to requests quickly 00:55:57.450 --> 00:55:59.280 and it can store a lot of data. 00:55:59.280 --> 00:56:03.630 This might be generally called our primary or our master database. 00:56:03.630 --> 00:56:06.420 And it's where our data goes to live long term. 00:56:06.420 --> 00:56:09.900 It's where data is written to, so to speak, and could also be read from. 00:56:09.900 --> 00:56:13.380 But if we're going to bump up against some limit of how much work 00:56:13.380 --> 00:56:15.450 this database can do at once, it would be 00:56:15.450 --> 00:56:18.960 nice to have some secondary servers or tertiary servers.
00:56:18.960 --> 00:56:24.240 So a very common paradigm would be to use this primary database for writes-- 00:56:24.240 --> 00:56:25.860 we'll abbreviate it w-- 00:56:25.860 --> 00:56:29.790 and then also have maybe a couple of smaller databases, or even 00:56:29.790 --> 00:56:35.610 the same size databases, that are meant for reads, abbreviated R. 00:56:35.610 --> 00:56:39.600 And so long as these databases are somehow talking to one another, 00:56:39.600 --> 00:56:41.400 this topology will just work. 00:56:41.400 --> 00:56:43.410 This is a feature known as replication. 00:56:43.410 --> 00:56:46.650 So long as the databases are configured in such a way 00:56:46.650 --> 00:56:50.310 that any time data is written to the primary database or the master 00:56:50.310 --> 00:56:55.440 database, that data gets replicated to any replicas, as they're called. 00:56:55.440 --> 00:57:02.539 Meanwhile, servers 1, 2, and n should also be able to talk to these replicas. 00:57:02.539 --> 00:57:05.580 And if your code is smart enough-- and you would have to think about this 00:57:05.580 --> 00:57:10.260 and design this into your codebase-- you could ensure that any time you 00:57:10.260 --> 00:57:15.270 read data from a database, it comes from one, or really any, of your replicas, 00:57:15.270 --> 00:57:18.240 replicas in the sense that they are meant to have duplicate data. 00:57:18.240 --> 00:57:22.170 But anytime you write data-- a SQL INSERT or UPDATE or DELETE, 00:57:22.170 --> 00:57:24.150 as opposed to a SQL SELECT-- 00:57:24.150 --> 00:57:28.710 you only send your write operations to the primary or master database 00:57:28.710 --> 00:57:32.026 and leave it to it to then replicate it to the read replicas. 00:57:32.026 --> 00:57:33.900 Now, of course, there are some problems here. 00:57:33.900 --> 00:57:35.070 There's some latency, potentially. 00:57:35.070 --> 00:57:36.320 Maybe it takes a split second. 
00:57:36.320 --> 00:57:39.190 Maybe it takes a couple seconds for that data to replicate. 00:57:39.190 --> 00:57:43.440 So things might not appear to be updated instantaneously. 00:57:43.440 --> 00:57:48.900 But you have now a very scalable model in that if you have the money to spend, 00:57:48.900 --> 00:57:54.000 you can even have more read replicas and have even more and more read capacity. 00:57:54.000 --> 00:57:58.080 Of course, you're going to eventually bump up against a limit on your writes, 00:57:58.080 --> 00:58:00.840 at which point we need to introduce another solution. 00:58:00.840 --> 00:58:03.690 But again, this is a very incremental approach. 00:58:03.690 --> 00:58:06.890 And we can throw a little bit of money at the problem each time 00:58:06.890 --> 00:58:09.090 and a little bit of engineering wherewithal 00:58:09.090 --> 00:58:11.491 in order to at least get us over that next ledge, which 00:58:11.491 --> 00:58:14.490 is super important, certainly, when you're first building your business. 00:58:14.490 --> 00:58:17.582 If you don't necessarily have the resources to go all in on things, 00:58:17.582 --> 00:58:19.290 you at least want to get over this hurdle 00:58:19.290 --> 00:58:24.340 or at least build in some capacity for the next load of users. 00:58:24.340 --> 00:58:27.300 So what if we run out of capacity, though, 00:58:27.300 --> 00:58:31.090 with that writable server, the master database, so to speak? 00:58:31.090 --> 00:58:33.270 We need to be a little more clever. 00:58:33.270 --> 00:58:37.860 And it turns out we can borrow this idea of these horizontal arrows 00:58:37.860 --> 00:58:43.110 here to replicate our data, but for a slightly different purpose. 00:58:43.110 --> 00:58:47.400 We could still have a pretty souped up writable database. 00:58:47.400 --> 00:58:51.750 But we could have another one, maybe identical in its specs, writable.
00:58:51.750 --> 00:58:54.960 But somehow, these things need to be able to synchronize with themselves. 00:58:54.960 --> 00:58:57.920 And maybe there's still some read replicas over here-- 00:58:57.920 --> 00:59:01.200 R for read, and another one over here, R for read. 00:59:01.200 --> 00:59:03.750 And these are all somehow interconnected as well. 00:59:03.750 --> 00:59:07.860 But you can have what's called master master replication, whereby 00:59:07.860 --> 00:59:12.144 your server's code writes to one of these servers. 00:59:12.144 --> 00:59:13.560 And maybe it's either of them now. 00:59:13.560 --> 00:59:15.840 Maybe the load balancer actually does send some of the writes 00:59:15.840 --> 00:59:17.423 this way, some of the writes this way. 00:59:17.423 --> 00:59:20.190 But the master database, the writable ones now, 00:59:20.190 --> 00:59:24.534 are configured, in software, to replicate horizontally, so to speak. 00:59:24.534 --> 00:59:26.700 So here too, you might have a little bit of latency. 00:59:26.700 --> 00:59:28.491 It might take a few milliseconds or seconds 00:59:28.491 --> 00:59:30.060 for the data to actually replicate. 00:59:30.060 --> 00:59:34.800 But at least now we've doubled the capacity for our writes 00:59:34.800 --> 00:59:38.560 so as to handle twice as many writable operations. 00:59:38.560 --> 00:59:41.790 And we can continue to hang more and more read replicas off of these 00:59:41.790 --> 00:59:46.740 if you want in order to handle more and more users. 00:59:46.740 --> 00:59:50.950 And so this is the challenge and, dare say, the fun of engineering 00:59:50.950 --> 00:59:54.192 architecturally-- understanding some of these basic building blocks. 
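As one of those building blocks, here is a minimal sketch in Python of the read/write split described a moment ago: SQL SELECTs go to a read replica, and everything else goes to the primary. The class name QueryRouter and the server names are hypothetical, and looking only at the first keyword is a simplifying assumption, not how any particular database driver does it.

```python
import itertools


class QueryRouter:
    """Pick which database server should receive a SQL statement."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through the replicas round robin so reads are spread out.
        self._replicas = itertools.cycle(replicas)

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        words = sql.split()
        keyword = words[0].upper() if words else ""
        if keyword == "SELECT":
            return next(self._replicas)
        # INSERT, UPDATE, DELETE, and anything unrecognized go to the
        # primary, which then replicates the change to the replicas.
        return self.primary


if __name__ == "__main__":
    router = QueryRouter("primary", ["replica-1", "replica-2"])
    print(router.route("SELECT * FROM users"))
    print(router.route("INSERT INTO users (name) VALUES ('A')"))
```

Sending anything unrecognized to the primary is the cautious default here, since a write that lands on a read replica would never be replicated to the others. The sketch also ignores replication lag, so a read issued right after a write might still see the old data.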
00:59:54.192 --> 00:59:56.650 And even if you might not know the particular manufacturers 00:59:56.650 --> 00:59:59.140 or how you physically configure the servers, 00:59:59.140 --> 01:00:02.300 or how in software you configure these servers, at the end of the day, 01:00:02.300 --> 01:00:05.950 these really are just puzzle pieces that can somehow be interlocked. 01:00:05.950 --> 01:00:09.280 And these puzzle pieces can be used to solve more and more interesting 01:00:09.280 --> 01:00:10.090 problems. 01:00:10.090 --> 01:00:15.760 But to our discussion of PaaS and Software as a Service and Infrastructure 01:00:15.760 --> 01:00:19.490 as a Service, there's also these different layers of abstraction. 01:00:19.490 --> 01:00:22.810 And so thematic throughout this in all of our discussions 01:00:22.810 --> 01:00:23.850 has been this layering. 01:00:23.850 --> 01:00:27.802 Indeed, we started, really, down here with those zeros and ones and bits, 01:00:27.802 --> 01:00:30.010 and very quickly went to ASCII, and very quickly went 01:00:30.010 --> 01:00:33.010 to colors and images and videos and so forth. 01:00:33.010 --> 01:00:36.042 Because once you understand some of those ingredients or puzzle pieces, 01:00:36.042 --> 01:00:37.750 you can build something more interesting. 01:00:37.750 --> 01:00:39.790 And then you can slap a name on it-- 01:00:39.790 --> 01:00:43.392 sometimes cryptic, like IaaS, or PaaS, or SaaS. 01:00:43.392 --> 01:00:45.100 But at the end of the day, those are just 01:00:45.100 --> 01:00:48.580 labels that describe, really, black boxes, inside of which 01:00:48.580 --> 01:00:52.360 is a decent amount of complexity, a clever amount of engineering, 01:00:52.360 --> 01:00:55.270 but ultimately, a solution to a problem.
01:00:55.270 --> 01:00:58.810 And so in cloud computing, we really do have this catch-all phrase that's 01:00:58.810 --> 01:01:03.430 referring to a whole class of solutions to problems that ultimately are all 01:01:03.430 --> 01:01:07.090 about getting one's business or getting one's personal website 01:01:07.090 --> 01:01:11.140 out on the internet for users to access, whether via laptops or desktops 01:01:11.140 --> 01:01:12.820 or mobile devices and more. 01:01:12.820 --> 01:01:14.710 So at the end of the day, what is the cloud? 01:01:14.710 --> 01:01:16.240 It's this evolving definition. 01:01:16.240 --> 01:01:21.230 It's this evolving class of services that just continues to grow. 01:01:21.230 --> 01:01:23.840 But each of those services is solving a problem. 01:01:23.840 --> 01:01:30.880 Each of those problems derives from plugging one hole in a leaky hose, 01:01:30.880 --> 01:01:33.580 seeing another one spring up, and then addressing that one, 01:01:33.580 --> 01:01:36.430 and then layering on top of those solutions these abstractions, 01:01:36.430 --> 01:01:39.700 and ultimately some marketing speak, like cloud computing itself, 01:01:39.700 --> 01:01:43.120 so that you can build, out of these more sophisticated puzzle pieces, 01:01:43.120 --> 01:01:45.640 bigger and better solutions to actual problems 01:01:45.640 --> 01:01:49.860 you have when you're trying to build your own site.