1 00:00:00,000 --> 00:00:10,360 2 00:00:10,360 --> 00:00:12,340 Cloud computing-- it's this term that rather 3 00:00:12,340 --> 00:00:14,330 swept onto the scene in recent years. 4 00:00:14,330 --> 00:00:17,230 And it sounds like it's some new and trendy technology. 5 00:00:17,230 --> 00:00:19,769 But in reality, it's really just a very nice packaging 6 00:00:19,769 --> 00:00:22,060 up of a whole number of technologies that have actually 7 00:00:22,060 --> 00:00:23,620 been with us for some time. 8 00:00:23,620 --> 00:00:26,710 In fact, cloud computing, in its simplest form, 9 00:00:26,710 --> 00:00:29,800 can really be thought of as just outsourcing 10 00:00:29,800 --> 00:00:33,250 the hosting of your applications and really outsourcing 11 00:00:33,250 --> 00:00:36,790 the hosting of your physical servers to someone else-- put another way, 12 00:00:36,790 --> 00:00:41,140 renting space and renting time on someone else's computers. 13 00:00:41,140 --> 00:00:45,520 But these days, we just have so much computational capabilities-- that is, 14 00:00:45,520 --> 00:00:50,650 our computers are so fast, our CPUs are so many, and we have so much RAM-- 15 00:00:50,650 --> 00:00:53,500 that new and fancier technologies have lent themselves 16 00:00:53,500 --> 00:00:56,590 to this trend of hosting all the more software 17 00:00:56,590 --> 00:01:00,520 and putting all of the more hardware off-site in the so-called cloud 18 00:01:00,520 --> 00:01:05,170 so that companies, both big and small, no longer need 19 00:01:05,170 --> 00:01:09,010 to host their own physical hardware or even a whole number of roles 20 00:01:09,010 --> 00:01:10,390 in their own local companies. 
21 00:01:10,390 --> 00:01:13,120 And so what we'll do now is dive into cloud computing, 22 00:01:13,120 --> 00:01:15,280 look at some of the problems it solves, look 23 00:01:15,280 --> 00:01:18,190 at some of the opportunities it affords, but ultimately, 24 00:01:18,190 --> 00:01:20,890 take a look from the ground up at what's underneath the hood 25 00:01:20,890 --> 00:01:23,080 here so that by the end of this, we have a better 26 00:01:23,080 --> 00:01:25,900 understanding of what the cloud is, why it is useful, 27 00:01:25,900 --> 00:01:28,030 and what it actually is not. 28 00:01:28,030 --> 00:01:31,490 So with that said, let's start with a simple scenario. 29 00:01:31,490 --> 00:01:34,990 Of course, the cloud perhaps derives its origins 30 00:01:34,990 --> 00:01:37,630 from how the internet, for some time, was drawn, 31 00:01:37,630 --> 00:01:40,510 which was just this big, nebulous cloud, in that it doesn't really 32 00:01:40,510 --> 00:01:41,950 matter what's inside that cloud. 33 00:01:41,950 --> 00:01:46,330 Although at this point, you most surely appreciate that inside of this cloud 34 00:01:46,330 --> 00:01:49,450 are things like routers, and running through those routers 35 00:01:49,450 --> 00:01:51,937 are packets, both TCP/IP and the like. 36 00:01:51,937 --> 00:01:53,770 And underneath the hood, then, of this cloud 37 00:01:53,770 --> 00:01:57,520 is some transport mechanism that gets data from point A to point B. 38 00:01:57,520 --> 00:02:00,550 So what might those point A's and point B's be? 39 00:02:00,550 --> 00:02:04,180 Well, if this here is my little, old laptop, connected somehow 40 00:02:04,180 --> 00:02:07,510 to the internet here, and maybe down here there 41 00:02:07,510 --> 00:02:11,500 is some web server on which lives a whole bunch of web pages-- 42 00:02:11,500 --> 00:02:12,550 maybe it's my email. 43 00:02:12,550 --> 00:02:13,840 Maybe it's the day's news.
44 00:02:13,840 --> 00:02:16,240 Maybe it's some social media site or the like. 45 00:02:16,240 --> 00:02:21,190 I, at point A, want to somehow connect to point B down here. 46 00:02:21,190 --> 00:02:24,790 Now, it turns out it's not all that hard to get a website up 47 00:02:24,790 --> 00:02:26,230 and running on the internet. 48 00:02:26,230 --> 00:02:28,540 You can, of course, use any number of languages. 49 00:02:28,540 --> 00:02:30,760 You can use any number of databases. 50 00:02:30,760 --> 00:02:34,270 And you can do it with relatively little experience, 51 00:02:34,270 --> 00:02:36,320 just getting something on the internet. 52 00:02:36,320 --> 00:02:39,250 In fact, it's not all that hard, relatively speaking, 53 00:02:39,250 --> 00:02:41,740 to get a prototype of your application or even 54 00:02:41,740 --> 00:02:44,590 your first version of your business up and running. 55 00:02:44,590 --> 00:02:50,200 But things start to get hard quickly, especially if you have some success. 56 00:02:50,200 --> 00:02:53,320 Indeed, a good problem to have is that you have so many customers and so 57 00:02:53,320 --> 00:02:57,100 many users hitting your websites that you can't actually 58 00:02:57,100 --> 00:02:58,450 handle all of the load. 59 00:02:58,450 --> 00:03:01,365 Now, it's a good problem in the sense that business is booming. 60 00:03:01,365 --> 00:03:03,490 But it's, of course, an actual problem in the sense 61 00:03:03,490 --> 00:03:06,160 that your customers aren't going to be able to visit your web site 62 00:03:06,160 --> 00:03:08,110 and buy whatever it is you're selling or read 63 00:03:08,110 --> 00:03:12,520 whatever it is you're posting if your servers can't actually handle the load. 64 00:03:12,520 --> 00:03:16,780 And by load, I simply mean the number of users per minute or per unit of time 65 00:03:16,780 --> 00:03:19,390 that your website is actually experiencing. 
66 00:03:19,390 --> 00:03:21,670 And its capacity, meanwhile, would be the number 67 00:03:21,670 --> 00:03:23,500 of users it can actually support. 68 00:03:23,500 --> 00:03:25,760 Now, why are there these limits in the first place? 69 00:03:25,760 --> 00:03:28,030 Well, you may recall that inside of a computer 70 00:03:28,030 --> 00:03:30,640 is a CPU, the brains of that computer. 71 00:03:30,640 --> 00:03:33,580 And inside of a computer is some memory, like RAM. 72 00:03:33,580 --> 00:03:36,940 And there might be some longer-term storage, like hard disk space. 73 00:03:36,940 --> 00:03:41,320 At the end of the day, all of those resources and more are finite. 74 00:03:41,320 --> 00:03:44,290 You can only fit so much physical hardware in a computer. 75 00:03:44,290 --> 00:03:47,500 Humans have only been able to pack so many resources 76 00:03:47,500 --> 00:03:49,894 into the physical space of a computer. 77 00:03:49,894 --> 00:03:51,310 And then, of course, there's cost. 78 00:03:51,310 --> 00:03:54,920 You might be able to only afford so much computing capacity. 79 00:03:54,920 --> 00:03:58,800 So if a computer can only do some number of things per second, 80 00:03:58,800 --> 00:04:02,154 there is surely an upper bound on how many people can visit your web 81 00:04:02,154 --> 00:04:05,320 site, how many people can add things to their shopping cart, how many people 82 00:04:05,320 --> 00:04:07,330 can check out with their credit card. 83 00:04:07,330 --> 00:04:10,940 Because you only have, at the end of the day, a finite number of resources. 84 00:04:10,940 --> 00:04:13,060 Now, what does that mean in real terms? 85 00:04:13,060 --> 00:04:16,540 Well, maybe your web server can handle 100 users per minute. 86 00:04:16,540 --> 00:04:18,700 Maybe it can handle 1,000 users per minute. 87 00:04:18,700 --> 00:04:22,240 Maybe it can handle 1,000 users per second, or even much more than that.
88 00:04:22,240 --> 00:04:26,887 It really depends on the specifications of your hardware-- how much RAM, 89 00:04:26,887 --> 00:04:29,470 how much CPU and so forth that you actually have-- and it also 90 00:04:29,470 --> 00:04:33,430 depends, to some extent, on how well-written your code is and how fast 91 00:04:33,430 --> 00:04:37,280 or how slow your code, your software actually runs. 92 00:04:37,280 --> 00:04:39,700 So these are knobs that can ultimately be turned. 93 00:04:39,700 --> 00:04:42,310 And through testing, you can figure this out in advance 94 00:04:42,310 --> 00:04:46,470 by simulating traffic in order to estimate exactly how many users you 95 00:04:46,470 --> 00:04:49,000 might be able to handle at a time. 96 00:04:49,000 --> 00:04:53,200 Now, the relevance to today is that the cloud, so to speak, 97 00:04:53,200 --> 00:04:56,200 allows us to start to solve some of these problems 98 00:04:56,200 --> 00:04:59,890 and also allows us to start abstracting away the solutions to some 99 00:04:59,890 --> 00:05:00,640 of these problems. 100 00:05:00,640 --> 00:05:02,360 Well, let's see what this actually means. 101 00:05:02,360 --> 00:05:04,449 So at some point or other-- 102 00:05:04,449 --> 00:05:06,490 especially when it's not just my laptop, but it's 103 00:05:06,490 --> 00:05:10,160 like 1,000 laptops, or 10,000 laptops and desktops and phones and more 104 00:05:10,160 --> 00:05:12,860 that are somehow trying to access my server here-- 105 00:05:12,860 --> 00:05:17,000 at some point, we hit that upper limit whereby no more users can 106 00:05:17,000 --> 00:05:19,200 fit onto my web site per unit of time. 107 00:05:19,200 --> 00:05:22,070 So what is the symptom that my users experience at that point 108 00:05:22,070 --> 00:05:23,660 if I'm over capacity? 109 00:05:23,660 --> 00:05:26,210 Well, they might see an error message of some sort.
110 00:05:26,210 --> 00:05:28,550 They might just experience a spinning icon 111 00:05:28,550 --> 00:05:30,662 because the website is super slow to respond. 112 00:05:30,662 --> 00:05:33,120 And maybe it does respond, but maybe it's 10 seconds later. 113 00:05:33,120 --> 00:05:36,890 So at the end of the day, they either have a bad experience or no experience 114 00:05:36,890 --> 00:05:42,480 whatsoever, because my server can only handle so many requests at a time. 115 00:05:42,480 --> 00:05:44,570 So what do you do to solve this problem? 116 00:05:44,570 --> 00:05:48,500 If one server is not enough, maybe the most intuitive solution is, well, 117 00:05:48,500 --> 00:05:51,500 if one server is not giving me enough headroom, 118 00:05:51,500 --> 00:05:53,490 why don't I just have two servers? 119 00:05:53,490 --> 00:05:54,890 So let's go ahead and do that. 120 00:05:54,890 --> 00:05:58,370 Instead of having just one server, let's go ahead and have two. 121 00:05:58,370 --> 00:06:01,790 And let me propose that on the second server, it's the exact same software. 122 00:06:01,790 --> 00:06:05,150 So whatever code I've written, in whatever language it's written, 123 00:06:05,150 --> 00:06:08,600 I just have copies of my web site on both the original server 124 00:06:08,600 --> 00:06:10,520 and the second server. 125 00:06:10,520 --> 00:06:14,270 Now I've solved the problem in the simple sense 126 00:06:14,270 --> 00:06:16,010 that I've doubled my capacity. 127 00:06:16,010 --> 00:06:18,570 If one server can handle 1,000 people per second, 128 00:06:18,570 --> 00:06:21,714 well, then surely two servers can handle 2,000 people per second, 129 00:06:21,714 --> 00:06:22,880 so I've doubled my capacity. 130 00:06:22,880 --> 00:06:23,870 So that's good. 131 00:06:23,870 --> 00:06:26,000 I've hopefully solved the problem. 132 00:06:26,000 --> 00:06:27,929 But it's not quite as simple as that. 
133 00:06:27,929 --> 00:06:30,845 At least pictorially, I'm still pointing at just one of those servers, 134 00:06:30,845 --> 00:06:33,890 so we're going to have to clean up this picture alone and somehow 135 00:06:33,890 --> 00:06:36,680 figure out how to get users-- 136 00:06:36,680 --> 00:06:38,450 or more generally, traffic-- 137 00:06:38,450 --> 00:06:40,820 to both of these servers. 138 00:06:40,820 --> 00:06:43,740 I could just naively draw an arrow like this. 139 00:06:43,740 --> 00:06:45,380 But what does that actually mean? 140 00:06:45,380 --> 00:06:47,780 We don't want to abstract away so much of the detail 141 00:06:47,780 --> 00:06:50,460 that we're ignoring this problem. 142 00:06:50,460 --> 00:06:54,860 How do we implement this notion of choosing between left arrow and right 143 00:06:54,860 --> 00:06:55,580 arrow? 144 00:06:55,580 --> 00:06:59,210 Well, let's consider what our solutions might be. 145 00:06:59,210 --> 00:07:02,930 If a user, like me on my laptop, is trying to visit this web site-- 146 00:07:02,930 --> 00:07:06,740 and the web site, ideally, is going to live at something like example.com, 147 00:07:06,740 --> 00:07:10,122 or facebook.com, or gmail.com, or whatever-- 148 00:07:10,122 --> 00:07:12,830 I don't want to have to broadcast different names for my servers. 149 00:07:12,830 --> 00:07:14,910 And you might actually notice this on the internet. 
150 00:07:14,910 --> 00:07:18,160 You might notice, if you start noticing the URLs of websites you're visiting-- 151 00:07:18,160 --> 00:07:21,290 especially for certain older, stodgier companies who haven't necessarily 152 00:07:21,290 --> 00:07:23,240 implemented this in the most modern way-- 153 00:07:23,240 --> 00:07:27,239 you might find yourself not just at www.something.com, 154 00:07:27,239 --> 00:07:29,780 but if you look closely, you might find yourself occasionally 155 00:07:29,780 --> 00:07:35,310 at www1.something.com, www2.something.com, 156 00:07:35,310 --> 00:07:38,590 or even www13.something.com. 157 00:07:38,590 --> 00:07:43,670 Which is to say that some companies appear to solve this problem by just 158 00:07:43,670 --> 00:07:45,140 giving different names-- 159 00:07:45,140 --> 00:07:48,920 similar names, but different names-- to their two servers, three servers, 160 00:07:48,920 --> 00:07:51,200 13 servers, or however many they have. 161 00:07:51,200 --> 00:07:54,770 And then they somehow redirect users from their main domain 162 00:07:54,770 --> 00:08:00,440 name, www.something.com, to any one of those two or three or 13 servers. 163 00:08:00,440 --> 00:08:01,919 But this isn't very elegant. 164 00:08:01,919 --> 00:08:03,710 The marketing folks would surely hate this, 165 00:08:03,710 --> 00:08:06,770 because you're trying to build some brand recognition around your URL. 166 00:08:06,770 --> 00:08:10,370 Why would you dirty it by just putting these arbitrary numbers in the URLs? 
167 00:08:10,370 --> 00:08:13,940 Plus if you fast forward a bit in this story, 168 00:08:13,940 --> 00:08:16,610 if, for some reason down the road, you get fancier, 169 00:08:16,610 --> 00:08:18,382 bigger servers that can handle more users, 170 00:08:18,382 --> 00:08:20,090 and therefore you don't need 13 of them-- 171 00:08:20,090 --> 00:08:22,220 you can get away with just six of them-- 172 00:08:22,220 --> 00:08:24,950 well, what happens if some of your customers have bookmarked, 173 00:08:24,950 --> 00:08:30,320 very reasonably, one of those older names, like www13.something.com? 174 00:08:30,320 --> 00:08:33,799 So now when they try to visit that URL, gosh, they might hit a dead end. 175 00:08:33,799 --> 00:08:35,632 So you could solve that in some other way. 176 00:08:35,632 --> 00:08:38,090 But the point is it would seem to create a problem quickly, 177 00:08:38,090 --> 00:08:40,159 and it's just a naming mess. 178 00:08:40,159 --> 00:08:44,270 Why actually bother having your users see something 179 00:08:44,270 --> 00:08:46,020 as messy as these numbered servers? 180 00:08:46,020 --> 00:08:49,400 It would be nice to do this a little more transparently. 181 00:08:49,400 --> 00:08:50,930 So how could we do this? 182 00:08:50,930 --> 00:08:54,140 Well, let me propose that we kind of need some middleman here, 183 00:08:54,140 --> 00:08:57,740 so to speak, whereby traffic comes from people like me on the internet 184 00:08:57,740 --> 00:09:00,400 and then either goes to the left or goes to the right, 185 00:09:00,400 --> 00:09:02,780 or no matter how many servers we have, goes 186 00:09:02,780 --> 00:09:05,750 to one of those actual web servers. 187 00:09:05,750 --> 00:09:09,500 So how does this middleman-- and to borrow some past terminology, 188 00:09:09,500 --> 00:09:12,320 how does this black box potentially work? 
189 00:09:12,320 --> 00:09:14,030 Well, let's consider some of the building 190 00:09:14,030 --> 00:09:18,530 blocks, some of the puzzle pieces we have technologically at our disposal 191 00:09:18,530 --> 00:09:19,400 now. 192 00:09:19,400 --> 00:09:21,770 You may recall that every server on the internet 193 00:09:21,770 --> 00:09:25,641 has an IP address, an internet protocol address, a unique address for it. 194 00:09:25,641 --> 00:09:27,890 And that's, again, a bit of a white lie, because there 195 00:09:27,890 --> 00:09:30,410 are technologies by which you can have private IP 196 00:09:30,410 --> 00:09:32,690 addresses that the outside world doesn't see. 197 00:09:32,690 --> 00:09:35,780 But let's stipulate, for today's purposes, 198 00:09:35,780 --> 00:09:38,810 that every computer on the internet certainly has an IP 199 00:09:38,810 --> 00:09:41,550 address, whether public or private. 200 00:09:41,550 --> 00:09:46,400 So maybe, just maybe, we could leverage an existing technology-- 201 00:09:46,400 --> 00:09:48,680 DNS, the Domain Name System-- 202 00:09:48,680 --> 00:09:52,880 so that rather than only return one IP address of a server 203 00:09:52,880 --> 00:09:57,470 when you look up www.something.com, we return the IP address 204 00:09:57,470 --> 00:09:59,610 of the server on the left some of the time 205 00:09:59,610 --> 00:10:02,990 or the IP address of the server on the right some of the time, 206 00:10:02,990 --> 00:10:07,160 effectively balancing our load, our traffic across the two servers. 207 00:10:07,160 --> 00:10:10,060 And in fact, if you do this 50-50, you can 208 00:10:10,060 --> 00:10:12,640 take, really, what's called a round robin approach, 209 00:10:12,640 --> 00:10:17,270 and ideally uniformly distribute your traffic across multiple servers. 210 00:10:17,270 --> 00:10:20,140 And what's nice in this model is that because you're using DNS, 211 00:10:20,140 --> 00:10:22,472 the user doesn't really notice what's going on. 
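The round robin idea described above can be sketched in a few lines of Python. This is only a toy model, not how any particular DNS server is implemented: the class name `RoundRobinResolver` and the two addresses (taken from the documentation-only range 192.0.2.0/24) are assumptions made for illustration, and real DNS servers commonly return the whole list of addresses in rotated order rather than a single one.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy stand-in for a DNS server that rotates through a name's addresses."""

    def __init__(self, records):
        # records maps a hostname to the list of server IPs behind it
        self._rotations = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, name):
        # Each lookup hands back the next address in the rotation
        return next(self._rotations[name])


# Hypothetical addresses for the "left" and "right" servers
resolver = RoundRobinResolver({"www.something.com": ["192.0.2.1", "192.0.2.2"]})
answers = [resolver.resolve("www.something.com") for _ in range(4)]
print(answers)
```

Four consecutive lookups alternate between the two addresses, so each server is handed half of the lookups. That is the 50-50 split, assuming every lookup leads to roughly the same amount of traffic.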
212 00:10:22,472 --> 00:10:24,430 At the end of the day, none of us humans really 213 00:10:24,430 --> 00:10:27,130 care what IP address we're actually going to if we visit 214 00:10:27,130 --> 00:10:29,410 Facebook.com or Gmail.com or the like. 215 00:10:29,410 --> 00:10:33,730 We just care that our computer can find that server or servers on the internet. 216 00:10:33,730 --> 00:10:38,020 So via DNS, we could, very cleverly, via this middleman here, 217 00:10:38,020 --> 00:10:42,010 which is really just going to be some third device, some separate server-- 218 00:10:42,010 --> 00:10:45,940 it, as a DNS device, could just respond to requests 219 00:10:45,940 --> 00:10:49,630 from customers with either this IP address or this IP address, 220 00:10:49,630 --> 00:10:52,840 or any number of different IP addresses. 221 00:10:52,840 --> 00:10:56,130 So does this solve the problem? 222 00:10:56,130 --> 00:10:58,020 Again, most everything in computer science 223 00:10:58,020 --> 00:11:01,200 would seem to be a tradeoff at the end of the day. 224 00:11:01,200 --> 00:11:03,690 And this seems almost too good to be true, perhaps. 225 00:11:03,690 --> 00:11:04,560 It's so simple. 226 00:11:04,560 --> 00:11:06,270 It leverages an existing technology. 227 00:11:06,270 --> 00:11:07,750 It just works. 228 00:11:07,750 --> 00:11:10,440 So what prices might we pay? 229 00:11:10,440 --> 00:11:14,325 Well, DNS, it turns out, gets cached quite a bit. 230 00:11:14,325 --> 00:11:15,450 And what does caching mean? 231 00:11:15,450 --> 00:11:18,626 Caching something means keeping some past answer-- 232 00:11:18,626 --> 00:11:20,750 or more generally, piece of information-- around so 233 00:11:20,750 --> 00:11:26,350 that you can access it more quickly the second and the third time and beyond. 
234 00:11:26,350 --> 00:11:30,360 And so computers today, Macs and PCs, as well as 235 00:11:30,360 --> 00:11:33,480 servers on the internet, other DNS servers on the internet, 236 00:11:33,480 --> 00:11:37,170 for performance reasons, will often remember the responses 237 00:11:37,170 --> 00:11:38,910 that they get from DNS servers. 238 00:11:38,910 --> 00:11:42,690 For instance, if, on my Mac, I visit Facebook.com, hypothetically 239 00:11:42,690 --> 00:11:47,370 a lot of times during the day, it's kind of stupid if my laptop, again and again 240 00:11:47,370 --> 00:11:49,620 and again and again, asks some DNS server 241 00:11:49,620 --> 00:11:52,110 for Facebook.com's IP address if it already 242 00:11:52,110 --> 00:11:55,020 asked that same question an hour ago-- or more realistically, 243 00:11:55,020 --> 00:11:57,390 two minutes ago, or something like that. 244 00:11:57,390 --> 00:12:01,590 It would be smarter if my operating system-- or even my browser, Chrome 245 00:12:01,590 --> 00:12:03,540 or Firefox or whatever I'm using-- actually 246 00:12:03,540 --> 00:12:08,210 remembers that answer for me so that my computer can just pull up that web 247 00:12:08,210 --> 00:12:13,740 site faster by skipping a step, by not wasting time asking a server again 248 00:12:13,740 --> 00:12:15,750 for the IP address of a server. 249 00:12:15,750 --> 00:12:19,860 And after all, IP addresses, it turns out, generally don't change that often. 250 00:12:19,860 --> 00:12:23,880 It's certainly possible for a company or a university or even a home user 251 00:12:23,880 --> 00:12:25,840 to change their computer's IP addresses. 252 00:12:25,840 --> 00:12:28,350 But the reality is it doesn't change all that often. 253 00:12:28,350 --> 00:12:30,490 The common case is to have the same IP address 254 00:12:30,490 --> 00:12:33,960 now as you might an hour from now, or even a day or a week or a month 255 00:12:33,960 --> 00:12:34,860 from now. 
256 00:12:34,860 --> 00:12:37,530 But the key thing is that it can change. 257 00:12:37,530 --> 00:12:41,600 And especially if you're worried about customers-- not just some personal web 258 00:12:41,600 --> 00:12:43,260 site, but you might lose business. 259 00:12:43,260 --> 00:12:45,960 You might lose orders if users can't visit your website. 260 00:12:45,960 --> 00:12:49,470 Anything that puts your server's uptime, so to speak-- 261 00:12:49,470 --> 00:12:51,480 being accessible on the internet-- at risk 262 00:12:51,480 --> 00:12:53,790 probably is worthy of some consideration. 263 00:12:53,790 --> 00:12:57,884 So let me propose, then, that just one of these servers goes offline somehow. 264 00:12:57,884 --> 00:12:58,800 Maybe it's deliberate. 265 00:12:58,800 --> 00:13:00,330 You need to do some service for it. 266 00:13:00,330 --> 00:13:03,270 Or maybe it crashed in some way, or it got unplugged somehow, 267 00:13:03,270 --> 00:13:07,530 or something went wrong such that now, one or more of your servers, 268 00:13:07,530 --> 00:13:11,940 across which you've been load balancing, no longer can talk to the internet. 269 00:13:11,940 --> 00:13:12,960 What might happen? 270 00:13:12,960 --> 00:13:15,510 Well, if some customer's Mac, like my own, 271 00:13:15,510 --> 00:13:20,310 has remembered or cached that particular server's IP address, 272 00:13:20,310 --> 00:13:21,870 that is not a good situation. 273 00:13:21,870 --> 00:13:24,150 Because your Mac or PC or whatever is going 274 00:13:24,150 --> 00:13:27,450 to now try to revisit your web site again and again 275 00:13:27,450 --> 00:13:33,660 and again at that old cached IP address that apparently can be a dead end.
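To make the caching risk concrete, here is a minimal sketch of a client-side cache that remembers each answer for a fixed time-to-live. Everything in it is an assumption for illustration: the class name `CachingClient`, the 300-second lifetime, the dictionary standing in for the DNS server, and the example addresses. Time is passed in explicitly as `now` so the behavior is easy to follow.

```python
class CachingClient:
    """Toy model of an operating system or browser that caches DNS answers."""

    def __init__(self, lookup, ttl_seconds):
        self._lookup = lookup      # function that asks the "real" DNS server
        self._ttl = ttl_seconds    # how long a remembered answer stays valid
        self._cache = {}           # hostname -> (ip, expires_at)

    def resolve(self, name, now):
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]        # skip the lookup and reuse the old answer
        ip = self._lookup(name)
        self._cache[name] = (ip, now + self._ttl)
        return ip


# Hypothetical DNS data: the name currently points at the first server
dns_table = {"www.something.com": "192.0.2.1"}
client = CachingClient(lambda name: dns_table[name], ttl_seconds=300)

first = client.resolve("www.something.com", now=0)

# The first server goes offline, so DNS is updated to point at the second one
dns_table["www.something.com"] = "192.0.2.2"

stale = client.resolve("www.something.com", now=60)   # cache has not expired yet
fresh = client.resolve("www.something.com", now=301)  # cache expired, asks again
print(first, stale, fresh)
```

Between the DNS update and the cache expiring, this client keeps going back to the address of the server that is offline, which is exactly the dead end described above.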
276 00:13:33,660 --> 00:13:38,370 And so even though you still have servers that could potentially 277 00:13:38,370 --> 00:13:41,550 handle that customer's request, that customer's order, 278 00:13:41,550 --> 00:13:44,520 that customer's desire to check out, he or she 279 00:13:44,520 --> 00:13:46,650 really is still not going to be able to visit 280 00:13:46,650 --> 00:13:49,290 the website unless that cache expires. 281 00:13:49,290 --> 00:13:52,230 Maybe they reboot their computer so that the cache forcibly expires. 282 00:13:52,230 --> 00:13:55,140 Maybe they just wait some amount of time so that that IP address 283 00:13:55,140 --> 00:13:57,510 is forgotten by the browser or by the operating system 284 00:13:57,510 --> 00:14:01,950 or by some other DNS server until the new, available IP 285 00:14:01,950 --> 00:14:03,690 addresses are picked up instead. 286 00:14:03,690 --> 00:14:04,830 But there is that risk. 287 00:14:04,830 --> 00:14:07,470 And I would argue that this risk is even higher especially 288 00:14:07,470 --> 00:14:11,910 for companies that might be considering moving their infrastructure from one 289 00:14:11,910 --> 00:14:13,200 service to another. 290 00:14:13,200 --> 00:14:16,890 If you're deliberately going to move your servers from one IP address 291 00:14:16,890 --> 00:14:20,010 to another, as might happen if you change cloud providers, so to speak-- 292 00:14:20,010 --> 00:14:21,240 more on those in a minute-- 293 00:14:21,240 --> 00:14:24,790 really, if you change the companies that you're using to host your servers, 294 00:14:24,790 --> 00:14:26,460 your IP addresses will change. 295 00:14:26,460 --> 00:14:29,310 And you certainly don't want to incur a huge amount of downtime 296 00:14:29,310 --> 00:14:30,550 in a situation like that. 297 00:14:30,550 --> 00:14:32,130 So there are these tradeoffs. 298 00:14:32,130 --> 00:14:35,040 Easy solution, technologically pretty inexpensive to do.
299 00:14:35,040 --> 00:14:37,840 It just works using existing technology. 300 00:14:37,840 --> 00:14:40,960 But you open yourself up to this risk. 301 00:14:40,960 --> 00:14:42,300 So let's address that. 302 00:14:42,300 --> 00:14:44,837 Putting back on the old proverbial engineering hat, 303 00:14:44,837 --> 00:14:46,170 let's try to solve this problem. 304 00:14:46,170 --> 00:14:48,870 It seems that giving a unique IP address to this server 305 00:14:48,870 --> 00:14:52,110 and to this server, and any number of other servers that are back there, 306 00:14:52,110 --> 00:14:56,400 might not be the smartest idea insofar as those IPs can get cached. 307 00:14:56,400 --> 00:15:01,120 So what if we use DNS as follows? 308 00:15:01,120 --> 00:15:05,220 When my laptop or anyone else's requests the IP address for www.something.com, 309 00:15:05,220 --> 00:15:08,860 why don't we return the IP address of this device here-- 310 00:15:08,860 --> 00:15:11,680 this load balancer, as we'll start calling it, 311 00:15:11,680 --> 00:15:15,360 where a load balancer is usually just a physical device, 312 00:15:15,360 --> 00:15:18,840 or multiple physical devices, whose purpose in life is to balance load? 313 00:15:18,840 --> 00:15:22,140 Packets come in, and similar in spirit to a router, 314 00:15:22,140 --> 00:15:25,860 they do route information to the left, to the right, or some other direction. 315 00:15:25,860 --> 00:15:29,910 But their overarching purpose isn't just to get data from point A to point B, 316 00:15:29,910 --> 00:15:32,970 but to somehow intelligently balance that traffic 317 00:15:32,970 --> 00:15:37,860 over multiple possible destinations for point B, identical servers 318 00:15:37,860 --> 00:15:39,390 in the case of our story here.
319 00:15:39,390 --> 00:15:42,810 So what if, instead, we addressed this problem of potential downtime 320 00:15:42,810 --> 00:15:46,830 by returning the IP address of the load balancer, 321 00:15:46,830 --> 00:15:49,560 and then, by nature of private IP addresses 322 00:15:49,560 --> 00:15:52,110 or some other mechanism that the end user does not 323 00:15:52,110 --> 00:15:56,920 need to know or care about, this load balancer somehow routes the traffic 324 00:15:56,920 --> 00:16:00,110 to either the first device or the second device, 325 00:16:00,110 --> 00:16:03,640 LB here being our load balancer? 326 00:16:03,640 --> 00:16:05,860 So we seem to have solved this problem. 327 00:16:05,860 --> 00:16:08,920 Insofar as now we have configured our DNS servers 328 00:16:08,920 --> 00:16:12,250 to return the IP address of the load balancer, 329 00:16:12,250 --> 00:16:15,760 there's no problem of downtime as we described a moment ago. 330 00:16:15,760 --> 00:16:20,890 Because if Server 1 goes offline for whatever reason, no big deal. 331 00:16:20,890 --> 00:16:25,510 The load balancer should hopefully just notice that and subsequently start 332 00:16:25,510 --> 00:16:29,710 proactively routing all incoming data that reaches its IP address to Server 2 333 00:16:29,710 --> 00:16:30,940 and not Server 1. 334 00:16:30,940 --> 00:16:32,819 Now, how does the load balancer know? 335 00:16:32,819 --> 00:16:34,360 Well, either a human could intervene. 336 00:16:34,360 --> 00:16:37,240 Maybe someone gets a late night call or text or page saying, 337 00:16:37,240 --> 00:16:39,490 uh oh, Server 1 is down, you better do something. 338 00:16:39,490 --> 00:16:42,070 And then he or she can manually configure the load balancer 339 00:16:42,070 --> 00:16:44,590 to no longer send any traffic to Server 1. 340 00:16:44,590 --> 00:16:47,620 That seems kind of stupid in an age of automation and smart software. 341 00:16:47,620 --> 00:16:48,730 Maybe we can do better.
342 00:16:48,730 --> 00:16:49,870 And indeed, we can. 343 00:16:49,870 --> 00:16:52,630 A technique that's often used by servers is 344 00:16:52,630 --> 00:16:55,570 something modeled from the human world to use 345 00:16:55,570 --> 00:16:58,690 what you might describe as heartbeats to actually configure 346 00:16:58,690 --> 00:17:02,350 the load balancer and Servers 1 and 2 to operate as follows. 347 00:17:02,350 --> 00:17:05,740 Maybe every second, every half a second, maybe every five seconds 348 00:17:05,740 --> 00:17:10,359 you configure Server 1 and Server 2 to send some kind of heartbeat message 349 00:17:10,359 --> 00:17:11,750 to the load balancer. 350 00:17:11,750 --> 00:17:14,770 This is just a TCP/IP packet, some kind of network packet 351 00:17:14,770 --> 00:17:17,650 that's the equivalent of saying I'm alive. 352 00:17:17,650 --> 00:17:18,339 I'm alive. 353 00:17:18,339 --> 00:17:23,770 Or more goofily, like boom, boom, boom, boom, ergo the heartbeat metaphor. 354 00:17:23,770 --> 00:17:26,680 But the point is that 1 and 2, and any number of other servers, 355 00:17:26,680 --> 00:17:29,710 should be configured to just constantly reassure 356 00:17:29,710 --> 00:17:31,800 the load balancer that they are alive. 357 00:17:31,800 --> 00:17:32,770 They are accessible. 358 00:17:32,770 --> 00:17:34,780 They are ready to receive traffic. 359 00:17:34,780 --> 00:17:36,809 And the load balancer, similarly-- 360 00:17:36,809 --> 00:17:39,100 and you might see where this is going-- can very simply 361 00:17:39,100 --> 00:17:42,100 be configured to listen for that heartbeat. 362 00:17:42,100 --> 00:17:46,060 And if it ever doesn't hear a heartbeat from Server 1 or Server 2, 363 00:17:46,060 --> 00:17:48,790 it should just assume that something is wrong. 364 00:17:48,790 --> 00:17:50,030 The server has died. 365 00:17:50,030 --> 00:17:50,830 It's gone offline. 366 00:17:50,830 --> 00:17:52,390 Something bad has happened. 
367 00:17:52,390 --> 00:17:55,090 So the load balancer subsequently should simply not 368 00:17:55,090 --> 00:17:58,030 route any traffic to that particular server 369 00:17:58,030 --> 00:18:00,490 until some human or some automated process 370 00:18:00,490 --> 00:18:04,300 brings the server back alive, so to speak, and the heartbeat resumes. 371 00:18:04,300 --> 00:18:06,880 Now, of course, this problem doesn't go away permanently. 372 00:18:06,880 --> 00:18:09,630 If Servers 1 and 2 stop emitting a heartbeat, 373 00:18:09,630 --> 00:18:11,447 we really have no capacity for users. 374 00:18:11,447 --> 00:18:13,030 But that would be an extreme scenario. 375 00:18:13,030 --> 00:18:17,714 Hopefully just one or a few of our servers go offline in that way. 376 00:18:17,714 --> 00:18:20,380 So we can configure our servers for these heartbeats, which is-- 377 00:18:20,380 --> 00:18:21,190 think about it-- 378 00:18:21,190 --> 00:18:25,100 a very simple, physiologically inspired solution to a problem. 379 00:18:25,100 --> 00:18:27,730 And even if it's not obvious how you'd implement it in code, 380 00:18:27,730 --> 00:18:30,370 it really is just an algorithm, a simple set of instructions 381 00:18:30,370 --> 00:18:33,700 with which we can solve this problem. 382 00:18:33,700 --> 00:18:36,690 And yet, dammit, we've introduced a new problem. 383 00:18:36,690 --> 00:18:40,050 And so this really is the old leaky hose, 384 00:18:40,050 --> 00:18:43,150 where just as we've plugged one leak or solved one problem, 385 00:18:43,150 --> 00:18:46,420 another one has sprung up somewhere else along the line. 386 00:18:46,420 --> 00:18:48,850 So what's the problem now? 387 00:18:48,850 --> 00:18:49,870 What's the problem now?
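The heartbeat scheme described above might be sketched roughly as follows. This is an illustrative model rather than how any real load balancer product works: the class name `LoadBalancer`, the method names, the three-second timeout, and the server labels are all assumptions, and time is passed in as `now` instead of being read from a clock or from actual network packets.

```python
class LoadBalancer:
    """Toy load balancer that only routes to servers with a recent heartbeat."""

    def __init__(self, servers, timeout):
        self._servers = list(servers)
        self._timeout = timeout    # seconds of silence before a server is presumed dead
        self._last_beat = {}       # server -> time of its most recent heartbeat
        self._next = 0             # counter used to rotate through healthy servers

    def heartbeat(self, server, now):
        # A server calls this periodically to say "I'm alive"
        self._last_beat[server] = now

    def healthy(self, now):
        # Servers that have sent a heartbeat within the timeout window
        return [s for s in self._servers
                if s in self._last_beat and now - self._last_beat[s] <= self._timeout]

    def route(self, now):
        alive = self.healthy(now)
        if not alive:
            raise RuntimeError("no healthy servers")
        server = alive[self._next % len(alive)]
        self._next += 1
        return server


lb = LoadBalancer(["server1", "server2"], timeout=3)
lb.heartbeat("server1", now=0)
lb.heartbeat("server2", now=0)
before = [lb.route(now=1), lb.route(now=1)]   # both servers share the traffic

lb.heartbeat("server2", now=5)                # server1 has gone silent
after = [lb.route(now=6), lb.route(now=6)]    # everything goes to server2
print(before, after)
```

If every server goes silent, `route` raises an error, which corresponds to the extreme scenario mentioned above in which there is no capacity left for users.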
388 00:18:49,870 --> 00:18:54,520 The whole motivation of introducing Server Number 2, in addition 389 00:18:54,520 --> 00:18:58,930 to Server Number 1, was to make sure that we have enough capacity, 390 00:18:58,930 --> 00:19:03,070 and better yet, to make sure that if Server 1 or Server 2 goes offline, 391 00:19:03,070 --> 00:19:05,650 the other one can hopefully pick up the load unless it's 392 00:19:05,650 --> 00:19:09,320 a super busy time with lots and lots of users visiting all at once. 393 00:19:09,320 --> 00:19:12,070 So in fact, the general idea at play here 394 00:19:12,070 --> 00:19:20,770 is high availability ensuring that if one server goes down, 395 00:19:20,770 --> 00:19:22,930 you have other servers that can pick up the load. 396 00:19:22,930 --> 00:19:26,419 Being highly available means you can be tolerant to issues like that. 397 00:19:26,419 --> 00:19:28,210 And then load balancing, of course, is just 398 00:19:28,210 --> 00:19:31,390 the mere process of splitting the load across those two endpoints. 399 00:19:31,390 --> 00:19:34,390 But we have introduced another problem. 400 00:19:34,390 --> 00:19:44,710 This might be abbreviated SPOF, or more explicitly, Single Point Of Failure. 401 00:19:44,710 --> 00:19:48,040 Just as I've solved one problem by introducing this load balancer, 402 00:19:48,040 --> 00:19:50,800 so have I introduced a new problem, which is this. 403 00:19:50,800 --> 00:19:52,870 There is now, as you might infer from the name 404 00:19:52,870 --> 00:19:54,730 alone, a single point of failure. 405 00:19:54,730 --> 00:19:59,050 It's fine that I can now tolerate Server 1 or Server 2 going down, 406 00:19:59,050 --> 00:20:01,920 but what can I not tolerate, clearly? 407 00:20:01,920 --> 00:20:04,430 What if the load balancer goes down? 408 00:20:04,430 --> 00:20:06,160 So this is a very real concern. 409 00:20:06,160 --> 00:20:09,071 Maybe the load balancer itself gets overloaded. 
410 00:20:09,071 --> 00:20:11,320 Maybe the load balancer itself has some kind of issue. 411 00:20:11,320 --> 00:20:13,270 And if the load balancer goes down, it doesn't 412 00:20:13,270 --> 00:20:16,120 matter how many web servers I have down here, 413 00:20:16,120 --> 00:20:19,900 or how much money I've spent down here to ensure my high availability. 414 00:20:19,900 --> 00:20:25,100 My server is offline if this single point of failure indeed fails. 415 00:20:25,100 --> 00:20:27,640 Now, you'd like to think that the load balancer-- 416 00:20:27,640 --> 00:20:30,070 especially since it only has one job in life-- 417 00:20:30,070 --> 00:20:33,270 can at least handle more traffic than any individual server. 418 00:20:33,270 --> 00:20:36,490 Indeed, clearly, it must be the case that the load balancer 419 00:20:36,490 --> 00:20:39,490 is fast enough and capable enough to handle twice as 420 00:20:39,490 --> 00:20:41,840 much traffic as any individual server. 421 00:20:41,840 --> 00:20:46,999 But that's generally accepted as feasible insofar as your website. 422 00:20:46,999 --> 00:20:48,790 Your real intellectual property is probably 423 00:20:48,790 --> 00:20:50,590 doing a lot of work-- talking to a database, 424 00:20:50,590 --> 00:20:53,506 writing out files, downloading things, or any number of other features 425 00:20:53,506 --> 00:20:56,800 that just take more effort than just routing data from one server 426 00:20:56,800 --> 00:20:58,990 to another as a load balancer does. 427 00:20:58,990 --> 00:21:00,970 But it doesn't matter how performant it is. 428 00:21:00,970 --> 00:21:04,240 If the load balancer breaks, goes offline for some reason, 429 00:21:04,240 --> 00:21:08,120 your entire infrastructure is inaccessible. 430 00:21:08,120 --> 00:21:09,980 So how do we solve this? 431 00:21:09,980 --> 00:21:13,510 How do we go about and architect a solution to this? 432 00:21:13,510 --> 00:21:15,910 Well, how did we address this issue earlier? 
433 00:21:15,910 --> 00:21:20,260 We addressed the issue of insufficient capacity or potential downtime 434 00:21:20,260 --> 00:21:22,960 by just throwing hardware at the problem. 435 00:21:22,960 --> 00:21:25,940 And so maybe we could do that same thing here. 436 00:21:25,940 --> 00:21:29,260 Maybe we could just introduce a second load balancer. 437 00:21:29,260 --> 00:21:31,540 I'll call this LB as well. 438 00:21:31,540 --> 00:21:33,940 And now we somehow have to-- 439 00:21:33,940 --> 00:21:39,640 I feel like we're just endlessly going to be adding more and more rectangles 440 00:21:39,640 --> 00:21:40,540 to the picture. 441 00:21:40,540 --> 00:21:46,480 But somehow, we need to be able to load balance across now two servers and two 442 00:21:46,480 --> 00:21:47,980 load balancers. 443 00:21:47,980 --> 00:21:48,860 So how do we do this? 444 00:21:48,860 --> 00:21:52,660 Well, let me clean this up so that we have a bit more room to play with here 445 00:21:52,660 --> 00:21:57,260 and consider how a pair of load balancers might actually work. 446 00:21:57,260 --> 00:22:01,510 So if my first server is here and my second server is here, 447 00:22:01,510 --> 00:22:07,720 and I'm proposing now to have two load balancers-- one here and one here-- 448 00:22:07,720 --> 00:22:12,460 surely, both of these have to be able to talk to both servers. 449 00:22:12,460 --> 00:22:15,100 So we already have this necessity. 450 00:22:15,100 --> 00:22:18,820 And somehow, traffic has to come from the internet 451 00:22:18,820 --> 00:22:23,497 into this set of load balancers, but probably only to one, 452 00:22:23,497 --> 00:22:25,330 because we don't want to solve this with DNS 453 00:22:25,330 --> 00:22:27,370 and just have two IP addresses out there. 454 00:22:27,370 --> 00:22:30,160 Because if one breaks, we can recreate the same problem 455 00:22:30,160 --> 00:22:32,090 as before if we're not careful. 456 00:22:32,090 --> 00:22:33,140 So what if we do this? 
457 00:22:33,140 --> 00:22:37,390 What if we use this building block of heartbeats in another way as well? 458 00:22:37,390 --> 00:22:40,600 What if we ensure that our load balancers-- 459 00:22:40,600 --> 00:22:45,740 plural-- have just one IP address, which a moment ago seemed 460 00:22:45,740 --> 00:22:47,240 to create a single point of failure? 461 00:22:47,240 --> 00:22:48,590 But what if we do this? 462 00:22:48,590 --> 00:22:52,330 What if we also allow the load balancers to talk to, 463 00:22:52,330 --> 00:22:57,940 to communicate over a network with each other so that one of the load balancers 464 00:22:57,940 --> 00:23:00,940 is constantly saying to the other, I'm alive. 465 00:23:00,940 --> 00:23:02,020 I'm alive. 466 00:23:02,020 --> 00:23:03,290 I'm alive. 467 00:23:03,290 --> 00:23:06,310 And so what the load balancers could be configured to do 468 00:23:06,310 --> 00:23:10,400 is that only one of them operates at any given point in time. 469 00:23:10,400 --> 00:23:14,830 But if the other server, the other load balancer, 470 00:23:14,830 --> 00:23:19,330 no longer hears from that primary load balancer because of the heartbeats 471 00:23:19,330 --> 00:23:21,790 that are ideally both being emitted in both directions 472 00:23:21,790 --> 00:23:25,150 so that they can both be assured of the other's up time-- 473 00:23:25,150 --> 00:23:29,110 if the secondary load balancer stops hearing the primary load balancer, 474 00:23:29,110 --> 00:23:32,560 the secondary load balancer can just presumptuously 475 00:23:32,560 --> 00:23:37,050 reconfigure itself to take on that one and only IP address, 476 00:23:37,050 --> 00:23:39,760 effectively assuming that the first load balancer is not going 477 00:23:39,760 --> 00:23:41,740 to be responding to any traffic anyway. 478 00:23:41,740 --> 00:23:46,120 And the second load balancer can simply take on the entire load itself. 
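The takeover just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LoadBalancer` class, its `holds_shared_ip` flag, and the timeout are hypothetical stand-ins for whatever mechanism actually moves the one shared IP address between machines.

```python
class LoadBalancer:
    """A toy load balancer that may or may not hold the shared IP address."""

    def __init__(self, name, holds_shared_ip=False):
        self.name = name
        self.holds_shared_ip = holds_shared_ip
        self.last_peer_heartbeat = 0.0

    def hear_peer(self, now):
        # Record the moment the other load balancer last said "I'm alive".
        self.last_peer_heartbeat = now

    def check_peer(self, now, timeout_seconds):
        # If the peer has been silent for too long, presume it is down and
        # take over the one shared IP address.
        if now - self.last_peer_heartbeat > timeout_seconds:
            self.holds_shared_ip = True
        return self.holds_shared_ip
```

The sketch deliberately ignores what happens when the primary comes back, or when both machines are healthy but the link between them fails; a real setup would need a rule for deciding who gives the address up.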
479 00:23:46,120 --> 00:23:49,750 But the key difference now in this particular solution 480 00:23:49,750 --> 00:23:53,920 is that there's only one IP address that describes this whole architecture, only 481 00:23:53,920 --> 00:23:56,740 one IP address between the two load balancers 482 00:23:56,740 --> 00:24:01,210 so we don't risk those potential dead ends that we had a little bit ago 483 00:24:01,210 --> 00:24:03,710 with our back end servers. 484 00:24:03,710 --> 00:24:08,884 So now it's starting to get more robust, more highly available. 485 00:24:08,884 --> 00:24:09,800 So that's pretty good. 486 00:24:09,800 --> 00:24:11,800 We've solved most of these problems. 487 00:24:11,800 --> 00:24:17,590 We've generously, though, swept one problem underneath the rug, whereby 488 00:24:17,590 --> 00:24:20,217 every time I draw another rectangle-- 489 00:24:20,217 --> 00:24:22,300 not just the first time, but now the second time-- 490 00:24:22,300 --> 00:24:26,200 and add some interconnectivity, somehow, among them someone 491 00:24:26,200 --> 00:24:27,850 somewhere is spending some money. 492 00:24:27,850 --> 00:24:30,340 And indeed, I am solving these problems thus far 493 00:24:30,340 --> 00:24:33,800 by throwing money at the problem, and frankly introducing complexity. 494 00:24:33,800 --> 00:24:36,400 Already look at how many arrows or edges there 495 00:24:36,400 --> 00:24:40,060 are now, which might simply refer to physical wires, which is fine. 496 00:24:40,060 --> 00:24:43,990 But there's also a logical configuration that's now necessary. 497 00:24:43,990 --> 00:24:47,530 And God forbid we have a third load balancer for extra high availability 498 00:24:47,530 --> 00:24:49,420 or any number of servers here-- 499 00:24:49,420 --> 00:24:52,510 13 or 20 or 100 or 1,000 servers. 500 00:24:52,510 --> 00:24:54,910 It's a lot of cross-connections-- not just physically, 501 00:24:54,910 --> 00:24:58,120 but logically in terms of the requisite configuration. 
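To get a rough feel for how quickly those cross-connections add up, here is a small Python sketch. It assumes, as in the picture being drawn, that every load balancer talks to every server and that every pair of load balancers exchanges heartbeats; `count_links` is a hypothetical helper, and it counts logical connections, not necessarily physical cables.

```python
def count_links(load_balancers, servers):
    """Count logical connections in the assumed topology."""
    # Every load balancer connects to every back end server...
    lb_to_server = load_balancers * servers
    # ...and every pair of load balancers exchanges heartbeats.
    lb_to_lb = load_balancers * (load_balancers - 1) // 2
    return lb_to_server + lb_to_lb
```

With two load balancers and two servers that is five connections; with three load balancers and 1,000 servers it is already 3,003.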
502 00:24:58,120 --> 00:25:01,540 So this complexity does add up. 503 00:25:01,540 --> 00:25:04,690 And the cost certainly adds up. 504 00:25:04,690 --> 00:25:07,240 And now, once upon a time-- and not all that 505 00:25:07,240 --> 00:25:11,750 long ago-- if a company wanted to architect this kind of solution, 506 00:25:11,750 --> 00:25:14,590 you would literally buy two load balancers, 507 00:25:14,590 --> 00:25:17,290 and you would buy two or more web servers, 508 00:25:17,290 --> 00:25:19,690 and you would buy the requisite physical ethernet 509 00:25:19,690 --> 00:25:21,070 cables to interconnect the two. 510 00:25:21,070 --> 00:25:23,320 And you'd probably buy a whole bunch of other hardware 511 00:25:23,320 --> 00:25:26,279 that we've not even talked about, like firewalls and switches and more. 512 00:25:26,279 --> 00:25:28,361 But you would physically buy all of this hardware. 513 00:25:28,361 --> 00:25:30,520 You would physically connect all of this hardware 514 00:25:30,520 --> 00:25:35,710 and configure it to implement these several kinds of features. 515 00:25:35,710 --> 00:25:38,410 But the catch is that the more and more hardware 516 00:25:38,410 --> 00:25:42,310 you buy, just probabilistically, the more and more you 517 00:25:42,310 --> 00:25:44,320 invite some kind of failure. 518 00:25:44,320 --> 00:25:46,000 Maybe it's some stupid human error. 519 00:25:46,000 --> 00:25:49,310 But more realistically, one of your hard drives is going to fail. 520 00:25:49,310 --> 00:25:52,960 And hard drives are typically rated for the enterprise in terms of Mean Time 521 00:25:52,960 --> 00:25:57,230 Between Failure, MTBF, which generally means 522 00:25:57,230 --> 00:26:01,100 how long should you expect a hard drive to work on average before it fails. 523 00:26:01,100 --> 00:26:01,600 It breaks. 524 00:26:01,600 --> 00:26:02,900 It just stops working. 
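As a back-of-the-envelope illustration of why more hardware invites more failures, here is a Python sketch. It assumes drives fail independently at a constant rate of one failure per MTBF hours, which is a simplification, and the figures used in the usage note are made up for illustration; they are not from any manufacturer's data sheet.

```python
import math


def expected_failures(drives, hours, mtbf_hours):
    """Expected number of drive failures across the fleet in the period."""
    return drives * hours / mtbf_hours


def prob_at_least_one_failure(drives, hours, mtbf_hours):
    """Chance that one or more drives fail, assuming a constant failure rate."""
    return 1.0 - math.exp(-drives * hours / mtbf_hours)
```

For a hypothetical fleet of 100 drives, each rated at 1,000,000 hours, running for one year (8,760 hours), `expected_failures` gives a little under one failure, and `prob_at_least_one_failure` comes out somewhat above one half. Any single drive looks very reliable; the fleet as a whole does not.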
525 00:26:02,900 --> 00:26:05,500 So if you have a whole bunch of servers, each of which 526 00:26:05,500 --> 00:26:08,390 has a whole bunch of hard drives, at some point, 527 00:26:08,390 --> 00:26:11,862 combinatorially, one or more of those drives is just going to fail, 528 00:26:11,862 --> 00:26:13,820 which is to say you're going to have a problem, 529 00:26:13,820 --> 00:26:15,850 and you're going to have to fix it yourself. 530 00:26:15,850 --> 00:26:19,940 At some point, too, you're going to run out of physical space. 531 00:26:19,940 --> 00:26:22,850 In fact, perhaps one of the most constraining resources, 532 00:26:22,850 --> 00:26:25,530 especially for startups, is the physical space itself. 533 00:26:25,530 --> 00:26:28,780 You probably don't want to start housing your servers in your physical office, 534 00:26:28,780 --> 00:26:32,530 because you need a special room for it, typically, with enough cooling, 535 00:26:32,530 --> 00:26:36,750 with enough access, with enough electricity, and enough humans 536 00:26:36,750 --> 00:26:37,750 to actually maintain it. 537 00:26:37,750 --> 00:26:41,345 Or you graduate from your own office space and go to a data center, 538 00:26:41,345 --> 00:26:44,230 a co-location facility, whereby you maybe 539 00:26:44,230 --> 00:26:47,500 rent space in a physical cage with a locking door, 540 00:26:47,500 --> 00:26:49,690 inside of which you put racks of servers, 541 00:26:49,690 --> 00:26:54,100 just racked up on big metal poles, and you pack as many servers in there 542 00:26:54,100 --> 00:26:54,970 as you can. 543 00:26:54,970 --> 00:26:57,610 But at some point, you're going to be bumping up 544 00:26:57,610 --> 00:27:03,340 against other constrained resources-- physical space, actual power capacity, 545 00:27:03,340 --> 00:27:07,220 cooling, as well as the humans to actually run this. 
546 00:27:07,220 --> 00:27:10,720 And so very quickly does operations, ops, 547 00:27:10,720 --> 00:27:14,890 so to speak, become an increasing cost and an increasing challenge. 548 00:27:14,890 --> 00:27:19,130 And one of the most alluring features of the cloud, so to speak, 549 00:27:19,130 --> 00:27:23,350 is that you can move all of these details off-site. 550 00:27:23,350 --> 00:27:28,710 And you can abstract many of these, let's say, implementation details 551 00:27:28,710 --> 00:27:32,770 away whereby you yourself don't have to worry about the physical wires. 552 00:27:32,770 --> 00:27:35,260 You don't have to worry about the make and model of servers 553 00:27:35,260 --> 00:27:36,051 that you're buying. 554 00:27:36,051 --> 00:27:39,700 You don't have to worry about things actually breaking, 555 00:27:39,700 --> 00:27:42,790 because someone else will deal with that for you. 556 00:27:42,790 --> 00:27:46,000 But you have to still understand the topology and the architecture 557 00:27:46,000 --> 00:27:51,080 and the features that you want to implement so that you can actually 558 00:27:51,080 --> 00:27:53,130 configure them in the cloud. 559 00:27:53,130 --> 00:27:55,830 So what do you actually get from cloud providers? 560 00:27:55,830 --> 00:27:57,830 There's any number of them out there these days. 561 00:27:57,830 --> 00:28:01,760 But perhaps three of the biggest are Amazon, Google, and Microsoft, 562 00:28:01,760 --> 00:28:05,087 all of whom offer, these days, very similar palettes of options. 563 00:28:05,087 --> 00:28:06,920 And it's outright overwhelming, if you visit 564 00:28:06,920 --> 00:28:10,290 each of their web sites, just how many cloud products they offer. 565 00:28:10,290 --> 00:28:13,610 But they would generally offer a number of standard products 566 00:28:13,610 --> 00:28:16,740 in the cloud-- for instance, a virtualized server.
567 00:28:16,740 --> 00:28:19,430 So you don't have to physically buy a server these days 568 00:28:19,430 --> 00:28:22,970 and plug it into your own ethernet connection, your own internet 569 00:28:22,970 --> 00:28:24,470 connection in your own office. 570 00:28:24,470 --> 00:28:27,320 You can instead essentially rent a server 571 00:28:27,320 --> 00:28:29,550 in the cloud, which is to say that Amazon, Google, 572 00:28:29,550 --> 00:28:31,520 Microsoft, or any number of other companies 573 00:28:31,520 --> 00:28:34,850 will host that server physically for you, 574 00:28:34,850 --> 00:28:37,520 and they will take care of the issues of power and cooling. 575 00:28:37,520 --> 00:28:40,061 And if a hard drive fails, they will go remove the old one 576 00:28:40,061 --> 00:28:41,060 and plug in the new one. 577 00:28:41,060 --> 00:28:43,610 And ideally, they will provide you with backup services. 578 00:28:43,610 --> 00:28:46,970 But more sophisticated than that, they can also 579 00:28:46,970 --> 00:28:52,280 help us recreate, in software, this kind of topology. 580 00:28:52,280 --> 00:28:56,750 In other words, even without having a human physically wire together 581 00:28:56,750 --> 00:28:59,390 this kind of graph, so to speak, that we've been building up 582 00:28:59,390 --> 00:29:02,600 here logically, thanks to software these days, 583 00:29:02,600 --> 00:29:06,260 you can implement this whole paradigm-- 584 00:29:06,260 --> 00:29:08,720 not with physical cables, not with physical devices, 585 00:29:08,720 --> 00:29:10,610 but with software virtually. 586 00:29:10,610 --> 00:29:11,460 What does that mean? 587 00:29:11,460 --> 00:29:13,640 It means that humans, over the past several years, 588 00:29:13,640 --> 00:29:17,810 have been writing software that mimics the behavior of physical servers. 589 00:29:17,810 --> 00:29:21,290 Humans have been writing software that mimics the behavior of a router. 
590 00:29:21,290 --> 00:29:26,060 Humans have been writing software that mimics the behavior of a load balancer. 591 00:29:26,060 --> 00:29:30,470 And implementing mimics the behavior of-- really, we're just building, 592 00:29:30,470 --> 00:29:34,940 in software, what historically might have been implemented entirely 593 00:29:34,940 --> 00:29:35,682 in hardware. 594 00:29:35,682 --> 00:29:37,640 And even that's a bit of an oversimplification. 595 00:29:37,640 --> 00:29:40,077 Because even when something is bought as hardware, 596 00:29:40,077 --> 00:29:43,160 there is, of course, software running on that hardware that actually makes 597 00:29:43,160 --> 00:29:44,360 it do something. 598 00:29:44,360 --> 00:29:46,730 But they're no longer dedicated devices. 599 00:29:46,730 --> 00:29:50,750 You can use generic commodity PC server hardware, really, 600 00:29:50,750 --> 00:29:54,740 and transform that hardware into a certain role, a back end web 601 00:29:54,740 --> 00:29:58,550 server, a back end database, a load balancer, a router, a switch, 602 00:29:58,550 --> 00:30:00,180 any number of other things. 603 00:30:00,180 --> 00:30:02,930 And so what you were getting from companies like Amazon and Google 604 00:30:02,930 --> 00:30:06,230 and Microsoft and more is the ability to build up 605 00:30:06,230 --> 00:30:09,050 your infrastructure in software. 606 00:30:09,050 --> 00:30:16,190 In fact, the buzzword here, the acronym, is IaaS, Infrastructure as a Service. 
607 00:30:16,190 --> 00:30:19,910 So you sign up for an account on any of those companies' cloud services web 608 00:30:19,910 --> 00:30:23,030 sites, and you put in your credit card information or your invoicing 609 00:30:23,030 --> 00:30:26,780 information, and you literally, via a command line tool-- so a keyboard, 610 00:30:26,780 --> 00:30:30,380 or via a nice, web-based graphical user interface, GUI-- 611 00:30:30,380 --> 00:30:34,907 do you point and click and say, give me two servers and one load balancer. 612 00:30:34,907 --> 00:30:36,740 Or if you have enough money in the bank, you 613 00:30:36,740 --> 00:30:40,040 say give me two servers and two load balancers 614 00:30:40,040 --> 00:30:41,990 configured for high availability. 615 00:30:41,990 --> 00:30:44,360 Or better yet, you don't say any of that. 616 00:30:44,360 --> 00:30:48,410 You just tell the provider, give me a web server and give me a load balancer, 617 00:30:48,410 --> 00:30:52,710 and you deal with the process of scaling those things as needed. 618 00:30:52,710 --> 00:30:56,360 In fact, a buzzword du jour is auto scaling, which refers to a feature, 619 00:30:56,360 --> 00:30:59,720 implemented in software, whereby if a cloud 620 00:30:59,720 --> 00:31:03,740 provider notices that your servers are getting a lot of traffic-- 621 00:31:03,740 --> 00:31:05,880 business is good, or it's the holiday season, 622 00:31:05,880 --> 00:31:10,220 and you are bumping up against just how many users your one or two 623 00:31:10,220 --> 00:31:12,230 or three or more servers can handle-- 624 00:31:12,230 --> 00:31:17,030 auto-scaling is a feature that will enable the cloud provider to just turn 625 00:31:17,030 --> 00:31:21,320 on, virtually, more servers for you so that you go from two to three 626 00:31:21,320 --> 00:31:21,980 automatically.
627 00:31:21,980 --> 00:31:25,770 You can be happily asleep in the middle of the night, 628 00:31:25,770 --> 00:31:28,780 and even though your traffic is peaking, it doesn't matter. 629 00:31:28,780 --> 00:31:30,830 Your architecture is going to auto scale. 630 00:31:30,830 --> 00:31:33,320 And better yet-- especially financially-- 631 00:31:33,320 --> 00:31:37,820 if the cloud provider notices, maybe 12 hours later-- oh, all of your customers 632 00:31:37,820 --> 00:31:41,270 have gone to sleep, we don't really need all of this excess capacity. 633 00:31:41,270 --> 00:31:43,400 Or maybe the holidays are now in the past. 634 00:31:43,400 --> 00:31:45,200 You really don't need this excess capacity. 635 00:31:45,200 --> 00:31:50,090 Auto scaling also dictates that those servers can be virtually turned off. 636 00:31:50,090 --> 00:31:51,435 So you're no longer using them. 637 00:31:51,435 --> 00:31:53,060 You're no longer load balancing to them. 638 00:31:53,060 --> 00:31:56,310 And most importantly, you're no longer paying for them. 639 00:31:56,310 --> 00:32:00,010 So this is a really, really nice value add at this point. 640 00:32:00,010 --> 00:32:02,780 There's no human crawling around on the floor rewiring things 641 00:32:02,780 --> 00:32:04,130 and plugging in new servers. 642 00:32:04,130 --> 00:32:07,460 There's no finance person having to approve the PO to actually order more 643 00:32:07,460 --> 00:32:09,220 servers just to increase your capacity. 644 00:32:09,220 --> 00:32:13,310 And most importantly, there is no latency between the time when 645 00:32:13,310 --> 00:32:16,580 you notice, oh, my god, we're getting really successful 646 00:32:16,580 --> 00:32:18,527 and can't handle our load-- uh oh. 647 00:32:18,527 --> 00:32:20,360 It's going to be a two, three-week lead time 648 00:32:20,360 --> 00:32:22,460 before we can even get in the more servers.
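The decision at the heart of auto scaling, as described above, can be sketched as a single function in Python. This is a simplified illustration and not any particular provider's algorithm: `desired_server_count`, the per-server capacity figure, and the minimum and maximum bounds are all assumptions made for the example.

```python
import math


def desired_server_count(requests_per_second, capacity_per_server,
                         min_servers=1, max_servers=10):
    """How many servers to run for the current load (hypothetical policy)."""
    # Enough servers to cover the load, rounded up to a whole machine.
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Never drop below the floor or grow beyond the budgeted ceiling.
    return max(min_servers, min(max_servers, needed))
```

Calling it again as traffic rises and falls is what lets the server count go from two to three during a peak and back down once customers have gone to sleep, so that you stop paying for capacity you no longer need.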
649 00:32:22,460 --> 00:32:25,910 Thanks to cloud computing, you can literally log in to Amazon's, Google's, 650 00:32:25,910 --> 00:32:28,210 Microsoft's web site and, click, click, click, 651 00:32:28,210 --> 00:32:32,360 have more server capacity within seconds, within minutes, 652 00:32:32,360 --> 00:32:37,790 far faster than the physical world traditionally allowed. 653 00:32:37,790 --> 00:32:40,460 So those are just some of the features now 654 00:32:40,460 --> 00:32:45,320 that we gain from outsourcing to the so-called cloud. 655 00:32:45,320 --> 00:32:48,510 So where does some of this capability come from? 656 00:32:48,510 --> 00:32:51,960 Well, it turns out that over the past many years, 657 00:32:51,960 --> 00:32:54,230 humans have been getting better and better and better 658 00:32:54,230 --> 00:32:58,460 at packing more physical hardware into the same form factor, 659 00:32:58,460 --> 00:32:59,720 into the same physical space. 660 00:32:59,720 --> 00:33:02,420 So at the level of CPUs, the brains of a computer, 661 00:33:02,420 --> 00:33:05,810 we humans have gotten much better at packing more and more transistors, 662 00:33:05,810 --> 00:33:07,520 for instance, onto a CPU. 663 00:33:07,520 --> 00:33:11,070 And transistors are the little switches that can turn things on and off-- 664 00:33:11,070 --> 00:33:12,670 0 and 1, 1 and 0. 665 00:33:12,670 --> 00:33:14,570 So you can store more information and you 666 00:33:14,570 --> 00:33:17,260 can do more with that information more quickly. 667 00:33:17,260 --> 00:33:19,820 CPUs today also have more cores, which you 668 00:33:19,820 --> 00:33:23,180 can think of as mini CPUs inside of the main CPU, 669 00:33:23,180 --> 00:33:25,550 so that a computer with multiple cores can literally 670 00:33:25,550 --> 00:33:28,280 do multiple things at a time. 
671 00:33:28,280 --> 00:33:32,060 But the funny thing is that we humans, over the past decade or two, 672 00:33:32,060 --> 00:33:35,150 really haven't been getting fundamentally faster at life. 673 00:33:35,150 --> 00:33:38,090 At the end of the day, I can only check my email so quickly. 674 00:33:38,090 --> 00:33:39,990 I can only post on Facebook so quickly. 675 00:33:39,990 --> 00:33:43,670 I can only check out from Amazon so quickly. 676 00:33:43,670 --> 00:33:47,784 Because we humans have, of course, a finite speed to ourselves. 677 00:33:47,784 --> 00:33:49,950 We're not just getting-- we're not doubling in speed 678 00:33:49,950 --> 00:33:52,790 a la Moore's law every year or two. 679 00:33:52,790 --> 00:33:57,140 So we have, it would seem, a lot of excess computing capacity these days. 680 00:33:57,140 --> 00:34:00,080 Computers are getting so darn fast, we don't necessarily 681 00:34:00,080 --> 00:34:03,830 know what to do with all of these CPU cycles and with all of the RAM 682 00:34:03,830 --> 00:34:06,860 that we can fit into the same physical box at half the price 683 00:34:06,860 --> 00:34:08,989 that it cost us last year. 684 00:34:08,989 --> 00:34:12,469 And so manufacturers and companies realize 685 00:34:12,469 --> 00:34:17,179 that we could actually build a business on this increased capacity. 686 00:34:17,179 --> 00:34:23,277 We can implement the computer equivalent of timesharing, so to speak, 687 00:34:23,277 --> 00:34:25,610 which has long been with us in the history of computing. 
688 00:34:25,610 --> 00:34:27,620 But we can do this on a much more massive scale 689 00:34:27,620 --> 00:34:33,679 now by taking one physical server that has maybe two CPUs, or 16 CPUs, 690 00:34:33,679 --> 00:34:38,570 or 64 CPUs, and maybe gigabytes-- 691 00:34:38,570 --> 00:34:41,090 tens of gigabytes or hundreds of gigabytes of RAM-- 692 00:34:41,090 --> 00:34:45,110 all inside of the same physical device, plug it in to an internet connection, 693 00:34:45,110 --> 00:34:50,810 and then run special software on that one server that creates the illusion 694 00:34:50,810 --> 00:34:54,920 that there's multiple servers living inside of that box. 695 00:34:54,920 --> 00:34:58,850 And this virtualization software is implemented 696 00:34:58,850 --> 00:35:02,480 by way of software called a virtual machine, or virtual machine monitor, 697 00:35:02,480 --> 00:35:04,430 or another word might be hypervisor. 698 00:35:04,430 --> 00:35:07,190 There's different ways to describe essentially the same thing. 699 00:35:07,190 --> 00:35:11,390 But a virtual machine is a piece of software 700 00:35:11,390 --> 00:35:15,590 running on a computer inside of which is running some other operating system, 701 00:35:15,590 --> 00:35:16,370 typically. 702 00:35:16,370 --> 00:35:19,070 So you might have one server running Windows. 703 00:35:19,070 --> 00:35:24,620 But inside of that server are multiple virtual machines, each of which 704 00:35:24,620 --> 00:35:25,880 itself is running Windows. 705 00:35:25,880 --> 00:35:29,509 So you might be able to chop up one computer into 10, or even into 100. 706 00:35:29,509 --> 00:35:31,550 Or perhaps more commonly, you might have a server 707 00:35:31,550 --> 00:35:34,340 running Linux or some Unix-based operating system, 708 00:35:34,340 --> 00:35:35,930 also with virtual machines on it. 
709 00:35:35,930 --> 00:35:37,721 But those virtual machines might be running 710 00:35:37,721 --> 00:35:42,797 Linux themselves, or Unix, or Windows, or any number of versions of Windows. 711 00:35:42,797 --> 00:35:43,880 And so this is the beauty. 712 00:35:43,880 --> 00:35:48,080 When you have so much excess capacity and so many 713 00:35:48,080 --> 00:35:50,150 available CPU cycles and so much RAM, you 714 00:35:50,150 --> 00:35:56,490 can slice that up and then sell portions of the server's capacity to customers. 715 00:35:56,490 --> 00:36:01,310 And if you're really clever, you might look at your customers' usage patterns 716 00:36:01,310 --> 00:36:05,810 and realize that, you know what, it's not necessarily 717 00:36:05,810 --> 00:36:11,360 as simple as just taking my server and dividing it up into n different slices, 718 00:36:11,360 --> 00:36:13,700 where n is a generic variable for number, 719 00:36:13,700 --> 00:36:17,824 and then selling it or renting that space, really, to n customers. 720 00:36:17,824 --> 00:36:18,740 Because you know what? 721 00:36:18,740 --> 00:36:21,800 Some of those customers might have some booming businesses, which is great. 722 00:36:21,800 --> 00:36:24,110 But some of those customers might not have many users. 723 00:36:24,110 --> 00:36:26,129 Maybe it's a few dozen. 724 00:36:26,129 --> 00:36:27,170 Maybe it's a few hundred. 725 00:36:27,170 --> 00:36:29,280 But it's really a drop in the bucket.
726 00:36:29,280 --> 00:36:34,580 So instead of selling my computing resources to just n customers, 727 00:36:34,580 --> 00:36:37,730 maybe I'll sell it to twice as many customers or three times 728 00:36:37,730 --> 00:36:41,690 as many customers, and essentially over-sell my server's capacity, 729 00:36:41,690 --> 00:36:44,480 but expect that on average, this is just going 730 00:36:44,480 --> 00:36:47,570 to work out because some customers will be using a lot of those cycles 731 00:36:47,570 --> 00:36:49,460 because business is good, and some won't be, 732 00:36:49,460 --> 00:36:51,501 because it's just they don't have many customers, 733 00:36:51,501 --> 00:36:55,580 or really, it's a personal website that doesn't get much usage anyway. 734 00:36:55,580 --> 00:36:58,490 And so for some time, there has, of course, 735 00:36:58,490 --> 00:37:01,700 been this risk, when you sign up for a web hosting company or a cloud 736 00:37:01,700 --> 00:37:05,450 provider, that your web site actually might get really slow for reasons 737 00:37:05,450 --> 00:37:07,020 outside of your control. 738 00:37:07,020 --> 00:37:11,780 If you are co-located on a server that some other booming business is on, 739 00:37:11,780 --> 00:37:17,120 your users might actually suffer if your web host has oversold itself. 740 00:37:17,120 --> 00:37:19,249 And so in fact, this is one of those situations 741 00:37:19,249 --> 00:37:20,540 where you get what you pay for. 742 00:37:20,540 --> 00:37:23,990 If you're googling around and finding various cloud providers, 743 00:37:23,990 --> 00:37:26,360 or web hosting companies more specifically, 744 00:37:26,360 --> 00:37:30,050 you might be able to find a deal, like $10 per month or $50 per month, 745 00:37:30,050 --> 00:37:33,590 as opposed to $100 or $200 or more per month.
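The bet behind overselling can be illustrated with a short Python sketch. The capacity units and usage numbers in the usage note are invented purely for illustration, and `total_demand` and `is_overloaded` are hypothetical helpers; the point is only that capacity sold and capacity actually used are two different quantities.

```python
def total_demand(usage_by_customer):
    """Capacity actually being consumed right now, summed over customers."""
    return sum(usage_by_customer.values())


def is_overloaded(usage_by_customer, server_capacity):
    """True when customers collectively want more than the server has."""
    return total_demand(usage_by_customer) > server_capacity
```

Suppose a server has 100 units of capacity and the host has sold 10-unit slices to 20 customers, twice what the machine can deliver. If every customer is quiet and uses 2 units, total demand is 40 and all is well. If 12 of them have a booming day and use their full 10 units while the other 8 stay at 2, demand is 136, and every site on that server slows down, including the quiet ones.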
746 00:37:33,590 --> 00:37:37,070 And you do get what you pay for, because those fly-by-night operations that 747 00:37:37,070 --> 00:37:41,960 are selling you space and capacity super cheaply probably 748 00:37:41,960 --> 00:37:44,000 are overselling and over-committing. 749 00:37:44,000 --> 00:37:45,740 So these are the trade-offs, too-- 750 00:37:45,740 --> 00:37:48,590 how much money do you want to save versus how much risk 751 00:37:48,590 --> 00:37:50,300 do you actually want to take on? 752 00:37:50,300 --> 00:37:53,549 Generally, it's safer to go with some of the bigger fish these days, certainly 753 00:37:53,549 --> 00:37:57,500 when building a business, as you might on a company like Amazon or Google 754 00:37:57,500 --> 00:38:00,300 or Microsoft or derivatives thereof. 755 00:38:00,300 --> 00:38:02,540 So just to paint a more concrete technical picture 756 00:38:02,540 --> 00:38:06,534 of what virtualization is, here's a picture, as you might think of it. 757 00:38:06,534 --> 00:38:08,450 So you have your physical infrastructure here. 758 00:38:08,450 --> 00:38:12,080 So that's the actual server from Dell or IBM or whoever. 759 00:38:12,080 --> 00:38:14,870 Then you have the host operating system, which might be Windows, 760 00:38:14,870 --> 00:38:18,896 but is often Linux or some variant of Unix instead. 761 00:38:18,896 --> 00:38:20,270 And then you have the hypervisor. 762 00:38:20,270 --> 00:38:22,940 This is the piece of software that you install 763 00:38:22,940 --> 00:38:27,800 on your server that allows you to run multiple virtual machines on top of it. 764 00:38:27,800 --> 00:38:29,995 And those virtual machines can each run any number 765 00:38:29,995 --> 00:38:32,870 of different operating systems themselves, or even different versions 766 00:38:32,870 --> 00:38:34,130 of operating systems. 
767 00:38:34,130 --> 00:38:38,359 And so depicted here up top are the disparate guest OS operating 768 00:38:38,359 --> 00:38:39,650 systems that might be on there. 769 00:38:39,650 --> 00:38:42,670 Maybe this is Linux and Solaris, and this is Windows itself, 770 00:38:42,670 --> 00:38:44,170 or any number of other combinations. 771 00:38:44,170 --> 00:38:47,870 Whatever your customers want or whatever you want to provide or essentially 772 00:38:47,870 --> 00:38:50,810 rent to customers, you can install. 773 00:38:50,810 --> 00:38:52,470 But you do pay a price. 774 00:38:52,470 --> 00:38:54,920 So as beautiful as this situation is, and as clever 775 00:38:54,920 --> 00:38:59,180 as it is that we're leveraging these excess resources by slicing up 776 00:38:59,180 --> 00:39:04,190 one server into the illusion of, in this case, three, or more generally more, 777 00:39:04,190 --> 00:39:05,780 there is some overhead. 778 00:39:05,780 --> 00:39:10,670 Because this hypervisor has to be a middleman between your guest operating 779 00:39:10,670 --> 00:39:13,562 systems and your host operating system, the one actually 780 00:39:13,562 --> 00:39:15,020 physically installed on the server. 781 00:39:15,020 --> 00:39:17,600 And any layers of indirection like this, so to speak, 782 00:39:17,600 --> 00:39:19,380 have got to cost you some amount of time. 783 00:39:19,380 --> 00:39:21,500 If there's some work being done here and you only 784 00:39:21,500 --> 00:39:23,780 have a finite number of resources, the hypervisor 785 00:39:23,780 --> 00:39:27,050 itself is surely consuming some of your resources. 786 00:39:27,050 --> 00:39:28,970 And gosh, this just seems really inefficient, 787 00:39:28,970 --> 00:39:32,900 especially if all of your customers are using the same operating system. 788 00:39:32,900 --> 00:39:38,300 My god, why do you have to have copies of the same OS multiply installed? 
789 00:39:38,300 --> 00:39:42,920 This just doesn't feel like it's leveraging much economy of scale. 790 00:39:42,920 --> 00:39:46,940 And so it turns out there's a newer technology that's gaining steam, 791 00:39:46,940 --> 00:39:51,260 and this is known not as virtualization, per se, but containerization, 792 00:39:51,260 --> 00:39:54,980 the most popular instance of which is perhaps a company called Docker. 793 00:39:54,980 --> 00:39:57,950 And the world of Docker is a little shorter. 794 00:39:57,950 --> 00:40:00,585 It's a little smarter about how resources are shared. 795 00:40:00,585 --> 00:40:02,960 You still have your infrastructure, your physical server, 796 00:40:02,960 --> 00:40:04,876 and you still have your host operating system, 797 00:40:04,876 --> 00:40:08,100 whether it's Linux or Unix or something like that. 798 00:40:08,100 --> 00:40:11,810 But then instead of a hypervisor, you have the Docker engine, 799 00:40:11,810 --> 00:40:15,680 which is really just an equivalent of that base layer of software. 800 00:40:15,680 --> 00:40:17,930 But notice what's different. 801 00:40:17,930 --> 00:40:22,387 In this case here, we've collapsed the previous picture. 802 00:40:22,387 --> 00:40:25,220 In fact, thanks to our friends at Docker who put this together here, 803 00:40:25,220 --> 00:40:27,800 the guest OS has disappeared. 804 00:40:27,800 --> 00:40:29,940 And you instead have your different applications 805 00:40:29,940 --> 00:40:31,730 and your different binaries and libraries, 806 00:40:31,730 --> 00:40:34,820 as this abbreviation means, all running on the Docker engine. 807 00:40:34,820 --> 00:40:36,180 Now, what does this mean? 
808 00:40:36,180 --> 00:40:38,660 This means when running Docker, you typically 809 00:40:38,660 --> 00:40:40,370 choose your operating system-- 810 00:40:40,370 --> 00:40:44,920 for instance, Ubuntu Linux or Debian Linux or something else altogether-- 811 00:40:44,920 --> 00:40:49,400 and then you essentially share that one operating system 812 00:40:49,400 --> 00:40:52,610 across multiple containers. 813 00:40:52,610 --> 00:40:55,100 Instead of virtual machines, we now have containers. 814 00:40:55,100 --> 00:40:58,460 So in other words, you ensure that your different slices all 815 00:40:58,460 --> 00:41:01,850 share some common software-- the kernel, so to speak, 816 00:41:01,850 --> 00:41:05,060 the base core of the operating system. 817 00:41:05,060 --> 00:41:09,080 But then you uniquely layer on top of that base system, 818 00:41:09,080 --> 00:41:13,520 that base set of default files, whatever customizations your customers or you 819 00:41:13,520 --> 00:41:16,880 yourself want, but you share some of the resources. 820 00:41:16,880 --> 00:41:19,700 And long story short, what this means is that containers 821 00:41:19,700 --> 00:41:21,740 tend to be a little lighter weight. 822 00:41:21,740 --> 00:41:25,490 There's less waste of resources because there's less overhead of running them, 823 00:41:25,490 --> 00:41:29,720 which is to say that you can generally start them even more quickly. 824 00:41:29,720 --> 00:41:34,070 And better yet, you can still isolate your different products 825 00:41:34,070 --> 00:41:37,190 and your different services-- database and web server and email 826 00:41:37,190 --> 00:41:38,900 server and any number of other features-- 827 00:41:38,900 --> 00:41:43,340 all within the illusion of their own installation, their own operating 828 00:41:43,340 --> 00:41:47,030 system, even though there are some shared resources here. 
829 00:41:47,030 --> 00:41:51,950 So this, too, has been made possible by the capabilities of modern hardware 830 00:41:51,950 --> 00:41:54,590 and the cleverness, frankly, of humans in actually 831 00:41:54,590 --> 00:42:01,470 finding solutions or creative uses for those available resources. 832 00:42:01,470 --> 00:42:05,190 But what other features or topics come into play 833 00:42:05,190 --> 00:42:07,400 in this world of cloud computing? 834 00:42:07,400 --> 00:42:11,051 We've talked about availability and caching and costing, really 835 00:42:11,051 --> 00:42:13,050 figuring out where we're going to actually spend 836 00:42:13,050 --> 00:42:17,430 our money by throwing hardware at problems and scaling more generally. 837 00:42:17,430 --> 00:42:19,830 But there's also issues of replication, which 838 00:42:19,830 --> 00:42:22,890 actually do relate to high availability, so to speak. 839 00:42:22,890 --> 00:42:24,900 But replication refers to duplication of data, 840 00:42:24,900 --> 00:42:27,360 and really backups more generally as a topic. 841 00:42:27,360 --> 00:42:30,570 And then there's also some other funky acronyms that are very much in vogue 842 00:42:30,570 --> 00:42:31,230 these days. 843 00:42:31,230 --> 00:42:33,390 Besides Infrastructure as a Service, there's 844 00:42:33,390 --> 00:42:40,280 also Platform as a Service, PaaS, or Software as a Service, SaaS. 845 00:42:40,280 --> 00:42:43,920 Now, SaaS, even if you've not used it under this name, 846 00:42:43,920 --> 00:42:45,360 odds are you have been using it. 847 00:42:45,360 --> 00:42:50,370 If you do use Gmail or Outlook.com or any web-based email service, 848 00:42:50,370 --> 00:42:52,740 you are using software as a service. 
849 00:42:52,740 --> 00:42:55,572 You don't really know, or need to care, where in the world 850 00:42:55,572 --> 00:42:58,530 your emails physically live, or how many servers they're spread across, 851 00:42:58,530 --> 00:43:00,821 or how your data is backed up, or for that matter, when 852 00:43:00,821 --> 00:43:03,960 you click Send, how the email even gets from point A to point B. 853 00:43:03,960 --> 00:43:07,710 You are treating Gmail and Outlook as a software 854 00:43:07,710 --> 00:43:12,690 as a service with all of the underlying implementation details abstracted away. 855 00:43:12,690 --> 00:43:15,870 You just don't know or care how it's implemented-- well, 856 00:43:15,870 --> 00:43:17,760 at least if everything is working. 857 00:43:17,760 --> 00:43:21,090 You probably do care if something goes down. 858 00:43:21,090 --> 00:43:24,870 But there's this intermediate step between this extreme form 859 00:43:24,870 --> 00:43:29,010 of abstraction where all you see is just the top-level service. 860 00:43:29,010 --> 00:43:32,100 And the lowest level implementation that we've 861 00:43:32,100 --> 00:43:34,110 discussed, which is infrastructure as a service, 862 00:43:34,110 --> 00:43:36,690 whereby when using something like Amazon, 863 00:43:36,690 --> 00:43:39,720 you literally click the button that says give me a load balancer. 864 00:43:39,720 --> 00:43:42,214 You literally click a button that says give me two servers. 865 00:43:42,214 --> 00:43:44,130 You literally click a button that says give me 866 00:43:44,130 --> 00:43:46,560 a firewall or any number of other features. 867 00:43:46,560 --> 00:43:49,560 So Amazon and Microsoft and Google, to some extent, 868 00:43:49,560 --> 00:43:52,590 have all implemented these low-level services 869 00:43:52,590 --> 00:43:55,050 that still require that you understand the technology, 870 00:43:55,050 --> 00:43:59,250 and you understand networking, and you understand scaling and availability. 
871 00:43:59,250 --> 00:44:03,930 But you can so much more easily and inexpensively and efficiently-- 872 00:44:03,930 --> 00:44:08,960 literally with just a laptop or desktop, without any data center of your own-- 873 00:44:08,960 --> 00:44:12,450 stitch together the topology or the architecture that you actually 874 00:44:12,450 --> 00:44:14,640 want, albeit in the cloud. 875 00:44:14,640 --> 00:44:17,820 Platform as a service, though, has arisen as a middle ground 876 00:44:17,820 --> 00:44:20,557 here, whereby you might have services like Heroku, 877 00:44:20,557 --> 00:44:22,890 which you might have heard of, which themselves actually 878 00:44:22,890 --> 00:44:28,950 run on infrastructures like Amazon or Google or Microsoft or the like. 879 00:44:28,950 --> 00:44:32,100 But they provide themselves a layer of abstraction 880 00:44:32,100 --> 00:44:34,650 that isn't quite as high level, so to speak, as what 881 00:44:34,650 --> 00:44:36,510 you get from software as a service. 882 00:44:36,510 --> 00:44:40,500 In fact, these platforms as a service don't provide you with applications. 883 00:44:40,500 --> 00:44:44,350 They just make it easier for you to run your applications in the cloud. 884 00:44:44,350 --> 00:44:45,550 Now, what does that mean? 885 00:44:45,550 --> 00:44:49,740 Well, it's all fun and exciting to understand load balancing 886 00:44:49,740 --> 00:44:52,020 and understand networking and understand the need 887 00:44:52,020 --> 00:44:56,580 for multiple servers and the entire conversation that we've had thus far. 888 00:44:56,580 --> 00:44:59,390 But at the end of the day, if I'm a software developer 889 00:44:59,390 --> 00:45:01,380 or I'm trying to build a business, all I care 890 00:45:01,380 --> 00:45:05,940 about is making my internet application available to real users.
891 00:45:05,940 --> 00:45:09,390 I really don't care about how many servers I have, 892 00:45:09,390 --> 00:45:12,802 how many databases I have, how the load balancers talk to one another. 893 00:45:12,802 --> 00:45:14,760 That's all fine and intellectually interesting. 894 00:45:14,760 --> 00:45:17,230 But I just want to get real work done. 895 00:45:17,230 --> 00:45:19,140 So I'm willing to pay a bit more for this. 896 00:45:19,140 --> 00:45:22,590 I'm willing to pay some middleman, like a Heroku, or any number 897 00:45:22,590 --> 00:45:24,870 of other services, a platform as a service, 898 00:45:24,870 --> 00:45:27,450 to abstract away those kinds of details. 899 00:45:27,450 --> 00:45:30,180 So I have the wherewithal, and I have the willingness 900 00:45:30,180 --> 00:45:33,740 to actually say host this as a web server. 901 00:45:33,740 --> 00:45:34,890 So give me a web server. 902 00:45:34,890 --> 00:45:37,920 I will pay you some number of dollars per month to give me a web server. 903 00:45:37,920 --> 00:45:41,310 But I want you, Heroku, to deal with the auto scaling of it. 904 00:45:41,310 --> 00:45:43,330 I don't care how many servers it is. 905 00:45:43,330 --> 00:45:44,910 I don't care how they are connected. 906 00:45:44,910 --> 00:45:46,784 I don't care anything about these heartbeats. 907 00:45:46,784 --> 00:45:49,880 I just want to have the illusion, for my own sake, 908 00:45:49,880 --> 00:45:53,880 of just one server that somehow grows or shrinks 909 00:45:53,880 --> 00:45:56,400 dynamically to handle my customer base. 910 00:45:56,400 --> 00:45:59,381 Meanwhile, things like load balancing, I just want my customers 911 00:45:59,381 --> 00:46:00,630 to be able to reach my server. 912 00:46:00,630 --> 00:46:02,046 I don't care how it's implemented. 913 00:46:02,046 --> 00:46:04,470 I don't care how it's made to be highly available. 914 00:46:04,470 --> 00:46:06,120 I just want that to work.
915 00:46:06,120 --> 00:46:10,050 And so companies like Heroku provide these platforms 916 00:46:10,050 --> 00:46:12,924 as a service that just make your life a little bit easier. 917 00:46:12,924 --> 00:46:15,840 And you don't have to think about or know about or worry about as many 918 00:46:15,840 --> 00:46:16,560 of these details. 919 00:46:16,560 --> 00:46:18,140 Now, to be fair, if something breaks, you 920 00:46:18,140 --> 00:46:20,140 might not understand exactly what's going wrong, 921 00:46:20,140 --> 00:46:22,200 and you yourself might not be able to solve it. 922 00:46:22,200 --> 00:46:27,810 Indeed, you might be entirely at the mercy of the cloud provider, or the PaaS 923 00:46:27,810 --> 00:46:30,150 provider, to solve the problem for you. 924 00:46:30,150 --> 00:46:32,280 But you're saving time. 925 00:46:32,280 --> 00:46:34,132 You're saving energy elsewhere by not having 926 00:46:34,132 --> 00:46:36,840 to worry about those lower-level implementation details, at least 927 00:46:36,840 --> 00:46:37,631 in the common case. 928 00:46:37,631 --> 00:46:42,030 But odds are you're paying a little more to Heroku than you would to an Amazon 929 00:46:42,030 --> 00:46:46,480 directly because they're providing you with this value-added service. 930 00:46:46,480 --> 00:46:48,900 So as cryptic as these acronyms really mean, 931 00:46:48,900 --> 00:46:51,210 they're really just referring to disparate levels 932 00:46:51,210 --> 00:46:54,300 of abstraction, all of which somehow relate to the cloud. 933 00:46:54,300 --> 00:46:56,790 But infrastructure as a service is a virtualization 934 00:46:56,790 --> 00:46:59,880 of these hardware ideas, the physical cabling 935 00:46:59,880 --> 00:47:01,770 that we drew here on the screen. 936 00:47:01,770 --> 00:47:04,080 Software as a service really is just that application 937 00:47:04,080 --> 00:47:05,288 that the user interacts with.
938 00:47:05,288 --> 00:47:09,090 And platform as a service is an intermediate step, 939 00:47:09,090 --> 00:47:12,180 whereby you, in building your software in the cloud, 940 00:47:12,180 --> 00:47:16,710 can worry a little bit about how to actually make it available to users. 941 00:47:16,710 --> 00:47:19,320 But let's consider one other challenge now-- 942 00:47:19,320 --> 00:47:23,310 that of database replication since, of course, thus far, 943 00:47:23,310 --> 00:47:26,340 we've been talking about a web server as though it's the entire picture. 944 00:47:26,340 --> 00:47:28,560 But the reality is most any business that 945 00:47:28,560 --> 00:47:31,002 has a web-based presence or a mobile presence 946 00:47:31,002 --> 00:47:32,460 is going to be storing information. 947 00:47:32,460 --> 00:47:35,249 When users register, when users check something out, 948 00:47:35,249 --> 00:47:38,040 add something to their shopping cart, so to speak, all of that data 949 00:47:38,040 --> 00:47:40,560 needs to somehow be stored. 950 00:47:40,560 --> 00:47:44,830 So let's consider now what the world really likely looks like. 951 00:47:44,830 --> 00:47:47,070 So here is my laptop again. 952 00:47:47,070 --> 00:47:51,480 And here is the cloud that's between me and some service 953 00:47:51,480 --> 00:47:52,860 that I'm interested in. 954 00:47:52,860 --> 00:47:55,695 We'll assume for now that there is some kind of load balancing. 955 00:47:55,695 --> 00:47:57,570 And I'm just going to draw it a little bigger 956 00:47:57,570 --> 00:48:00,957 this time to suggest that-- let's just think of it now as a black box. 957 00:48:00,957 --> 00:48:02,040 And maybe it's one server. 958 00:48:02,040 --> 00:48:02,970 Maybe it's two. 959 00:48:02,970 --> 00:48:03,930 Maybe it's more. 960 00:48:03,930 --> 00:48:07,200 But somehow or other, load balancing is implemented. 
961 00:48:07,200 --> 00:48:09,990 Then I'm going to have all of my servers here, 962 00:48:09,990 --> 00:48:14,790 which we'll abstract away as maybe three or more at this point-- one, two, 963 00:48:14,790 --> 00:48:16,770 and then we'll call this n. 964 00:48:16,770 --> 00:48:20,200 But a web server typically does not do everything these days. 965 00:48:20,200 --> 00:48:22,530 In fact, it's been trending for some time 966 00:48:22,530 --> 00:48:25,440 to actually have different servers or different virtual machines, 967 00:48:25,440 --> 00:48:27,960 or even more recently, different containers. 968 00:48:27,960 --> 00:48:30,180 Each provide individual services. 969 00:48:30,180 --> 00:48:32,440 Sometimes people call these micro services 970 00:48:32,440 --> 00:48:36,030 if a container only does one, and one very narrowly defined thing, 971 00:48:36,030 --> 00:48:39,540 like send emails, or save information to a database, 972 00:48:39,540 --> 00:48:42,610 or respond to HTTP requests. 973 00:48:42,610 --> 00:48:46,830 So these back end web servers are not the only types of servers we have. 974 00:48:46,830 --> 00:48:49,780 Odds are we at least have one database. 975 00:48:49,780 --> 00:48:53,160 So let's consider now the implication of all 976 00:48:53,160 --> 00:48:56,400 of these architectural decisions we've made thus far 977 00:48:56,400 --> 00:48:59,400 on how we actually store our data. 978 00:48:59,400 --> 00:49:03,780 So in simplest form, our database might look like this. 979 00:49:03,780 --> 00:49:06,510 And for historical reasons, it's generally drawn as a cylinder. 980 00:49:06,510 --> 00:49:08,580 And this is our database. 
981 00:49:08,580 --> 00:49:12,210 Now, it's immediately obvious that if all servers-- 982 00:49:12,210 --> 00:49:16,620 1, 2, dot, dot, dot, n-- need to save information or read information 983 00:49:16,620 --> 00:49:20,800 from a database, they've all got to somehow communicate with that database 984 00:49:20,800 --> 00:49:25,300 so they all have some kind of connectivity, physically or otherwise. 985 00:49:25,300 --> 00:49:29,850 So this seems fine so long as the software that's running on servers 1, 986 00:49:29,850 --> 00:49:32,680 2, dot, dot, dot, and no matter what language we're using, 987 00:49:32,680 --> 00:49:36,330 whether it's Java or Python or PHP or C# or something else-- 988 00:49:36,330 --> 00:49:40,740 so long as those servers can talk to, via the network, this database, 989 00:49:40,740 --> 00:49:41,880 that's great. 990 00:49:41,880 --> 00:49:44,062 They can all save their data to the same place, 991 00:49:44,062 --> 00:49:46,270 and they can all read their data from the same place. 992 00:49:46,270 --> 00:49:48,360 So everything stays nicely in sync. 993 00:49:48,360 --> 00:49:51,030 But what's the first problem that motivated the entirety 994 00:49:51,030 --> 00:49:54,680 of this discussion from the outset? 995 00:49:54,680 --> 00:49:59,440 Well, what if one database isn't really enough? 996 00:49:59,440 --> 00:50:03,310 Well, we could take the approach of vertically scaling 997 00:50:03,310 --> 00:50:07,210 our architecture, which is another piece of jargon in this space. 998 00:50:07,210 --> 00:50:14,530 So vertical scaling means if your one database isn't quite up to snuff, 999 00:50:14,530 --> 00:50:18,670 and you're running low on disk space or capacity because of numbers 1000 00:50:18,670 --> 00:50:22,360 of requests per second are, of course, limited, you know what you can do? 
1001 00:50:22,360 --> 00:50:30,310 You can go ahead and disconnect this one and go ahead and put in a bigger one, 1002 00:50:30,310 --> 00:50:33,130 and therefore increase your capacity. 1003 00:50:33,130 --> 00:50:36,640 And vertical scaling means to really pay more money 1004 00:50:36,640 --> 00:50:39,640 or get something higher end, a higher, more premium 1005 00:50:39,640 --> 00:50:42,730 model, a more expensive model that's got more disk space and more RAM 1006 00:50:42,730 --> 00:50:44,750 and a faster CPU or more CPUs. 1007 00:50:44,750 --> 00:50:46,720 So you just throw hardware at the problem-- 1008 00:50:46,720 --> 00:50:50,680 not in the sense of multiple servers, but just one bigger and better server. 1009 00:50:50,680 --> 00:50:52,117 But what are the challenges here? 1010 00:50:52,117 --> 00:50:53,950 Well, if you've ever bought a home computer, 1011 00:50:53,950 --> 00:50:57,220 odds are whether it's been on Dell's site or Microsoft's or Apple's 1012 00:50:57,220 --> 00:51:00,310 or the like, you often have this good, better, best thing 1013 00:51:00,310 --> 00:51:04,090 where, for the top of the line laptop or desktop, 1014 00:51:04,090 --> 00:51:06,640 you're going to be paying through the roof-- 1015 00:51:06,640 --> 00:51:08,620 through the nose, so to speak. 1016 00:51:08,620 --> 00:51:11,470 You're going to be paying a premium for that top of the line model. 1017 00:51:11,470 --> 00:51:14,178 But you might actually be able to save a decent number of dollars 1018 00:51:14,178 --> 00:51:17,470 by going for the second best or the third best, 1019 00:51:17,470 --> 00:51:21,340 because the marginal gains of each additional dollar 1020 00:51:21,340 --> 00:51:22,695 really aren't all that much. 1021 00:51:22,695 --> 00:51:24,820 Because for marketing reasons, they know that there 1022 00:51:24,820 --> 00:51:26,778 might be some people out there that will always 1023 00:51:26,778 --> 00:51:28,330 pay top dollar for the fastest one. 
1024 00:51:28,330 --> 00:51:30,163 But just because you're paying twice as much 1025 00:51:30,163 --> 00:51:33,650 doesn't mean the laptop is going to be twice as good, for instance. 1026 00:51:33,650 --> 00:51:37,120 So this is to say to vertically scale your database, you might end up 1027 00:51:37,120 --> 00:51:40,810 paying, through the nose, for some very expensive hardware just 1028 00:51:40,810 --> 00:51:43,820 to eke out some more performance. 1029 00:51:43,820 --> 00:51:45,610 But that's not even the biggest problem. 1030 00:51:45,610 --> 00:51:48,350 The most fundamental problem is at the end of the day, 1031 00:51:48,350 --> 00:51:53,050 there is a top-of-the-line server for your database that only can support 1032 00:51:53,050 --> 00:51:56,020 a finite number of database connections at a time, 1033 00:51:56,020 --> 00:51:58,480 or a finite number of reads or writes, so to speak, 1034 00:51:58,480 --> 00:52:00,357 saving and reading from the database. 1035 00:52:00,357 --> 00:52:03,190 So at some point or other, it doesn't matter how much money you have 1036 00:52:03,190 --> 00:52:05,530 or how willing you are to throw hardware at the problem. 1037 00:52:05,530 --> 00:52:10,070 There exists no server that can handle more users than you currently have. 1038 00:52:10,070 --> 00:52:14,320 So at some point, you actually have to put away your wallet 1039 00:52:14,320 --> 00:52:17,680 and put back on the engineering hat alone and figure out 1040 00:52:17,680 --> 00:52:24,220 how to not vertically scale, but horizontally scale your architecture. 1041 00:52:24,220 --> 00:52:29,860 And by this, I mean actually introducing not just one big, fancy server, 1042 00:52:29,860 --> 00:52:33,112 but two or more maybe smaller, cheaper servers.
1043 00:52:33,112 --> 00:52:35,320 In fact, one of the things that companies like Google 1044 00:52:35,320 --> 00:52:37,810 were especially good at early on was using 1045 00:52:37,810 --> 00:52:42,910 off-the-shelf, inexpensive hardware and building supercomputers out of them, 1046 00:52:42,910 --> 00:52:44,770 but much more economically than they might 1047 00:52:44,770 --> 00:52:46,450 have had they gone top of the line everywhere, 1048 00:52:46,450 --> 00:52:48,200 even though that would mean fewer servers. 1049 00:52:48,200 --> 00:52:50,696 Better to get more cheaper servers and somehow 1050 00:52:50,696 --> 00:52:53,320 figure out how to interconnect them and write the software that 1051 00:52:53,320 --> 00:52:56,410 lets them all be useful simultaneously so that we can instead 1052 00:52:56,410 --> 00:52:59,620 have a picture that looks a bit more like this, with maybe 1053 00:52:59,620 --> 00:53:03,310 a pair of databases in the picture now. 1054 00:53:03,310 --> 00:53:05,650 Of course, we've now created that same problem 1055 00:53:05,650 --> 00:53:09,130 that we had earlier about where does the data go. 1056 00:53:09,130 --> 00:53:10,990 Where does the traffic or the users flow, 1057 00:53:10,990 --> 00:53:14,960 especially now where we have one on the left and one on the right? 1058 00:53:14,960 --> 00:53:18,460 So there's a couple of solutions here, but there are some different problems 1059 00:53:18,460 --> 00:53:19,900 that arise with databases. 1060 00:53:19,900 --> 00:53:27,600 If we very simply put a load balancer in here, LB, and route traffic uniformly-- 1061 00:53:27,600 --> 00:53:30,010 say, to the left or to the right-- 1062 00:53:30,010 --> 00:53:32,180 that's probably not the best thing. 
1063 00:53:32,180 --> 00:53:34,960 Because then you're going to end up with a world where you're 1064 00:53:34,960 --> 00:53:39,460 saving some data for a user here and some data for a user 1065 00:53:39,460 --> 00:53:42,757 here just by chance, because you're using round robin, so to speak, 1066 00:53:42,757 --> 00:53:45,340 or just some probabilistic heuristic where some of the traffic 1067 00:53:45,340 --> 00:53:47,140 goes this way, some of the traffic goes that way. 1068 00:53:47,140 --> 00:53:48,160 And that's not so good. 1069 00:53:48,160 --> 00:53:48,670 OK. 1070 00:53:48,670 --> 00:53:54,820 But we could solve that by somehow making sure that if this user, User A, 1071 00:53:54,820 --> 00:54:00,400 visits my web site, I should always send him or her to the same database. 1072 00:54:00,400 --> 00:54:02,440 And you can do this in a couple of ways. 1073 00:54:02,440 --> 00:54:04,510 You can enforce some notion of stickiness, 1074 00:54:04,510 --> 00:54:07,180 so to speak, whereby you somehow notice that, oh, this is 1075 00:54:07,180 --> 00:54:09,010 User A. We've seen him or her before. 1076 00:54:09,010 --> 00:54:12,130 Let's make sure we send him to this database on the left 1077 00:54:12,130 --> 00:54:14,020 and not the one on the right. 1078 00:54:14,020 --> 00:54:18,070 Or you can more formally use a process known as sharding. 1079 00:54:18,070 --> 00:54:20,380 In fact, this is very common early on in databases, 1080 00:54:20,380 --> 00:54:24,010 and even in websites like Facebook, where you have so many users 1081 00:54:24,010 --> 00:54:26,830 that you need to start splitting them across multiple databases. 1082 00:54:26,830 --> 00:54:28,360 But gosh, how to do that? 1083 00:54:28,360 --> 00:54:31,300 Back in the earliest days of Facebook, what they might have done 1084 00:54:31,300 --> 00:54:35,275 was put all Harvard users on one database, all MIT users on another, 1085 00:54:35,275 --> 00:54:37,600 all BU users on another, and so forth. 
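
The difference between round robin and stickiness can be sketched in a few lines of Python. This is only an illustration: the database names and both router classes are hypothetical, and a real load balancer would typically remember users via cookies or session state rather than an in-memory dictionary.

```python
from itertools import cycle

DATABASES = ["db_left", "db_right"]  # hypothetical names for the two databases


class RoundRobinRouter:
    """Sends each request to the next database in turn, whoever the user is."""

    def __init__(self, databases):
        self._pool = cycle(databases)

    def route(self, user_id):
        return next(self._pool)


class StickyRouter:
    """Remembers where a user was first sent and keeps sending them there."""

    def __init__(self, databases):
        self._pool = cycle(databases)
        self._assigned = {}  # user_id -> database chosen on the first visit

    def route(self, user_id):
        if user_id not in self._assigned:
            self._assigned[user_id] = next(self._pool)
        return self._assigned[user_id]


if __name__ == "__main__":
    round_robin = RoundRobinRouter(DATABASES)
    sticky = StickyRouter(DATABASES)
    print([round_robin.route("user_a") for _ in range(4)])  # data gets scattered
    print([sticky.route("user_a") for _ in range(4)])       # data stays together
```

With round robin, User A's data ends up split across both databases; with stickiness, every request from User A lands on the same one.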
1086 00:54:37,600 --> 00:54:40,210 Because Facebook, as you may recall, started scaling out 1087 00:54:40,210 --> 00:54:41,620 initially to disparate schools. 1088 00:54:41,620 --> 00:54:44,410 That was a wonderful opportunity to shard 1089 00:54:44,410 --> 00:54:49,930 their data by putting similar users in their respective databases. 1090 00:54:49,930 --> 00:54:51,700 And at the time, I think you couldn't even 1091 00:54:51,700 --> 00:54:54,240 be friends with people in other schools, at least very early 1092 00:54:54,240 --> 00:54:58,050 on, because those databases, presumably, were independent, 1093 00:54:58,050 --> 00:55:01,440 or certainly could have been topologically. 1094 00:55:01,440 --> 00:55:04,530 Or you might do something more simple that doesn't create 1095 00:55:04,530 --> 00:55:06,420 some problems like isolation there. 1096 00:55:06,420 --> 00:55:10,320 Maybe all of your users whose last names start with A go on one server, 1097 00:55:10,320 --> 00:55:12,400 and all of your users whose names start with B 1098 00:55:12,400 --> 00:55:14,170 go on another server, and so forth. 1099 00:55:14,170 --> 00:55:18,330 So you can almost hash your users, to borrow a terminology from hash tables, 1100 00:55:18,330 --> 00:55:20,970 and decide where to put that data. 1101 00:55:20,970 --> 00:55:24,690 Of course, that does not help with backups or redundancy. 1102 00:55:24,690 --> 00:55:28,470 Because if you're putting all of your A names here and all of your B names 1103 00:55:28,470 --> 00:55:31,230 here, what happens, god forbid, if one of the servers goes down? 1104 00:55:31,230 --> 00:55:33,600 You've lost half of your customers. 1105 00:55:33,600 --> 00:55:36,390 So it would seem that no matter how you balance the load, 1106 00:55:36,390 --> 00:55:39,850 you really want to maintain duplicates of data. 1107 00:55:39,850 --> 00:55:42,870 And so there's a few different ways people solve this.
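
As a rough sketch of what "hashing your users" onto databases might look like, here are two hypothetical shard functions in Python: one keyed on the first letter of a last name, as described above, and one keyed on a hash of a user ID. The function names and the choice of CRC-32 are assumptions made for illustration, not a description of how Facebook or anyone else actually did it.

```python
import zlib


def shard_by_initial(last_name: str, num_shards: int) -> int:
    """Pick a shard from the first letter of the last name (A -> 0, B -> 1, ...).

    Names that do not start with a letter fall back to shard 0.
    """
    first = last_name.strip().upper()[:1]
    if not first.isalpha():
        return 0
    return (ord(first) - ord("A")) % num_shards


def shard_by_hash(user_id: str, num_shards: int) -> int:
    """Pick a shard from a hash of the user ID.

    zlib.crc32 is used because it gives the same answer on every run,
    unlike Python's built-in hash() for strings.
    """
    return zlib.crc32(user_id.encode("utf-8")) % num_shards


if __name__ == "__main__":
    for name in ("Adams", "Baker", "Avery"):
        print(name, "-> shard", shard_by_initial(name, num_shards=2))
    print("user-42 -> shard", shard_by_hash("user-42", num_shards=4))
```

Either way, the same user always maps to the same database, but nothing here duplicates the data, so losing one shard still means losing every user on it.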
1108 00:55:42,870 --> 00:55:45,510 In fact, let me go ahead and temporarily go 1109 00:55:45,510 --> 00:55:50,940 back to that first model, where we had a really fancy, bigger database 1110 00:55:50,940 --> 00:55:53,670 that I'll deliberately draw as pretty big. 1111 00:55:53,670 --> 00:55:57,450 And this is big in the sense that it can respond to requests quickly 1112 00:55:57,450 --> 00:55:59,280 and it can store a lot of data. 1113 00:55:59,280 --> 00:56:03,630 This might be generally called our primary or our master database. 1114 00:56:03,630 --> 00:56:06,420 And it's where our data goes to live long term. 1115 00:56:06,420 --> 00:56:09,900 It's where data is written to, so to speak, and could also be read from. 1116 00:56:09,900 --> 00:56:13,380 But if we're going to bump up against some limit of how much work 1117 00:56:13,380 --> 00:56:15,450 this database can do at once, it would be 1118 00:56:15,450 --> 00:56:18,960 nice to have some secondary servers or tertiary servers. 1119 00:56:18,960 --> 00:56:24,240 So a very common paradigm would be to use this primary database for writes-- 1120 00:56:24,240 --> 00:56:25,860 we'll abbreviate it w-- 1121 00:56:25,860 --> 00:56:29,790 and then also have maybe a couple of smaller databases, or even 1122 00:56:29,790 --> 00:56:35,610 the same size databases, that are meant for reads, abbreviated R. 1123 00:56:35,610 --> 00:56:39,600 And so long as these databases are somehow talking to one another, 1124 00:56:39,600 --> 00:56:41,400 this topology will just work. 1125 00:56:41,400 --> 00:56:43,410 This is a feature known as replication. 1126 00:56:43,410 --> 00:56:46,650 So long as the databases are configured in such a way 1127 00:56:46,650 --> 00:56:50,310 that any time data is written to the primary database or the master 1128 00:56:50,310 --> 00:56:55,440 database, that data gets replicated to any replicas, as they're called. 
1129 00:56:55,440 --> 00:57:02,539 Meanwhile, servers 1, 2, and n should also be able to talk to these replicas. 1130 00:57:02,539 --> 00:57:05,580 And if your code is smart enough-- and you would have to think about this 1131 00:57:05,580 --> 00:57:10,260 and design this into your codebase-- you could ensure that any time you 1132 00:57:10,260 --> 00:57:15,270 read data from a database, it comes from one, or really any, of your replicas, 1133 00:57:15,270 --> 00:57:18,240 replicas in the sense that they are meant to have duplicate data. 1134 00:57:18,240 --> 00:57:22,170 But anytime you write data-- a SQL INSERT or UPDATE or DELETE, 1135 00:57:22,170 --> 00:57:24,150 as opposed to a SQL SELECT-- 1136 00:57:24,150 --> 00:57:28,710 you only send your write operations to the primary or master database 1137 00:57:28,710 --> 00:57:32,026 and leave it to it to then replicate it to the read replicas. 1138 00:57:32,026 --> 00:57:33,900 Now, of course, there are some problems here. 1139 00:57:33,900 --> 00:57:35,070 There's some latency, potentially. 1140 00:57:35,070 --> 00:57:36,320 Maybe it takes a split second. 1141 00:57:36,320 --> 00:57:39,190 Maybe it takes a couple seconds for that data to replicate. 1142 00:57:39,190 --> 00:57:43,440 So things might not appear to be updated instantaneously. 1143 00:57:43,440 --> 00:57:48,900 But you have now a very scalable model in that if you have the money to spend, 1144 00:57:48,900 --> 00:57:54,000 you can even have more read replicas and have even more and more read capacity. 1145 00:57:54,000 --> 00:57:58,080 Of course, you're going to eventually bump up against a limit on your rights, 1146 00:57:58,080 --> 00:58:00,840 at which point we need to introduce another solution. 1147 00:58:00,840 --> 00:58:03,690 But again, this is a very incremental approach. 
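
Here is a minimal in-memory sketch of that read/write split, written in Python. Everything in it is hypothetical: the `Database` and `Cluster` classes stand in for real database servers, and the copy to the replicas happens immediately, whereas real replication is done by the database software itself and usually lags a little behind, as just mentioned.

```python
from itertools import cycle

WRITE_VERBS = {"INSERT", "UPDATE", "DELETE"}


class Database:
    """A stand-in for a real database server: just a name and a dict of rows."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class Cluster:
    """One writable primary plus read replicas (assumes at least one replica)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self._next_replica = cycle(self.replicas)

    def route(self, sql):
        """Writes go to the primary; everything else goes to some replica."""
        words = sql.split()
        if words and words[0].upper() in WRITE_VERBS:
            return self.primary
        return next(self._next_replica)

    def write(self, key, value):
        self.primary.rows[key] = value
        # Simplification: copy to every replica right away, with no delay.
        for replica in self.replicas:
            replica.rows[key] = value

    def read(self, key):
        return next(self._next_replica).rows.get(key)


if __name__ == "__main__":
    cluster = Cluster(Database("primary"), [Database("replica_1"), Database("replica_2")])
    print(cluster.route("INSERT INTO users (name) VALUES ('alice')").name)
    print(cluster.route("SELECT * FROM users").name)
    cluster.write("user:1", "alice")
    print(cluster.read("user:1"))
```

Adding read capacity in this picture just means adding another entry to the list of replicas; the single primary is still the bottleneck for writes.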
1148 00:58:03,690 --> 00:58:06,890 And we can throw a little bit of money at the problem each time 1149 00:58:06,890 --> 00:58:09,090 and a little bit of engineering wherewithal 1150 00:58:09,090 --> 00:58:11,491 in order to at least get us over that next ledge, which 1151 00:58:11,491 --> 00:58:14,490 is super important, certainly, when you're first building your business. 1152 00:58:14,490 --> 00:58:17,582 If you don't necessarily have the resources to go all in on things, 1153 00:58:17,582 --> 00:58:19,290 you at least want to get over this hurdle 1154 00:58:19,290 --> 00:58:24,340 or at least build in some capacity for the next load of users. 1155 00:58:24,340 --> 00:58:27,300 So what if we run out of capacity, though, 1156 00:58:27,300 --> 00:58:31,090 with that writable server, the master database, so to speak? 1157 00:58:31,090 --> 00:58:33,270 We need to be a little more clever. 1158 00:58:33,270 --> 00:58:37,860 And it turns out we can borrow this idea of these horizontal arrows 1159 00:58:37,860 --> 00:58:43,110 here to replicate our data, but for a slightly different purpose. 1160 00:58:43,110 --> 00:58:47,400 We could still have a pretty souped up writable database. 1161 00:58:47,400 --> 00:58:51,750 But we could have another one, maybe identical in its specs, writable. 1162 00:58:51,750 --> 00:58:54,960 But somehow, these things need to be able to synchronize with each other. 1163 00:58:54,960 --> 00:58:57,920 And maybe there's still some read replicas over here-- 1164 00:58:57,920 --> 00:59:01,200 R for read, and another one over here, R for read. 1165 00:59:01,200 --> 00:59:03,750 And these are all somehow interconnected as well. 1166 00:59:03,750 --> 00:59:07,860 But you can have what's called master-master replication, whereby 1167 00:59:07,860 --> 00:59:12,144 your server's code writes to one of these servers. 1168 00:59:12,144 --> 00:59:13,560 And maybe it's either of them now.
1169 00:59:13,560 --> 00:59:15,840 Maybe the load balancer actually does send some of the writes 1170 00:59:15,840 --> 00:59:17,423 this way, some of the writes this way. 1171 00:59:17,423 --> 00:59:20,190 But the master databases, the writable ones now, 1172 00:59:20,190 --> 00:59:24,534 are configured, in software, to replicate horizontally, so to speak. 1173 00:59:24,534 --> 00:59:26,700 So here too, you might have a little bit of latency. 1174 00:59:26,700 --> 00:59:28,491 It might take a few milliseconds or seconds 1175 00:59:28,491 --> 00:59:30,060 for the data to actually replicate. 1176 00:59:30,060 --> 00:59:34,800 But at least now we've doubled the capacity for our writes 1177 00:59:34,800 --> 00:59:38,560 so as to handle twice as many write operations. 1178 00:59:38,560 --> 00:59:41,790 And we can continue to hang more and more read replicas off of these 1179 00:59:41,790 --> 00:59:46,740 if you want in order to handle more and more users. 1180 00:59:46,740 --> 00:59:50,950 And so this is the challenge and, dare I say, the fun of engineering 1181 00:59:50,950 --> 00:59:54,192 architecturally-- understanding some of these basic building blocks. 1182 00:59:54,192 --> 00:59:56,650 And even if you might not know the particular manufacturers 1183 00:59:56,650 --> 00:59:59,140 or how you physically configure the servers, 1184 00:59:59,140 --> 01:00:02,300 or how in software you configure these servers, at the end of the day, 1185 01:00:02,300 --> 01:00:05,950 these really are just puzzle pieces that can somehow be interlocked. 1186 01:00:05,950 --> 01:00:09,280 And these puzzle pieces can be used to solve more and more interesting 1187 01:00:09,280 --> 01:00:10,090 problems. 1188 01:00:10,090 --> 01:00:15,760 But to our discussion of PaaS and Software as a Service and Infrastructure 1189 01:00:15,760 --> 01:00:19,490 as a Service, there's also these different layers of abstraction.
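The master-master topology described a moment ago can be imitated with a toy Python model. Everything here is an assumption made for illustration: the class Primary, the function balanced_write, and the in-memory dictionaries standing in for databases are hypothetical. The copy to the peer happens immediately, whereas real replication is usually asynchronous, and the sketch ignores conflicts that arise when both servers change the same key at once.

```python
class Primary:
    """Toy writable database (hypothetical) that copies each write to its peers."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []

    def write(self, key, value):
        self.data[key] = value
        # "Horizontal" replication: push the same change to every peer.
        # Done synchronously here only to keep the sketch short.
        for peer in self.peers:
            peer.data[key] = value


def balanced_write(primaries, request_number, key, value):
    """Send writes alternately to each primary, as a load balancer might."""
    target = primaries[request_number % len(primaries)]
    target.write(key, value)
    return target.name


a, b = Primary("A"), Primary("B")
a.peers.append(b)
b.peers.append(a)
```

Calling balanced_write([a, b], 0, "user:1", "alice") lands on A and balanced_write([a, b], 1, "user:2", "bob") lands on B, yet both dictionaries end up holding both keys, which is the property that lets either server accept writes.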
1190 01:00:19,490 --> 01:00:22,810 And so thematic throughout this in all of our discussions 1191 01:00:22,810 --> 01:00:23,850 has been this layering. 1192 01:00:23,850 --> 01:00:27,802 Indeed, we started, really, down here with those zeros and ones and bits, 1193 01:00:27,802 --> 01:00:30,010 and very quickly went to ASCII, and very quickly went 1194 01:00:30,010 --> 01:00:33,010 to colors and images and videos and so forth. 1195 01:00:33,010 --> 01:00:36,042 Because once you understand some of those ingredients or puzzle pieces, 1196 01:00:36,042 --> 01:00:37,750 you can build something more interesting. 1197 01:00:37,750 --> 01:00:39,790 And then you can slap a name on it-- 1198 01:00:39,790 --> 01:00:43,392 sometimes cryptic, like IaaS, or PaaS, or SaaS. 1199 01:00:43,392 --> 01:00:45,100 But at the end of the day, those are just 1200 01:00:45,100 --> 01:00:48,580 labels that describe, really, black boxes, inside of which 1201 01:00:48,580 --> 01:00:52,360 is a decent amount of complexity, a clever amount of engineering, 1202 01:00:52,360 --> 01:00:55,270 but ultimately, a solution to a problem. 1203 01:00:55,270 --> 01:00:58,810 And so in cloud computing, we really do have this catch-all phrase that's 1204 01:00:58,810 --> 01:01:03,430 referring to a whole class of solutions to problems that ultimately are all 1205 01:01:03,430 --> 01:01:07,090 about getting one's business or getting one's personal website 1206 01:01:07,090 --> 01:01:11,140 out on the internet for users to access, whether via laptops or desktops 1207 01:01:11,140 --> 01:01:12,820 or mobile devices and more. 1208 01:01:12,820 --> 01:01:14,710 So at the end of the day, what is the cloud? 1209 01:01:14,710 --> 01:01:16,240 It's this evolving definition. 1210 01:01:16,240 --> 01:01:21,230 It's this evolving class of services that just continues to grow. 1211 01:01:21,230 --> 01:01:23,840 But each of those services is solving a problem.
1212 01:01:23,840 --> 01:01:30,880 Each of those problems derives from plugging one hole in a leaky hose, 1213 01:01:30,880 --> 01:01:33,580 seeing another one spring up, and then addressing that one, 1214 01:01:33,580 --> 01:01:36,430 and then layering on top of those solutions these abstractions, 1215 01:01:36,430 --> 01:01:39,700 and ultimately some marketing speak, like cloud computing itself, 1216 01:01:39,700 --> 01:01:43,120 so that you can build, out of these more sophisticated puzzle pieces, 1217 01:01:43,120 --> 01:01:45,640 bigger and better solutions to actual problems 1218 01:01:45,640 --> 01:01:49,860 you have when you're trying to build your own site. 1219 01:01:49,860 --> 01:01:50,896