1
00:00:00,000 --> 00:00:10,360


2
00:00:10,360 --> 00:00:12,340
Cloud computing-- it's
this term that rather

3
00:00:12,340 --> 00:00:14,330
swept onto the scene in recent years.

4
00:00:14,330 --> 00:00:17,230
And it sounds like it's some
new and trendy technology.

5
00:00:17,230 --> 00:00:19,769
But in reality, it's really
just a very nice packaging

6
00:00:19,769 --> 00:00:22,060
up of a whole number of
technologies that have actually

7
00:00:22,060 --> 00:00:23,620
been with us for some time.

8
00:00:23,620 --> 00:00:26,710
In fact, cloud computing,
in its simplest form,

9
00:00:26,710 --> 00:00:29,800
can really be thought
of as just outsourcing

10
00:00:29,800 --> 00:00:33,250
the hosting of your applications
and really outsourcing

11
00:00:33,250 --> 00:00:36,790
the hosting of your physical servers
to someone else-- put another way,

12
00:00:36,790 --> 00:00:41,140
renting space and renting time
on someone else's computers.

13
00:00:41,140 --> 00:00:45,520
But these days, we just have so much
computational capability-- that is,

14
00:00:45,520 --> 00:00:50,650
our computers are so fast, our CPUs
are so many, and we have so much RAM--

15
00:00:50,650 --> 00:00:53,500
that new and fancier
technologies have lent themselves

16
00:00:53,500 --> 00:00:56,590
to this trend of hosting
all the more software

17
00:00:56,590 --> 00:01:00,520
and putting all the more hardware
off-site in the so-called cloud

18
00:01:00,520 --> 00:01:05,170
so that companies, both big
and small, no longer need

19
00:01:05,170 --> 00:01:09,010
to host their own physical hardware
or even a whole number of roles

20
00:01:09,010 --> 00:01:10,390
in their own local companies.

21
00:01:10,390 --> 00:01:13,120
And so what we'll do now is
dive into cloud computing,

22
00:01:13,120 --> 00:01:15,280
look at some of the
problems it solves, look

23
00:01:15,280 --> 00:01:18,190
at some of the opportunities
it affords, but ultimately,

24
00:01:18,190 --> 00:01:20,890
take a look from the ground up
at what's underneath the hood

25
00:01:20,890 --> 00:01:23,080
here so that by the end
of this, we have a better

26
00:01:23,080 --> 00:01:25,900
understanding of what the
cloud is, why it is useful,

27
00:01:25,900 --> 00:01:28,030
and what it actually is not.

28
00:01:28,030 --> 00:01:31,490
So with that said, let's
start with a simple scenario.

29
00:01:31,490 --> 00:01:34,990
Of course, the cloud
perhaps derives its origins

30
00:01:34,990 --> 00:01:37,630
from how the internet,
for some time, was drawn,

31
00:01:37,630 --> 00:01:40,510
which was just this big, nebulous
cloud, in that it doesn't really

32
00:01:40,510 --> 00:01:41,950
matter what's inside that cloud.

33
00:01:41,950 --> 00:01:46,330
Although at this point, you most surely
appreciate that inside of this cloud

34
00:01:46,330 --> 00:01:49,450
are things like routers, and
running through those routers

35
00:01:49,450 --> 00:01:51,937
are packets, both TCP/IP and the like.

36
00:01:51,937 --> 00:01:53,770
And underneath the hood,
then, of this cloud

37
00:01:53,770 --> 00:01:57,520
is some transport mechanism that
gets data from point A to point B.

38
00:01:57,520 --> 00:02:00,550
So what might those point
A's and point B's be?

39
00:02:00,550 --> 00:02:04,180
Well, if this here is my little,
old laptop, connected somehow

40
00:02:04,180 --> 00:02:07,510
to the internet here,
and maybe down here there

41
00:02:07,510 --> 00:02:11,500
is some web server on which lives
a whole bunch of web pages--

42
00:02:11,500 --> 00:02:12,550
maybe it's my email.

43
00:02:12,550 --> 00:02:13,840
Maybe it's the day's news.

44
00:02:13,840 --> 00:02:16,240
Maybe it's some social
media site or the like.

45
00:02:16,240 --> 00:02:21,190
I, at point A, want to somehow
connect to point B down here.

46
00:02:21,190 --> 00:02:24,790
Now, it turns out it's not all
that hard to get a website up

47
00:02:24,790 --> 00:02:26,230
and running on the internet.

48
00:02:26,230 --> 00:02:28,540
You can, of course, use
any number of languages.

49
00:02:28,540 --> 00:02:30,760
You can use any number of databases.

50
00:02:30,760 --> 00:02:34,270
And you can do it with
relatively little experience,

51
00:02:34,270 --> 00:02:36,320
just getting something on the internet.

52
00:02:36,320 --> 00:02:39,250
In fact, it's not all that
hard, relatively speaking,

53
00:02:39,250 --> 00:02:41,740
to get a prototype of
your application or even

54
00:02:41,740 --> 00:02:44,590
your first version of your
business up and running.

55
00:02:44,590 --> 00:02:50,200
But things start to get hard quickly,
especially if you have some success.

56
00:02:50,200 --> 00:02:53,320
Indeed, a good problem to have is
that you have so many customers and so

57
00:02:53,320 --> 00:02:57,100
many users hitting your
websites that you can't actually

58
00:02:57,100 --> 00:02:58,450
handle all of the load.

59
00:02:58,450 --> 00:03:01,365
Now, it's a good problem in the
sense that business is booming.

60
00:03:01,365 --> 00:03:03,490
But it's, of course, an
actual problem in the sense

61
00:03:03,490 --> 00:03:06,160
that your customers aren't going
to be able to visit your web site

62
00:03:06,160 --> 00:03:08,110
and buy whatever it is
you're selling or read

63
00:03:08,110 --> 00:03:12,520
whatever it is you're posting if your
servers can't actually handle the load.

64
00:03:12,520 --> 00:03:16,780
And by load, I simply mean the number
of users per minute or per unit of time

65
00:03:16,780 --> 00:03:19,390
that your website is
actually experiencing.

66
00:03:19,390 --> 00:03:21,670
And its capacity, meanwhile,
would be the number

67
00:03:21,670 --> 00:03:23,500
of users it can actually support.
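
To make those two terms concrete, here is a minimal sketch in Python. The function names and the numbers are made up for illustration; the only idea taken from the discussion above is that load is the number of users arriving per unit of time, and capacity is the number the site can support in that same unit of time.

```python
def is_over_capacity(load_per_minute: int, capacity_per_minute: int) -> bool:
    """Return True when more users arrive per minute than the site can support."""
    return load_per_minute > capacity_per_minute


def excess_load(load_per_minute: int, capacity_per_minute: int) -> int:
    """Return how many users per minute exceed capacity (0 if there is headroom)."""
    return max(0, load_per_minute - capacity_per_minute)


# Hypothetical figures: a server that supports 1,000 users per minute.
busy = is_over_capacity(1200, 1000)
turned_away = excess_load(1200, 1000)
```

With these made-up figures, the site is over capacity by 200 users per minute, and those are the users who would see an error or a very slow page.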

68
00:03:23,500 --> 00:03:25,760
Now, why are there these
limits in the first place?

69
00:03:25,760 --> 00:03:28,030
Well, you may recall
that inside of a computer

70
00:03:28,030 --> 00:03:30,640
is a CPU, the brains of that computer.

71
00:03:30,640 --> 00:03:33,580
And inside of a computer
is some memory, like RAM.

72
00:03:33,580 --> 00:03:36,940
And there might be some longer-term
storage, like hard disk space.

73
00:03:36,940 --> 00:03:41,320
At the end of the day, all of those
resources and more are finite.

74
00:03:41,320 --> 00:03:44,290
You can only fit so much
physical hardware in a computer.

75
00:03:44,290 --> 00:03:47,500
Humans have only been able
to pack so many resources

76
00:03:47,500 --> 00:03:49,894
into the physical space of a computer.

77
00:03:49,894 --> 00:03:51,310
And then, of course, there's cost.

78
00:03:51,310 --> 00:03:54,920
You might be able to only afford
so much computing capacity.

79
00:03:54,920 --> 00:03:58,800
So if a computer can only do
some number of things per second,

80
00:03:58,800 --> 00:04:02,154
there is surely an upper bound on
how many people can visit your web

81
00:04:02,154 --> 00:04:05,320
site, how many people can add things
to their shopping cart, how many people

82
00:04:05,320 --> 00:04:07,330
can check out with their credit card.

83
00:04:07,330 --> 00:04:10,940
Because you only have, at the end of
the day, a finite number of resources.

84
00:04:10,940 --> 00:04:13,060
Now, what does that mean in real terms?

85
00:04:13,060 --> 00:04:16,540
Well, maybe your web server can
handle 100 users per minute.

86
00:04:16,540 --> 00:04:18,700
Maybe it can handle
1,000 users per minute.

87
00:04:18,700 --> 00:04:22,240
Maybe it can handle 1,000 users per
second, or even much more than that.

88
00:04:22,240 --> 00:04:26,887
It really depends on the specifications
of your hardware-- how much RAM,

89
00:04:26,887 --> 00:04:29,470
how much CPU and so forth that
you actually have-- and it also

90
00:04:29,470 --> 00:04:33,430
depends, to some extent, on how
well-written your code is and how fast

91
00:04:33,430 --> 00:04:37,280
or how slow your code, your
software actually runs.

92
00:04:37,280 --> 00:04:39,700
So these are knobs that
can ultimately be turned.

93
00:04:39,700 --> 00:04:42,310
And through testing, you can
figure this out in advance

94
00:04:42,310 --> 00:04:46,470
by simulating traffic in order to
estimate exactly how many users you

95
00:04:46,470 --> 00:04:49,000
might be able to handle at a time.

96
00:04:49,000 --> 00:04:53,200
Now, the relevance to today is
that the cloud, so to speak,

97
00:04:53,200 --> 00:04:56,200
allows us to start to solve
some of these problems

98
00:04:56,200 --> 00:04:59,890
and also allows us to start
abstracting away the solutions to some

99
00:04:59,890 --> 00:05:00,640
of these problems.

100
00:05:00,640 --> 00:05:02,360
Well, let's see what
this actually means.

101
00:05:02,360 --> 00:05:04,449
So at some point or other--

102
00:05:04,449 --> 00:05:06,490
especially when it's not
just my laptop, but it's

103
00:05:06,490 --> 00:05:10,160
like 1,000 laptops, or 10,000 laptops
and desktops and phones and more

104
00:05:10,160 --> 00:05:12,860
that are somehow trying
to access my server here--

105
00:05:12,860 --> 00:05:17,000
at some point, we hit that upper
limit whereby no more users can

106
00:05:17,000 --> 00:05:19,200
fit onto my web site per unit of time.

107
00:05:19,200 --> 00:05:22,070
So what is the symptom that my
users experience at that point

108
00:05:22,070 --> 00:05:23,660
if I'm over capacity?

109
00:05:23,660 --> 00:05:26,210
Well, they might see an
error message of some sort.

110
00:05:26,210 --> 00:05:28,550
They might just
experience a spinning icon

111
00:05:28,550 --> 00:05:30,662
because the website is
super slow to respond.

112
00:05:30,662 --> 00:05:33,120
And maybe it does respond, but
maybe it's 10 seconds later.

113
00:05:33,120 --> 00:05:36,890
So at the end of the day, they either
have a bad experience or no experience

114
00:05:36,890 --> 00:05:42,480
whatsoever, because my server can only
handle so many requests at a time.

115
00:05:42,480 --> 00:05:44,570
So what do you do to solve this problem?

116
00:05:44,570 --> 00:05:48,500
If one server is not enough, maybe
the most intuitive solution is, well,

117
00:05:48,500 --> 00:05:51,500
if one server is not
giving me enough headroom,

118
00:05:51,500 --> 00:05:53,490
why don't I just have two servers?

119
00:05:53,490 --> 00:05:54,890
So let's go ahead and do that.

120
00:05:54,890 --> 00:05:58,370
Instead of having just one server,
let's go ahead and have two.

121
00:05:58,370 --> 00:06:01,790
And let me propose that on the second
server, it's the exact same software.

122
00:06:01,790 --> 00:06:05,150
So whatever code I've written, in
whatever language it's written,

123
00:06:05,150 --> 00:06:08,600
I just have copies of my web
site on both the original server

124
00:06:08,600 --> 00:06:10,520
and the second server.

125
00:06:10,520 --> 00:06:14,270
Now I've solved the
problem in the simple sense

126
00:06:14,270 --> 00:06:16,010
that I've doubled my capacity.

127
00:06:16,010 --> 00:06:18,570
If one server can handle
1,000 people per second,

128
00:06:18,570 --> 00:06:21,714
well, then surely two servers can
handle 2,000 people per second,

129
00:06:21,714 --> 00:06:22,880
so I've doubled my capacity.

130
00:06:22,880 --> 00:06:23,870
So that's good.

131
00:06:23,870 --> 00:06:26,000
I've hopefully solved the problem.

132
00:06:26,000 --> 00:06:27,929
But it's not quite as simple as that.

133
00:06:27,929 --> 00:06:30,845
At least pictorially, I'm still
pointing at just one of those servers,

134
00:06:30,845 --> 00:06:33,890
so we're going to have to clean
up this picture alone and somehow

135
00:06:33,890 --> 00:06:36,680
figure out how to get users--

136
00:06:36,680 --> 00:06:38,450
or more generally, traffic--

137
00:06:38,450 --> 00:06:40,820
to both of these servers.

138
00:06:40,820 --> 00:06:43,740
I could just naively
draw an arrow like this.

139
00:06:43,740 --> 00:06:45,380
But what does that actually mean?

140
00:06:45,380 --> 00:06:47,780
We don't want to abstract
away so much of the detail

141
00:06:47,780 --> 00:06:50,460
that we're ignoring this problem.

142
00:06:50,460 --> 00:06:54,860
How do we implement this notion of
choosing between left arrow and right

143
00:06:54,860 --> 00:06:55,580
arrow?

144
00:06:55,580 --> 00:06:59,210
Well, let's consider what
our solutions might be.

145
00:06:59,210 --> 00:07:02,930
If a user, like me on my laptop,
is trying to visit this web site--

146
00:07:02,930 --> 00:07:06,740
and the web site, ideally, is going
to live at something like example.com,

147
00:07:06,740 --> 00:07:10,122
or facebook.com, or
gmail.com, or whatever--

148
00:07:10,122 --> 00:07:12,830
I don't want to have to broadcast
different names for my servers.

149
00:07:12,830 --> 00:07:14,910
And you might actually
notice this on the internet.

150
00:07:14,910 --> 00:07:18,160
You might notice, if you start noticing
the URLs of websites you're visiting--

151
00:07:18,160 --> 00:07:21,290
especially for certain older, stodgier
companies who haven't necessarily

152
00:07:21,290 --> 00:07:23,240
implemented this in
the most modern way--

153
00:07:23,240 --> 00:07:27,239
you might find yourself not
just at www.something.com,

154
00:07:27,239 --> 00:07:29,780
but if you look closely, you
might find yourself occasionally

155
00:07:29,780 --> 00:07:35,310
at www1.something.com,
www2.something.com,

156
00:07:35,310 --> 00:07:38,590
or even www13.something.com.

157
00:07:38,590 --> 00:07:43,670
Which is to say that some companies
appear to solve this problem by just

158
00:07:43,670 --> 00:07:45,140
giving different names--

159
00:07:45,140 --> 00:07:48,920
similar names, but different names--
to their two servers, three servers,

160
00:07:48,920 --> 00:07:51,200
13 servers, or however many they have.

161
00:07:51,200 --> 00:07:54,770
And then they somehow redirect
users from their main domain

162
00:07:54,770 --> 00:08:00,440
name, www.something.com, to any one
of those two or three or 13 servers.
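
As a rough sketch of that numbered-server approach, the Python below builds the numbered names and picks one to redirect to. How any real company implements this is not stated here, so the random choice, the function names, and the `http://` URL shape are all assumptions for illustration.

```python
import random


def numbered_hosts(domain: str, count: int) -> list[str]:
    """Build names like www1.something.com through wwwN.something.com."""
    return [f"www{i}.{domain}" for i in range(1, count + 1)]


def pick_redirect(domain: str, count: int, rng: random.Random) -> str:
    """Choose one numbered host at random (an assumed policy) and return its URL."""
    return "http://" + rng.choice(numbered_hosts(domain, count)) + "/"


hosts = numbered_hosts("something.com", 3)
target = pick_redirect("something.com", 3, random.Random())
```

Notice that the user's browser ends up showing the numbered name, which is exactly the part that is visible and can be bookmarked.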

163
00:08:00,440 --> 00:08:01,919
But this isn't very elegant.

164
00:08:01,919 --> 00:08:03,710
The marketing folks
would surely hate this,

165
00:08:03,710 --> 00:08:06,770
because you're trying to build some
brand recognition around your URL.

166
00:08:06,770 --> 00:08:10,370
Why would you dirty it by just putting
these arbitrary numbers in the URLs?

167
00:08:10,370 --> 00:08:13,940
Plus if you fast forward
a bit in this story,

168
00:08:13,940 --> 00:08:16,610
if, for some reason down
the road, you get fancier,

169
00:08:16,610 --> 00:08:18,382
bigger servers that
can handle more users,

170
00:08:18,382 --> 00:08:20,090
and therefore you
don't need 13 of them--

171
00:08:20,090 --> 00:08:22,220
you can get away with just six of them--

172
00:08:22,220 --> 00:08:24,950
well, what happens if some of
your customers have bookmarked,

173
00:08:24,950 --> 00:08:30,320
very reasonably, one of those older
names, like www13.something.com?

174
00:08:30,320 --> 00:08:33,799
So now when they try to visit that
URL, gosh, they might hit a dead end.

175
00:08:33,799 --> 00:08:35,632
So you could solve
that in some other way.

176
00:08:35,632 --> 00:08:38,090
But the point is it would seem
to create a problem quickly,

177
00:08:38,090 --> 00:08:40,159
and it's just a naming mess.

178
00:08:40,159 --> 00:08:44,270
Why actually bother having
your users see something

179
00:08:44,270 --> 00:08:46,020
as messy as these numbered servers?

180
00:08:46,020 --> 00:08:49,400
It would be nice to do this
a little more transparently.

181
00:08:49,400 --> 00:08:50,930
So how could we do this?

182
00:08:50,930 --> 00:08:54,140
Well, let me propose that we
kind of need some middleman here,

183
00:08:54,140 --> 00:08:57,740
so to speak, whereby traffic comes
from people like me on the internet

184
00:08:57,740 --> 00:09:00,400
and then either goes to the
left or goes to the right,

185
00:09:00,400 --> 00:09:02,780
or no matter how many
servers we have, goes

186
00:09:02,780 --> 00:09:05,750
to one of those actual web servers.

187
00:09:05,750 --> 00:09:09,500
So how does this middleman-- and
to borrow some past terminology,

188
00:09:09,500 --> 00:09:12,320
how does this black
box potentially work?

189
00:09:12,320 --> 00:09:14,030
Well, let's consider
some of the building

190
00:09:14,030 --> 00:09:18,530
blocks, some of the puzzle pieces we
have technologically at our disposal

191
00:09:18,530 --> 00:09:19,400
now.

192
00:09:19,400 --> 00:09:21,770
You may recall that every
server on the internet

193
00:09:21,770 --> 00:09:25,641
has an IP address, an internet protocol
address, a unique address for it.

194
00:09:25,641 --> 00:09:27,890
And that's, again, a bit of
a white lie, because there

195
00:09:27,890 --> 00:09:30,410
are technologies by which
you can have private IP

196
00:09:30,410 --> 00:09:32,690
addresses that the
outside world doesn't see.

197
00:09:32,690 --> 00:09:35,780
But let's stipulate,
for today's purposes,

198
00:09:35,780 --> 00:09:38,810
that every computer on the
internet certainly has an IP

199
00:09:38,810 --> 00:09:41,550
address, whether public or private.

200
00:09:41,550 --> 00:09:46,400
So maybe, just maybe, we could
leverage an existing technology--

201
00:09:46,400 --> 00:09:48,680
DNS, the Domain Name System--

202
00:09:48,680 --> 00:09:52,880
so that rather than only return
one IP address of a server

203
00:09:52,880 --> 00:09:57,470
when you look up www.something.com,
we return the IP address

204
00:09:57,470 --> 00:09:59,610
of the server on the
left some of the time

205
00:09:59,610 --> 00:10:02,990
or the IP address of the server
on the right some of the time,

206
00:10:02,990 --> 00:10:07,160
effectively balancing our load,
our traffic across the two servers.

207
00:10:07,160 --> 00:10:10,060
And in fact, if you
do this 50-50, you can

208
00:10:10,060 --> 00:10:12,640
take, really, what's called
a round robin approach,

209
00:10:12,640 --> 00:10:17,270
and ideally uniformly distribute
your traffic across multiple servers.
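
The round robin idea can be sketched in a few lines of Python. This is a toy model, not a real DNS server: the class name `RoundRobinResolver` is hypothetical, and the IP addresses come from a range reserved for documentation. It only shows the rotation itself, where each lookup of the same name returns the next address in turn.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy resolver that answers each lookup with the next IP address in turn."""

    def __init__(self, records: dict[str, list[str]]):
        # One endlessly repeating iterator per domain name.
        self._cycles = {name: cycle(ips) for name, ips in records.items()}

    def resolve(self, name: str) -> str:
        return next(self._cycles[name])


# Two servers behind one name; addresses are placeholders.
resolver = RoundRobinResolver({"www.something.com": ["192.0.2.10", "192.0.2.20"]})
answers = [resolver.resolve("www.something.com") for _ in range(4)]
```

Four lookups in a row alternate between the two addresses, so over time each server would be handed roughly half of the lookups.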

210
00:10:17,270 --> 00:10:20,140
And what's nice in this model is
that because you're using DNS,

211
00:10:20,140 --> 00:10:22,472
the user doesn't really
notice what's going on.

212
00:10:22,472 --> 00:10:24,430
At the end of the day,
none of us humans really

213
00:10:24,430 --> 00:10:27,130
care what IP address we're
actually going to if we visit

214
00:10:27,130 --> 00:10:29,410
Facebook.com or Gmail.com or the like.

215
00:10:29,410 --> 00:10:33,730
We just care that our computer can find
that server or servers on the internet.

216
00:10:33,730 --> 00:10:38,020
So via DNS, we could, very
cleverly, via this middleman here,

217
00:10:38,020 --> 00:10:42,010
which is really just going to be some
third device, some separate server--

218
00:10:42,010 --> 00:10:45,940
it, as a DNS device, could
just respond to requests

219
00:10:45,940 --> 00:10:49,630
from customers with either this
IP address or this IP address,

220
00:10:49,630 --> 00:10:52,840
or any number of different IP addresses.

221
00:10:52,840 --> 00:10:56,130
So does this solve the problem?

222
00:10:56,130 --> 00:10:58,020
Again, most everything
in computer science

223
00:10:58,020 --> 00:11:01,200
would seem to be a tradeoff
at the end of the day.

224
00:11:01,200 --> 00:11:03,690
And this seems almost too
good to be true, perhaps.

225
00:11:03,690 --> 00:11:04,560
It's so simple.

226
00:11:04,560 --> 00:11:06,270
It leverages an existing technology.

227
00:11:06,270 --> 00:11:07,750
It just works.

228
00:11:07,750 --> 00:11:10,440
So what prices might we pay?

229
00:11:10,440 --> 00:11:14,325
Well, DNS, it turns out,
gets cached quite a bit.

230
00:11:14,325 --> 00:11:15,450
And what does caching mean?

231
00:11:15,450 --> 00:11:18,626
Caching something means
keeping some past answer--

232
00:11:18,626 --> 00:11:20,750
or more generally, piece
of information-- around so

233
00:11:20,750 --> 00:11:26,350
that you can access it more quickly the
second and the third time and beyond.
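
Here is a minimal sketch of that kind of cache in Python, assuming each answer is kept for a fixed number of seconds. The class name `DnsCache`, the `resolve` callback, and the injectable clock are hypothetical names introduced for illustration; real operating systems and browsers have their own rules for how long they remember an answer.

```python
import time


class DnsCache:
    """Remember past answers for ttl_seconds so repeat lookups skip the resolver."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # name -> (ip, expires_at)

    def lookup(self, name: str, resolve) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # cached answer is still fresh
        ip = resolve(name)  # ask the (slow) resolver only on a miss or expiry
        self._entries[name] = (ip, now + self._ttl)
        return ip
```

The second and third lookups within the time limit never call `resolve` at all, which is the speedup. The flip side is that the cached address keeps being used until it expires, even if something has changed in the meantime.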

234
00:11:26,350 --> 00:11:30,360
And so computers today,
Macs and PCs, as well as

235
00:11:30,360 --> 00:11:33,480
servers on the internet, other
DNS servers on the internet,

236
00:11:33,480 --> 00:11:37,170
for performance reasons, will
often remember the responses

237
00:11:37,170 --> 00:11:38,910
that they get from DNS servers.

238
00:11:38,910 --> 00:11:42,690
For instance, if, on my Mac, I
visit Facebook.com, hypothetically

239
00:11:42,690 --> 00:11:47,370
a lot of times during the day, it's kind
of stupid if my laptop, again and again

240
00:11:47,370 --> 00:11:49,620
and again and again,
asks some DNS server

241
00:11:49,620 --> 00:11:52,110
for Facebook.com's IP
address if it already

242
00:11:52,110 --> 00:11:55,020
asked that same question an hour
ago-- or more realistically,

243
00:11:55,020 --> 00:11:57,390
two minutes ago, or something like that.

244
00:11:57,390 --> 00:12:01,590
It would be smarter if my operating
system-- or even my browser, Chrome

245
00:12:01,590 --> 00:12:03,540
or Firefox or whatever
I'm using-- actually

246
00:12:03,540 --> 00:12:08,210
remembers that answer for me so that
my computer can just pull up that web

247
00:12:08,210 --> 00:12:13,740
site faster by skipping a step, by
not wasting time asking a server again

248
00:12:13,740 --> 00:12:15,750
for the IP address of a server.

249
00:12:15,750 --> 00:12:19,860
And after all, IP addresses, it turns
out, generally don't change that often.

250
00:12:19,860 --> 00:12:23,880
It's certainly possible for a company
or a university or even a home user

251
00:12:23,880 --> 00:12:25,840
to change their computer's IP addresses.

252
00:12:25,840 --> 00:12:28,350
But the reality is it doesn't
change all that often.

253
00:12:28,350 --> 00:12:30,490
The common case is to
have the same IP address

254
00:12:30,490 --> 00:12:33,960
now as you might an hour from now,
or even a day or a week or a month

255
00:12:33,960 --> 00:12:34,860
from now.

256
00:12:34,860 --> 00:12:37,530
But the key thing is that it can change.

257
00:12:37,530 --> 00:12:41,600
And especially if you're worried about
customers-- not just some personal web

258
00:12:41,600 --> 00:12:43,260
site, but you might lose business.

259
00:12:43,260 --> 00:12:45,960
You might lose orders if users
can't visit your website.

260
00:12:45,960 --> 00:12:49,470
Anything that puts your
server's uptime, so to speak--

261
00:12:49,470 --> 00:12:51,480
being accessible on
the internet-- at risk

262
00:12:51,480 --> 00:12:53,790
probably is worthy of
some consideration.

263
00:12:53,790 --> 00:12:57,884
So let me propose, then, that just one
of these servers goes offline somehow.

264
00:12:57,884 --> 00:12:58,800
Maybe it's deliberate.

265
00:12:58,800 --> 00:13:00,330
You need to do some service for it.

266
00:13:00,330 --> 00:13:03,270
Or maybe it crashed in some way,
or it got unplugged somehow,

267
00:13:03,270 --> 00:13:07,530
or something went wrong such that
now, one or more of your servers,

268
00:13:07,530 --> 00:13:11,940
across which you've been load balancing,
no longer can talk to the internet.

269
00:13:11,940 --> 00:13:12,960
What might happen?

270
00:13:12,960 --> 00:13:15,510
Well, if some customer's
Mac, like my own,

271
00:13:15,510 --> 00:13:20,310
has remembered or cached that
particular server's IP address,

272
00:13:20,310 --> 00:13:21,870
that is not a good situation.

273
00:13:21,870 --> 00:13:24,150
Because your Mac or PC
or whatever is going

274
00:13:24,150 --> 00:13:27,450
to now try to revisit your
web site again and again

275
00:13:27,450 --> 00:13:33,660
and again at that old cached IP address
that apparently can be a dead end.

276
00:13:33,660 --> 00:13:38,370
And so even though you still have
servers that could potentially

277
00:13:38,370 --> 00:13:41,550
handle that customer's
request, that customer's order,

278
00:13:41,550 --> 00:13:44,520
that customer's desire
to check out, he or she

279
00:13:44,520 --> 00:13:46,650
really is still not
going to be able to visit

280
00:13:46,650 --> 00:13:49,290
the website unless that cache expires.

281
00:13:49,290 --> 00:13:52,230
Maybe they reboot their computer
so that the cache forcibly expires.

282
00:13:52,230 --> 00:13:55,140
Maybe they just wait some amount
of time so that that IP address

283
00:13:55,140 --> 00:13:57,510
is forgotten by the browser
or by the operating system

284
00:13:57,510 --> 00:14:01,950
or by some other DNS server
until the new, available IP

285
00:14:01,950 --> 00:14:03,690
addresses are picked up instead.

286
00:14:03,690 --> 00:14:04,830
But there is that risk.

287
00:14:04,830 --> 00:14:07,470
And I would argue that this
risk is even higher especially

288
00:14:07,470 --> 00:14:11,910
for companies that might be considering
moving their infrastructure from one

289
00:14:11,910 --> 00:14:13,200
service to another.

290
00:14:13,200 --> 00:14:16,890
If you're deliberately going to move
your servers from one IP address

291
00:14:16,890 --> 00:14:20,010
to another, as might happen if you
change cloud providers, so to speak--

292
00:14:20,010 --> 00:14:21,240
more on those in a minute--

293
00:14:21,240 --> 00:14:24,790
really, if you change the companies
that you're using to host your servers,

294
00:14:24,790 --> 00:14:26,460
your IP addresses will change.

295
00:14:26,460 --> 00:14:29,310
And you certainly don't want to
incur a huge amount of downtime

296
00:14:29,310 --> 00:14:30,550
in a situation like that.

297
00:14:30,550 --> 00:14:32,130
So there are these tradeoffs.

298
00:14:32,130 --> 00:14:35,040
Easy solution, technologically
pretty inexpensive to do.

299
00:14:35,040 --> 00:14:37,840
It just works using existing technology.

300
00:14:37,840 --> 00:14:40,960
But you open yourself up to this risk.

301
00:14:40,960 --> 00:14:42,300
So let's address that.

302
00:14:42,300 --> 00:14:44,837
Putting back on the old
proverbial engineering hat,

303
00:14:44,837 --> 00:14:46,170
let's try to solve this problem.

304
00:14:46,170 --> 00:14:48,870
It seems that giving a unique
IP address to this server

305
00:14:48,870 --> 00:14:52,110
and to this server, and any number
of other servers that are back there,

306
00:14:52,110 --> 00:14:56,400
might not be the smartest idea
insofar as those IPs can get cached.

307
00:14:56,400 --> 00:15:01,120
So what if we use DNS as follows?

308
00:15:01,120 --> 00:15:05,220
When my laptop or anyone else's requests
the IP address for www.something.com,

309
00:15:05,220 --> 00:15:08,860
why don't we return the IP
address of this device here--

310
00:15:08,860 --> 00:15:11,680
this load balancer, as
we'll start calling it,

311
00:15:11,680 --> 00:15:15,360
where a load balancer is
usually just a physical device,

312
00:15:15,360 --> 00:15:18,840
or multiple physical devices, whose
purpose in life is to balance load?

313
00:15:18,840 --> 00:15:22,140
Packets come in, and similar
in spirit to a router,

314
00:15:22,140 --> 00:15:25,860
they do route information to the left,
to the right, or some other direction.

315
00:15:25,860 --> 00:15:29,910
But their overarching purpose isn't just
to get data from point A to point B,

316
00:15:29,910 --> 00:15:32,970
but to somehow intelligently
balance that traffic

317
00:15:32,970 --> 00:15:37,860
over multiple possible destinations
for point B, identical servers

318
00:15:37,860 --> 00:15:39,390
in the case of our story here.

319
00:15:39,390 --> 00:15:42,810
So what if, instead, we addressed
this problem of potential downtime

320
00:15:42,810 --> 00:15:46,830
by returning the IP address
of the load balancer,

321
00:15:46,830 --> 00:15:49,560
and then, by nature of
private IP addresses

322
00:15:49,560 --> 00:15:52,110
or some other mechanism
that the end user does not

323
00:15:52,110 --> 00:15:56,920
need to know or care about, this load
balancer somehow routes the traffic

324
00:15:56,920 --> 00:16:00,110
to either the first device
or the second device,

325
00:16:00,110 --> 00:16:03,640
LB here being our load balancer?
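
A minimal sketch of that arrangement in Python follows. The outside world only ever learns one public address, and the load balancer picks a backend for each incoming request. The class name, the round robin policy, and all of the IP addresses are assumptions for illustration; a real load balancer might choose among its servers in other ways.

```python
class LoadBalancer:
    """Toy load balancer: one public IP in front of several private backends."""

    def __init__(self, public_ip: str, backends: list[str]):
        self.public_ip = public_ip  # the only address DNS hands out
        self._backends = list(backends)  # private addresses users never see
        self._next = 0

    def route(self) -> str:
        """Pick the backend for the next request, rotating through them in order."""
        backend = self._backends[self._next % len(self._backends)]
        self._next += 1
        return backend


# Placeholder addresses: one public IP, two private servers.
lb = LoadBalancer("203.0.113.5", ["10.0.0.1", "10.0.0.2"])
routed = [lb.route() for _ in range(3)]
```

Because users only cache the load balancer's address, the servers behind it can come and go without any cached DNS answer becoming a dead end.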

326
00:16:03,640 --> 00:16:05,860
So we seem to have
solved this problem.

327
00:16:05,860 --> 00:16:08,920
Insofar as now we have
configured our DNS servers

328
00:16:08,920 --> 00:16:12,250
to return the IP address
of the load balancer,

329
00:16:12,250 --> 00:16:15,760
there's no problem of downtime
as we described a moment ago.

330
00:16:15,760 --> 00:16:20,890
Because if Server 1 goes offline
for whatever reason, no big deal.

331
00:16:20,890 --> 00:16:25,510
The load balancer should hopefully
just notice that and subsequently start

332
00:16:25,510 --> 00:16:29,710
proactively routing all incoming data
that reaches its IP address to Server 2

333
00:16:29,710 --> 00:16:30,940
and not Server 1.

334
00:16:30,940 --> 00:16:32,819
Now, how does the load balancer know?

335
00:16:32,819 --> 00:16:34,360
Well, either a human could intervene.

336
00:16:34,360 --> 00:16:37,240
Maybe someone gets a late night
call or text or page saying,

337
00:16:37,240 --> 00:16:39,490
uh oh, Server 1 is down,
you better do something.

338
00:16:39,490 --> 00:16:42,070
And then he or she can manually
configure the load balancer

339
00:16:42,070 --> 00:16:44,590
to no longer send any
traffic to Server 1.

340
00:16:44,590 --> 00:16:47,620
That seems kind of stupid in an age
of automation and smart software.

341
00:16:47,620 --> 00:16:48,730
Maybe we can do better.

342
00:16:48,730 --> 00:16:49,870
And indeed, we can.

343
00:16:49,870 --> 00:16:52,630
A technique that's
often used by servers is

344
00:16:52,630 --> 00:16:55,570
something modeled from
the human world to use

345
00:16:55,570 --> 00:16:58,690
what you might describe as
heartbeats to actually configure

346
00:16:58,690 --> 00:17:02,350
the load balancer and Servers
1 and 2 to operate as follows.

347
00:17:02,350 --> 00:17:05,740
Maybe every second, every half a
second, maybe every five seconds

348
00:17:05,740 --> 00:17:10,359
you configure Server 1 and Server 2
to send some kind of heartbeat message

349
00:17:10,359 --> 00:17:11,750
to the load balancer.

350
00:17:11,750 --> 00:17:14,770
This is just a TCP/IP packet,
some kind of network packet

351
00:17:14,770 --> 00:17:17,650
that's the equivalent
of saying I'm alive.

352
00:17:17,650 --> 00:17:18,339
I'm alive.

353
00:17:18,339 --> 00:17:23,770
Or more goofily, like boom, boom, boom,
boom, ergo the heartbeat metaphor.

354
00:17:23,770 --> 00:17:26,680
But the point is that 1 and 2,
and any number of other servers,

355
00:17:26,680 --> 00:17:29,710
should be configured to
just constantly reassure

356
00:17:29,710 --> 00:17:31,800
the load balancer that they are alive.

357
00:17:31,800 --> 00:17:32,770
They are accessible.

358
00:17:32,770 --> 00:17:34,780
They are ready to receive traffic.

359
00:17:34,780 --> 00:17:36,809
And the load balancer, similarly--

360
00:17:36,809 --> 00:17:39,100
and you might see where this
is going-- can very simply

361
00:17:39,100 --> 00:17:42,100
be configured to listen
for that heartbeat.

362
00:17:42,100 --> 00:17:46,060
And if it ever doesn't hear a
heartbeat from Server 1 or Server 2,

363
00:17:46,060 --> 00:17:48,790
it should just assume
that something is wrong.

364
00:17:48,790 --> 00:17:50,030
The server has died.

365
00:17:50,030 --> 00:17:50,830
It's gone offline.

366
00:17:50,830 --> 00:17:52,390
Something bad has happened.

367
00:17:52,390 --> 00:17:55,090
So the load balancer
subsequently should simply not

368
00:17:55,090 --> 00:17:58,030
route any traffic to
that particular server

369
00:17:58,030 --> 00:18:00,490
until some human or
some automated process

370
00:18:00,490 --> 00:18:04,300
brings the server back alive, so to
speak, and the heartbeat resumes.
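
The listening side of that heartbeat scheme can be sketched in Python as follows. The class name `HeartbeatMonitor`, the five-second timeout, and the server names are hypothetical, and the network itself is left out: the sketch assumes something else calls `beat` whenever a heartbeat message arrives.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat from each server and report which are still alive."""

    def __init__(self, servers, timeout_seconds: float, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock
        # Assume every server was just heard from when monitoring starts.
        self._last_seen = {server: clock() for server in servers}

    def beat(self, server: str) -> None:
        """Record that a heartbeat from this server just arrived."""
        self._last_seen[server] = self._clock()

    def healthy(self) -> list[str]:
        """Return the servers heard from within the timeout window."""
        now = self._clock()
        return [s for s, seen in self._last_seen.items() if now - seen <= self._timeout]
```

A load balancer built this way would route only to the servers that `healthy` returns. A server that falls silent drops out of the list on its own, and it rejoins as soon as its heartbeat resumes.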

371
00:18:04,300 --> 00:18:06,880
Now, of course, this problem
doesn't go away permanently.

372
00:18:06,880 --> 00:18:09,630
If Servers 1 and 2 stop
emitting a heartbeat,

373
00:18:09,630 --> 00:18:11,447
we really have no capacity for users.

374
00:18:11,447 --> 00:18:13,030
But that would be an extreme scenario.

375
00:18:13,030 --> 00:18:17,714
Hopefully just one or a few of
our servers go offline in that way.

376
00:18:17,714 --> 00:18:20,380
So we can configure our servers
for these heartbeats, which is--

377
00:18:20,380 --> 00:18:21,190
think about it--

378
00:18:21,190 --> 00:18:25,100
a very simple physiologically-inspired
solution to a problem.

379
00:18:25,100 --> 00:18:27,730
And even if it's not obvious
how you implemented it in code,

380
00:18:27,730 --> 00:18:30,370
it really is just an algorithm,
a simple set of instructions

381
00:18:30,370 --> 00:18:33,700
with which we can solve this problem.

382
00:18:33,700 --> 00:18:36,690
And yet, dammit, we've
introduced a new problem.

383
00:18:36,690 --> 00:18:40,050
And so this really is
the old leaky hose,

384
00:18:40,050 --> 00:18:43,150
where just as we've plugged
one leak or solved one problem,

385
00:18:43,150 --> 00:18:46,420
another one has sprung up
somewhere else along the line.

386
00:18:46,420 --> 00:18:48,850
So what's the problem now?

387
00:18:48,850 --> 00:18:49,870
What's the problem now?

388
00:18:49,870 --> 00:18:54,520
The whole motivation of introducing
Server Number 2, in addition

389
00:18:54,520 --> 00:18:58,930
to Server Number 1, was to make
sure that we have enough capacity,

390
00:18:58,930 --> 00:19:03,070
and better yet, to make sure that if
Server 1 or Server 2 goes offline,

391
00:19:03,070 --> 00:19:05,650
the other one can hopefully
pick up the load unless it's

392
00:19:05,650 --> 00:19:09,320
a super busy time with lots and
lots of users visiting all at once.

393
00:19:09,320 --> 00:19:12,070
So in fact, the general
idea at play here

394
00:19:12,070 --> 00:19:20,770
is high availability, ensuring
that if one server goes down,

395
00:19:20,770 --> 00:19:22,930
you have other servers
that can pick up the load.

396
00:19:22,930 --> 00:19:26,419
Being highly available means you
can be tolerant to issues like that.

397
00:19:26,419 --> 00:19:28,210
And then load balancing,
of course, is just

398
00:19:28,210 --> 00:19:31,390
the mere process of splitting the
load across those two endpoints.

399
00:19:31,390 --> 00:19:34,390
But we have introduced another problem.

400
00:19:34,390 --> 00:19:44,710
This might be abbreviated SPOF, or more
explicitly, Single Point Of Failure.

401
00:19:44,710 --> 00:19:48,040
Just as I've solved one problem
by introducing this load balancer,

402
00:19:48,040 --> 00:19:50,800
so have I introduced a new
problem, which is this.

403
00:19:50,800 --> 00:19:52,870
There is now, as you
might infer from the name

404
00:19:52,870 --> 00:19:54,730
alone, a single point of failure.

405
00:19:54,730 --> 00:19:59,050
It's fine that I can now tolerate
Server 1 or Server 2 going down,

406
00:19:59,050 --> 00:20:01,920
but what can I not tolerate, clearly?

407
00:20:01,920 --> 00:20:04,430
What if the load balancer goes down?

408
00:20:04,430 --> 00:20:06,160
So this is a very real concern.

409
00:20:06,160 --> 00:20:09,071
Maybe the load balancer
itself gets overloaded.

410
00:20:09,071 --> 00:20:11,320
Maybe the load balancer
itself has some kind of issue.

411
00:20:11,320 --> 00:20:13,270
And if the load balancer
goes down, it doesn't

412
00:20:13,270 --> 00:20:16,120
matter how many web
servers I have down here,

413
00:20:16,120 --> 00:20:19,900
or how much money I've spent down
here to ensure my high availability.

414
00:20:19,900 --> 00:20:25,100
My server is offline if this single
point of failure indeed fails.

415
00:20:25,100 --> 00:20:27,640
Now, you'd like to think
that the load balancer--

416
00:20:27,640 --> 00:20:30,070
especially since it only
has one job in life--

417
00:20:30,070 --> 00:20:33,270
can at least handle more traffic
than any individual server.

418
00:20:33,270 --> 00:20:36,490
Indeed, clearly, it must be
the case that the load balancer

419
00:20:36,490 --> 00:20:39,490
is fast enough and capable
enough to handle twice as

420
00:20:39,490 --> 00:20:41,840
much traffic as any individual server.

421
00:20:41,840 --> 00:20:46,999
But that's generally accepted as
feasible insofar as your website,

422
00:20:46,999 --> 00:20:48,790
your real intellectual
property, is probably

423
00:20:48,790 --> 00:20:50,590
doing a lot of work--
talking to a database,

424
00:20:50,590 --> 00:20:53,506
writing out files, downloading things,
or any number of other features

425
00:20:53,506 --> 00:20:56,800
that just take more effort than
just routing data from one server

426
00:20:56,800 --> 00:20:58,990
to another as a load balancer does.

427
00:20:58,990 --> 00:21:00,970
But it doesn't matter
how performant it is.

428
00:21:00,970 --> 00:21:04,240
If the load balancer breaks,
goes offline for some reason,

429
00:21:04,240 --> 00:21:08,120
your entire infrastructure
is inaccessible.

430
00:21:08,120 --> 00:21:09,980
So how do we solve this?

431
00:21:09,980 --> 00:21:13,510
How do we go about and
architect a solution to this?

432
00:21:13,510 --> 00:21:15,910
Well, how did we address
this issue earlier?

433
00:21:15,910 --> 00:21:20,260
We addressed the issue of insufficient
capacity or potential downtime

434
00:21:20,260 --> 00:21:22,960
by just throwing
hardware at the problem.

435
00:21:22,960 --> 00:21:25,940
And so maybe we could
do that same thing here.

436
00:21:25,940 --> 00:21:29,260
Maybe we could just introduce
a second load balancer.

437
00:21:29,260 --> 00:21:31,540
I'll call this LB as well.

438
00:21:31,540 --> 00:21:33,940
And now we somehow have to--

439
00:21:33,940 --> 00:21:39,640
I feel like we're just endlessly going
to be adding more and more rectangles

440
00:21:39,640 --> 00:21:40,540
to the picture.

441
00:21:40,540 --> 00:21:46,480
But somehow, we need to be able to load
balance across now two servers and two

442
00:21:46,480 --> 00:21:47,980
load balancers.

443
00:21:47,980 --> 00:21:48,860
So how do we do this?

444
00:21:48,860 --> 00:21:52,660
Well, let me clean this up so that we
have a bit more room to play with here

445
00:21:52,660 --> 00:21:57,260
and consider how a pair of load
balancers might actually work.

446
00:21:57,260 --> 00:22:01,510
So if my first server is here
and my second server is here,

447
00:22:01,510 --> 00:22:07,720
and I'm proposing now to have two load
balancers-- one here and one here--

448
00:22:07,720 --> 00:22:12,460
surely, both of these have to
be able to talk to both servers.

449
00:22:12,460 --> 00:22:15,100
So we already have this necessity.

450
00:22:15,100 --> 00:22:18,820
And somehow, traffic has
to come from the internet

451
00:22:18,820 --> 00:22:23,497
into this set of load balancers,
but probably only to one,

452
00:22:23,497 --> 00:22:25,330
because we don't want
to solve this with DNS

453
00:22:25,330 --> 00:22:27,370
and just have two IP
addresses out there.

454
00:22:27,370 --> 00:22:30,160
Because if one breaks, we
can recreate the same problem

455
00:22:30,160 --> 00:22:32,090
as before if we're not careful.

456
00:22:32,090 --> 00:22:33,140
So what if we do this?

457
00:22:33,140 --> 00:22:37,390
What if we use this building block
of heartbeats in another way as well?

458
00:22:37,390 --> 00:22:40,600
What if we ensure that
our load balancers--

459
00:22:40,600 --> 00:22:45,740
plural-- have just one IP
address, which a moment ago seemed

460
00:22:45,740 --> 00:22:47,240
to create a single point of failure?

461
00:22:47,240 --> 00:22:48,590
But what if we do this?

462
00:22:48,590 --> 00:22:52,330
What if we also allow the
load balancers to talk to,

463
00:22:52,330 --> 00:22:57,940
to communicate over a network with each
other so that one of the load balancers

464
00:22:57,940 --> 00:23:00,940
is constantly saying to
the other, I'm alive.

465
00:23:00,940 --> 00:23:02,020
I'm alive.

466
00:23:02,020 --> 00:23:03,290
I'm alive.

467
00:23:03,290 --> 00:23:06,310
And so what the load balancers
could be configured to do

468
00:23:06,310 --> 00:23:10,400
is that only one of them operates
at any given point in time.

469
00:23:10,400 --> 00:23:14,830
But if the other server,
the other load balancer,

470
00:23:14,830 --> 00:23:19,330
no longer hears from that primary load
balancer because of the heartbeats

471
00:23:19,330 --> 00:23:21,790
that are ideally both being
emitted in both directions

472
00:23:21,790 --> 00:23:25,150
so that they can both be
assured of the other's uptime--

473
00:23:25,150 --> 00:23:29,110
if the secondary load balancer stops
hearing the primary load balancer,

474
00:23:29,110 --> 00:23:32,560
the secondary load balancer
can just presumptuously

475
00:23:32,560 --> 00:23:37,050
reconfigure itself to take on
that one and only IP address,

476
00:23:37,050 --> 00:23:39,760
effectively assuming that the
first load balancer is not going

477
00:23:39,760 --> 00:23:41,740
to be responding to any traffic anyway.

478
00:23:41,740 --> 00:23:46,120
And the second load balancer can
simply take on the entire load itself.
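
The failover between a pair of load balancers can be sketched the same way. This is a simplified Python model under assumptions: the class, the timeout, and the address (taken from a range reserved for documentation) are hypothetical, and "taking on the IP" is just a flag here rather than a real network reconfiguration. It also ignores the case where both machines are healthy but cannot hear each other, which a real deployment has to handle.

```python
SHARED_IP = "203.0.113.10"  # documentation-range address, purely illustrative


class LoadBalancer:
    """One member of a primary/secondary pair that shares a single IP address."""

    def __init__(self, name, holds_ip, started_at=0.0, timeout=3.0):
        self.name = name
        self.holds_ip = holds_ip          # True while this node answers on SHARED_IP
        self.peer_last_seen = started_at  # time of the peer's last heartbeat
        self.timeout = timeout            # seconds of silence tolerated

    def hear_peer(self, now):
        """Record an "I'm alive" message from the other load balancer."""
        self.peer_last_seen = now

    def check_peer(self, now):
        """Claim the shared IP if the peer has been silent for too long."""
        if not self.holds_ip and now - self.peer_last_seen > self.timeout:
            self.holds_ip = True
        return self.holds_ip
```

As long as the secondary keeps calling `hear_peer()`, `check_peer()` returns False and it stays passive. Once the heartbeats stop for longer than the timeout, it takes over the one address.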

479
00:23:46,120 --> 00:23:49,750
But the key difference now
in this particular solution

480
00:23:49,750 --> 00:23:53,920
is that there's only one IP address that
describes this whole architecture, only

481
00:23:53,920 --> 00:23:56,740
one IP address between
the two load balancers

482
00:23:56,740 --> 00:24:01,210
so we don't risk those potential dead
ends that we had a little bit ago

483
00:24:01,210 --> 00:24:03,710
with our back end servers.

484
00:24:03,710 --> 00:24:08,884
So now it's starting to get more
robust, more highly available.

485
00:24:08,884 --> 00:24:09,800
So that's pretty good.

486
00:24:09,800 --> 00:24:11,800
We've solved most of these problems.

487
00:24:11,800 --> 00:24:17,590
We've generously, though, swept one
problem underneath the rug, whereby

488
00:24:17,590 --> 00:24:20,217
every time I draw another rectangle--

489
00:24:20,217 --> 00:24:22,300
not just the first time,
but now the second time--

490
00:24:22,300 --> 00:24:26,200
and add some interconnectivity,
somehow, among them, someone

491
00:24:26,200 --> 00:24:27,850
somewhere is spending some money.

492
00:24:27,850 --> 00:24:30,340
And indeed, I am solving
these problems thus far

493
00:24:30,340 --> 00:24:33,800
by throwing money at the problem,
and frankly introducing complexity.

494
00:24:33,800 --> 00:24:36,400
Already look at how many
arrows or edges there

495
00:24:36,400 --> 00:24:40,060
are now, which might simply refer
to physical wires, which is fine.

496
00:24:40,060 --> 00:24:43,990
But there's also a logical
configuration that's now necessary.

497
00:24:43,990 --> 00:24:47,530
And God forbid we have a third load
balancer for extra high availability

498
00:24:47,530 --> 00:24:49,420
or any number of servers here--

499
00:24:49,420 --> 00:24:52,510
13 or 20 or 100 or 1,000 servers.

500
00:24:52,510 --> 00:24:54,910
It's a lot of cross-connections--
not just physically,

501
00:24:54,910 --> 00:24:58,120
but logically in terms of
the requisite configuration.
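
As a rough way to count those cross-connections, the sketch below assumes every load balancer connects to every server and every pair of load balancers exchanges heartbeats. The function name and that wiring assumption are illustrative, not taken from the lecture.

```python
from math import comb


def count_links(load_balancers, servers):
    """Logical connections: each LB to each server, plus each pair of LBs."""
    return load_balancers * servers + comb(load_balancers, 2)
```

With two load balancers and two servers, that is five connections. With three load balancers and 1,000 servers, it is 3,003.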

502
00:24:58,120 --> 00:25:01,540
So this complexity does add up.

503
00:25:01,540 --> 00:25:04,690
And the cost certainly adds up.

504
00:25:04,690 --> 00:25:07,240
And now, once upon a
time-- and not all that

505
00:25:07,240 --> 00:25:11,750
long ago-- if a company wanted to
architect this kind of solution,

506
00:25:11,750 --> 00:25:14,590
you would literally
buy two load balancers,

507
00:25:14,590 --> 00:25:17,290
and you would buy two
or more web servers,

508
00:25:17,290 --> 00:25:19,690
and you would buy the
requisite physical ethernet

509
00:25:19,690 --> 00:25:21,070
cables to interconnect the two.

510
00:25:21,070 --> 00:25:23,320
And you'd probably buy a
whole bunch of other hardware

511
00:25:23,320 --> 00:25:26,279
that we've not even talked about,
like firewalls and switches and more.

512
00:25:26,279 --> 00:25:28,361
But you would physically
buy all of this hardware.

513
00:25:28,361 --> 00:25:30,520
You would physically
connect all of this hardware

514
00:25:30,520 --> 00:25:35,710
and configure it to implement
these several kinds of features.

515
00:25:35,710 --> 00:25:38,410
But the catch is that the
more and more hardware

516
00:25:38,410 --> 00:25:42,310
you buy, just probabilistically,
the more and more you

517
00:25:42,310 --> 00:25:44,320
invite some kind of failure.

518
00:25:44,320 --> 00:25:46,000
Maybe it's some stupid human error.

519
00:25:46,000 --> 00:25:49,310
But more realistically, one of
your hard drives is going to fail.

520
00:25:49,310 --> 00:25:52,960
And hard drives are typically rated for
the enterprise in terms of Mean Time

521
00:25:52,960 --> 00:25:57,230
Between Failures, MTBF,
which generally means

522
00:25:57,230 --> 00:26:01,100
how long should you expect a hard drive
to work on average before it fails.

523
00:26:01,100 --> 00:26:01,600
It breaks.

524
00:26:01,600 --> 00:26:02,900
It just stops working.

525
00:26:02,900 --> 00:26:05,500
So if you have a whole bunch
of servers, each of which

526
00:26:05,500 --> 00:26:08,390
has a whole bunch of hard
drives, at some point,

527
00:26:08,390 --> 00:26:11,862
combinatorially, one or more of
those drives is just going to fail,

528
00:26:11,862 --> 00:26:13,820
which is to say you're
going to have a problem,

529
00:26:13,820 --> 00:26:15,850
and you're going to
have to fix it yourself.
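
To see why more drives means more failures, here is a small Python sketch. It assumes a simplified model in which drives fail independently at a constant rate of 1/MTBF. Real drives do not behave this neatly, and the figures used in the note below are hypothetical.

```python
import math


def p_any_failure(drives, hours, mtbf_hours):
    """Probability that at least one of `drives` fails within `hours`,
    assuming independent failures at a constant rate of 1 / mtbf_hours."""
    return 1.0 - math.exp(-drives * hours / mtbf_hours)
```

With a hypothetical MTBF of 1,000,000 hours, one drive running for a year (8,760 hours) has under a 1 percent chance of failing in this model. With 1,000 such drives, the chance that at least one fails in that year is above 99 percent.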

530
00:26:15,850 --> 00:26:19,940
At some point, too, you're going
to run out of physical space.

531
00:26:19,940 --> 00:26:22,850
In fact, perhaps one of the
most constraining resources,

532
00:26:22,850 --> 00:26:25,530
especially for startups, is
the physical space itself.

533
00:26:25,530 --> 00:26:28,780
You probably don't want to start housing
your servers in your physical office,

534
00:26:28,780 --> 00:26:32,530
because you need a special room for
it, typically, with enough cooling,

535
00:26:32,530 --> 00:26:36,750
with enough access, with enough
electricity, and enough humans

536
00:26:36,750 --> 00:26:37,750
to actually maintain it.

537
00:26:37,750 --> 00:26:41,345
Or you graduate from your own office
space and go to a data center,

538
00:26:41,345 --> 00:26:44,230
a co-location facility,
whereby you maybe

539
00:26:44,230 --> 00:26:47,500
rent space in a physical
cage with a locking door,

540
00:26:47,500 --> 00:26:49,690
inside of which you
put racks of servers,

541
00:26:49,690 --> 00:26:54,100
just racked up on big metal poles,
and you pack as many servers in there

542
00:26:54,100 --> 00:26:54,970
as you can.

543
00:26:54,970 --> 00:26:57,610
But at some point, you're
going to be bumping up

544
00:26:57,610 --> 00:27:03,340
against other constrained resources--
physical space, actual power capacity,

545
00:27:03,340 --> 00:27:07,220
cooling, as well as the
humans to actually run this.

546
00:27:07,220 --> 00:27:10,720
And so very quickly
does operations, ops,

547
00:27:10,720 --> 00:27:14,890
so to speak, become an increasing
cost and an increasing challenge.

548
00:27:14,890 --> 00:27:19,130
And one of the most alluring
features of the cloud, so to speak,

549
00:27:19,130 --> 00:27:23,350
is that you can move all
of these details off-site.

550
00:27:23,350 --> 00:27:28,710
And you can abstract many of these,
let's say, implementation details

551
00:27:28,710 --> 00:27:32,770
away whereby you yourself don't have
to worry about the physical wires.

552
00:27:32,770 --> 00:27:35,260
You don't have to worry about
the make and model of servers

553
00:27:35,260 --> 00:27:36,051
that you're buying.

554
00:27:36,051 --> 00:27:39,700
You don't have to worry about
things actually breaking,

555
00:27:39,700 --> 00:27:42,790
because someone else will
deal with that for you.

556
00:27:42,790 --> 00:27:46,000
But you have to still understand
the topology and the architecture

557
00:27:46,000 --> 00:27:51,080
and the features that you want to
implement so that you can actually

558
00:27:51,080 --> 00:27:53,130
configure them in the cloud.

559
00:27:53,130 --> 00:27:55,830
So what do you actually
get from cloud providers?

560
00:27:55,830 --> 00:27:57,830
There's any number of
them out there these days.

561
00:27:57,830 --> 00:28:01,760
But perhaps three of the biggest
are Amazon, Google, and Microsoft,

562
00:28:01,760 --> 00:28:05,087
all of whom offer, these days,
very similar palettes of options.

563
00:28:05,087 --> 00:28:06,920
And it's outright
overwhelming, if you visit

564
00:28:06,920 --> 00:28:10,290
each of their web sites, just how
many cloud products they offer.

565
00:28:10,290 --> 00:28:13,610
But they would generally offer
a number of standard products

566
00:28:13,610 --> 00:28:16,740
in the cloud-- for instance,
a virtualized server.

567
00:28:16,740 --> 00:28:19,430
So you don't have to physically
buy a server these days

568
00:28:19,430 --> 00:28:22,970
and plug it into your own ethernet
connection, your own internet

569
00:28:22,970 --> 00:28:24,470
connection in your own office.

570
00:28:24,470 --> 00:28:27,320
You can instead
essentially rent a server

571
00:28:27,320 --> 00:28:29,550
in the cloud, which is to
say that Amazon, Google,

572
00:28:29,550 --> 00:28:31,520
Microsoft, or any number
of other companies

573
00:28:31,520 --> 00:28:34,850
will host that server
physically for you,

574
00:28:34,850 --> 00:28:37,520
and they will take care of the
issues of power and cooling.

575
00:28:37,520 --> 00:28:40,061
And if a hard drive fails,
they will go remove the old one

576
00:28:40,061 --> 00:28:41,060
and plug in the new one.

577
00:28:41,060 --> 00:28:43,610
And ideally, they will provide
you with backup services.

578
00:28:43,610 --> 00:28:46,970
But more sophisticated
than that, they can also

579
00:28:46,970 --> 00:28:52,280
help us recreate, in software,
this kind of topology.

580
00:28:52,280 --> 00:28:56,750
In other words, even without having
a human physically wire together

581
00:28:56,750 --> 00:28:59,390
this kind of graph, so to speak,
that we've been building up

582
00:28:59,390 --> 00:29:02,600
here logically, thanks
to software these days,

583
00:29:02,600 --> 00:29:06,260
you can implement this whole paradigm--

584
00:29:06,260 --> 00:29:08,720
not with physical cables,
not with physical devices,

585
00:29:08,720 --> 00:29:10,610
but with software virtually.

586
00:29:10,610 --> 00:29:11,460
What does that mean?

587
00:29:11,460 --> 00:29:13,640
It means that humans, over
the past several years,

588
00:29:13,640 --> 00:29:17,810
have been writing software that mimics
the behavior of physical servers.

589
00:29:17,810 --> 00:29:21,290
Humans have been writing software
that mimics the behavior of a router.

590
00:29:21,290 --> 00:29:26,060
Humans have been writing software that
mimics the behavior of a load balancer.

591
00:29:26,060 --> 00:29:30,470
And really, we're just building,

592
00:29:30,470 --> 00:29:34,940
in software, what historically
might have been implemented entirely

593
00:29:34,940 --> 00:29:35,682
in hardware.

594
00:29:35,682 --> 00:29:37,640
And even that's a bit of
an oversimplification.

595
00:29:37,640 --> 00:29:40,077
Because even when something
is bought as hardware,

596
00:29:40,077 --> 00:29:43,160
there is, of course, software running
on that hardware that actually makes

597
00:29:43,160 --> 00:29:44,360
it do something.

598
00:29:44,360 --> 00:29:46,730
But they're no longer dedicated devices.

599
00:29:46,730 --> 00:29:50,750
You can use generic commodity
PC server hardware, really,

600
00:29:50,750 --> 00:29:54,740
and transform that hardware into
a certain role, a back end web

601
00:29:54,740 --> 00:29:58,550
server, a back end database, a
load balancer, a router, a switch,

602
00:29:58,550 --> 00:30:00,180
any number of other things.

603
00:30:00,180 --> 00:30:02,930
And so what you're getting from
companies like Amazon and Google

604
00:30:02,930 --> 00:30:06,230
and Microsoft and more is
the ability to build up

605
00:30:06,230 --> 00:30:09,050
your infrastructure in software.

606
00:30:09,050 --> 00:30:16,190
In fact, the buzzword here, the acronym,
is IaaS, Infrastructure as a Service.

607
00:30:16,190 --> 00:30:19,910
So you sign up for an account on any
of those companies' cloud services web

608
00:30:19,910 --> 00:30:23,030
sites, and you put in your credit
card information or your invoicing

609
00:30:23,030 --> 00:30:26,780
information, and you literally, via
a command line tool-- so a keyboard,

610
00:30:26,780 --> 00:30:30,380
or via a nice, web-based
graphical user interface, GUI--

611
00:30:30,380 --> 00:30:34,907
you point and click and say, give
me two servers and one load balancer.

612
00:30:34,907 --> 00:30:36,740
Or if you have enough
money in the bank, you

613
00:30:36,740 --> 00:30:40,040
say give me two servers
and two load balancers

614
00:30:40,040 --> 00:30:41,990
configured for high availability.

615
00:30:41,990 --> 00:30:44,360
Or better yet, you
don't say any of that.

616
00:30:44,360 --> 00:30:48,410
You just tell the provider, give me a
web server and give me a load balancer,

617
00:30:48,410 --> 00:30:52,710
and you deal with the process of
scaling those things as needed.

618
00:30:52,710 --> 00:30:56,360
In fact, a buzzword du jour is auto
scaling, which refers to a feature,

619
00:30:56,360 --> 00:30:59,720
implemented in software,
whereby if a cloud

620
00:30:59,720 --> 00:31:03,740
provider notices that your servers
are getting a lot of traffic--

621
00:31:03,740 --> 00:31:05,880
business is good, or
it's the holiday season,

622
00:31:05,880 --> 00:31:10,220
and you are bumping up against
just how many users your one or two

623
00:31:10,220 --> 00:31:12,230
or three or more servers can handle--

624
00:31:12,230 --> 00:31:17,030
auto-scaling is a feature that will
enable the cloud provider to just turn

625
00:31:17,030 --> 00:31:21,320
on, virtually, more servers for you
so that you go from two to three

626
00:31:21,320 --> 00:31:21,980
automatically.

627
00:31:21,980 --> 00:31:25,770
You can be happily asleep
in the middle of the night,

628
00:31:25,770 --> 00:31:28,780
and even though your traffic
is peaking, it doesn't matter.

629
00:31:28,780 --> 00:31:30,830
Your architecture is
going to auto scale.

630
00:31:30,830 --> 00:31:33,320
And better yet--
especially financially--

631
00:31:33,320 --> 00:31:37,820
if the cloud provider notices, maybe 12
hours later-- oh, all of your customers

632
00:31:37,820 --> 00:31:41,270
have gone to sleep, we don't really
need all of this excess capacity.

633
00:31:41,270 --> 00:31:43,400
Or maybe the holidays
are now in the past.

634
00:31:43,400 --> 00:31:45,200
You really don't need
this excess capacity.

635
00:31:45,200 --> 00:31:50,090
Auto scaling also dictates that those
servers can be virtually turned off.

636
00:31:50,090 --> 00:31:51,435
So you're no longer using them.

637
00:31:51,435 --> 00:31:53,060
You're no longer load balancing to them.

638
00:31:53,060 --> 00:31:56,310
And most importantly, you're
no longer paying for them.
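
The scale-up and scale-down decision can be sketched as a single rule in Python. This is an assumption-laden illustration, not any provider's actual policy: the function name, the idea of a fixed number of users per server, and the minimum and maximum limits are all hypothetical.

```python
import math


def desired_servers(current_users, users_per_server, min_servers=1, max_servers=10):
    """How many servers to run for the current load, within fixed limits."""
    needed = math.ceil(current_users / users_per_server)
    return max(min_servers, min(max_servers, needed))
```

If each server handles 1,000 users and 2,500 show up, the rule asks for three servers. When traffic falls back to 300 users, it drops to the minimum, so the extra servers can be turned off and no longer billed.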

639
00:31:56,310 --> 00:32:00,010
So this is a really, really
nice value add at this point.

640
00:32:00,010 --> 00:32:02,780
There's no human crawling around
on the floor rewiring things

641
00:32:02,780 --> 00:32:04,130
and plugging in new servers.

642
00:32:04,130 --> 00:32:07,460
There's no finance person having to
approve the PO to actually order more

643
00:32:07,460 --> 00:32:09,220
servers just to increase your capacity.

644
00:32:09,220 --> 00:32:13,310
And most importantly, there is
no latency between the time when

645
00:32:13,310 --> 00:32:16,580
you notice, oh, my god, we're
getting really successful

646
00:32:16,580 --> 00:32:18,527
and can't handle our load-- uh oh.

647
00:32:18,527 --> 00:32:20,360
It's going to be a two,
three-week lead time

648
00:32:20,360 --> 00:32:22,460
before we can even get
in more servers.

649
00:32:22,460 --> 00:32:25,910
Thanks to cloud computing, you can
literally log in to Amazon's, Google's,

650
00:32:25,910 --> 00:32:28,210
Microsoft's web site
and, click, click, click,

651
00:32:28,210 --> 00:32:32,360
have more server capacity
within seconds, within minutes,

652
00:32:32,360 --> 00:32:37,790
far faster than the physical
world traditionally allowed.

653
00:32:37,790 --> 00:32:40,460
So those are just some
of the features now

654
00:32:40,460 --> 00:32:45,320
that we gain from outsourcing
to the so-called cloud.

655
00:32:45,320 --> 00:32:48,510
So where does some of
this capability come from?

656
00:32:48,510 --> 00:32:51,960
Well, it turns out that
over the past many years,

657
00:32:51,960 --> 00:32:54,230
humans have been getting
better and better and better

658
00:32:54,230 --> 00:32:58,460
at packing more physical hardware
into the same form factor,

659
00:32:58,460 --> 00:32:59,720
into the same physical space.

660
00:32:59,720 --> 00:33:02,420
So at the level of CPUs,
the brains of a computer,

661
00:33:02,420 --> 00:33:05,810
we humans have gotten much better at
packing more and more transistors,

662
00:33:05,810 --> 00:33:07,520
for instance, onto a CPU.

663
00:33:07,520 --> 00:33:11,070
And transistors are the little switches
that can turn things on and off--

664
00:33:11,070 --> 00:33:12,670
0 and 1, 1 and 0.

665
00:33:12,670 --> 00:33:14,570
So you can store more
information and you

666
00:33:14,570 --> 00:33:17,260
can do more with that
information more quickly.

667
00:33:17,260 --> 00:33:19,820
CPUs today also have
more cores, which you

668
00:33:19,820 --> 00:33:23,180
can think of as mini CPUs
inside of the main CPU,

669
00:33:23,180 --> 00:33:25,550
so that a computer with
multiple cores can literally

670
00:33:25,550 --> 00:33:28,280
do multiple things at a time.

671
00:33:28,280 --> 00:33:32,060
But the funny thing is that we
humans, over the past decade or two,

672
00:33:32,060 --> 00:33:35,150
really haven't been getting
fundamentally faster at life.

673
00:33:35,150 --> 00:33:38,090
At the end of the day, I can
only check my email so quickly.

674
00:33:38,090 --> 00:33:39,990
I can only post on Facebook so quickly.

675
00:33:39,990 --> 00:33:43,670
I can only check out
from Amazon so quickly.

676
00:33:43,670 --> 00:33:47,784
Because we humans have, of course,
a finite speed to ourselves.

677
00:33:47,784 --> 00:33:49,950
We're not just getting--
we're not doubling in speed

678
00:33:49,950 --> 00:33:52,790
a la Moore's law every year or two.

679
00:33:52,790 --> 00:33:57,140
So we have, it would seem, a lot of
excess computing capacity these days.

680
00:33:57,140 --> 00:34:00,080
Computers are getting so darn
fast, we don't necessarily

681
00:34:00,080 --> 00:34:03,830
know what to do with all of these
CPU cycles and with all of the RAM

682
00:34:03,830 --> 00:34:06,860
that we can fit into the same
physical box at half the price

683
00:34:06,860 --> 00:34:08,989
that it cost us last year.

684
00:34:08,989 --> 00:34:12,469
And so manufacturers
and companies realized

685
00:34:12,469 --> 00:34:17,179
that we could actually build a
business on this increased capacity.

686
00:34:17,179 --> 00:34:23,277
We can implement the computer
equivalent of timesharing, so to speak,

687
00:34:23,277 --> 00:34:25,610
which has long been with us
in the history of computing.

688
00:34:25,610 --> 00:34:27,620
But we can do this on a
much more massive scale

689
00:34:27,620 --> 00:34:33,679
now by taking one physical server
that has maybe two CPUs, or 16 CPUs,

690
00:34:33,679 --> 00:34:38,570
or 64 CPUs, and maybe gigabytes--

691
00:34:38,570 --> 00:34:41,090
tens of gigabytes or hundreds
of gigabytes of RAM--

692
00:34:41,090 --> 00:34:45,110
all inside of the same physical device,
plug it in to an internet connection,

693
00:34:45,110 --> 00:34:50,810
and then run special software on that
one server that creates the illusion

694
00:34:50,810 --> 00:34:54,920
that there's multiple servers
living inside of that box.

695
00:34:54,920 --> 00:34:58,850
And this virtualization
software is implemented

696
00:34:58,850 --> 00:35:02,480
by way of software called a virtual
machine, or virtual machine monitor,

697
00:35:02,480 --> 00:35:04,430
or another word might be hypervisor.

698
00:35:04,430 --> 00:35:07,190
There's different ways to describe
essentially the same thing.

699
00:35:07,190 --> 00:35:11,390
But a virtual machine
is a piece of software

700
00:35:11,390 --> 00:35:15,590
running on a computer inside of which
is running some other operating system,

701
00:35:15,590 --> 00:35:16,370
typically.

702
00:35:16,370 --> 00:35:19,070
So you might have one
server running Windows.

703
00:35:19,070 --> 00:35:24,620
But inside of that server are multiple
virtual machines, each of which

704
00:35:24,620 --> 00:35:25,880
itself is running Windows.

705
00:35:25,880 --> 00:35:29,509
So you might be able to chop up one
computer into 10, or even into 100.

706
00:35:29,509 --> 00:35:31,550
Or perhaps more commonly,
you might have a server

707
00:35:31,550 --> 00:35:34,340
running Linux or some
Unix-based operating system,

708
00:35:34,340 --> 00:35:35,930
also with virtual machines on it.

709
00:35:35,930 --> 00:35:37,721
But those virtual
machines might be running

710
00:35:37,721 --> 00:35:42,797
Linux themselves, or Unix, or Windows,
or any number of versions of Windows.

711
00:35:42,797 --> 00:35:43,880
And so this is the beauty.

712
00:35:43,880 --> 00:35:48,080
When you have so much
excess capacity and so many

713
00:35:48,080 --> 00:35:50,150
available CPU cycles
and so much RAM, you

714
00:35:50,150 --> 00:35:56,490
can slice that up and then sell portions
of the server's capacity to customers.

715
00:35:56,490 --> 00:36:01,310
And if you're really clever, you might
look at your customers' usage patterns

716
00:36:01,310 --> 00:36:05,810
and realize that, you know
what, it's not necessarily

717
00:36:05,810 --> 00:36:11,360
as simple as just taking my server and
dividing it up into n different slices,

718
00:36:11,360 --> 00:36:13,700
where n is a generic
variable for number,

719
00:36:13,700 --> 00:36:17,824
and then selling it or renting that
space, really, to n customers.

720
00:36:17,824 --> 00:36:18,740
Because you know what?

721
00:36:18,740 --> 00:36:21,800
Some of those customers might have some
booming businesses, which is great.

722
00:36:21,800 --> 00:36:24,110
But some of those customers
might not have many users.

723
00:36:24,110 --> 00:36:26,129
Maybe it's a few dozen.

724
00:36:26,129 --> 00:36:27,170
Maybe it's a few hundred.

725
00:36:27,170 --> 00:36:29,280
But it's really a drop in the bucket.

726
00:36:29,280 --> 00:36:34,580
So instead of selling my computing
resources to just n customers,

727
00:36:34,580 --> 00:36:37,730
maybe I'll sell it to twice as
many customers or three times

728
00:36:37,730 --> 00:36:41,690
as many customers, and essentially
over-sell my server's capacity,

729
00:36:41,690 --> 00:36:44,480
but expect that on
average, this is just going

730
00:36:44,480 --> 00:36:47,570
to work out because some customers
will be using a lot of those cycles

731
00:36:47,570 --> 00:36:49,460
because business is
good, and some won't be,

732
00:36:49,460 --> 00:36:51,501
because it's just they
don't have many customers,

733
00:36:51,501 --> 00:36:55,580
or really, it's a personal website
that doesn't get much usage anyway.
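
The bet behind overselling can be put into numbers with a short Python sketch. It assumes, for illustration only, that each customer is independently "busy" with some fixed probability and needs one slice of the server when busy. The function name and that model are hypothetical.

```python
from math import comb


def p_overloaded(customers, capacity_slices, p_busy):
    """Probability that more than `capacity_slices` customers are busy at once,
    assuming each is busy independently with probability `p_busy`."""
    return sum(
        comb(customers, k) * p_busy**k * (1 - p_busy) ** (customers - k)
        for k in range(capacity_slices + 1, customers + 1)
    )
```

Selling 10 slices to 10 customers can never overload the server. Selling those same 10 slices to 20 customers works out fine if each is busy only 20 percent of the time, but goes badly if most of them turn out to be booming businesses.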

734
00:36:55,580 --> 00:36:58,490
And so for some time,
there has, of course,

735
00:36:58,490 --> 00:37:01,700
been this risk, when you sign up
for a web hosting company or a cloud

736
00:37:01,700 --> 00:37:05,450
provider, that your web site actually
might get really slow for reasons

737
00:37:05,450 --> 00:37:07,020
outside of your control.

738
00:37:07,020 --> 00:37:11,780
If you are co-located on a server that
some other booming business is on,

739
00:37:11,780 --> 00:37:17,120
your users might actually suffer if
your web host has oversold itself.

740
00:37:17,120 --> 00:37:19,249
And so in fact, this is
one of those situations

741
00:37:19,249 --> 00:37:20,540
where you get what you pay for.

742
00:37:20,540 --> 00:37:23,990
If you're googling around and
finding various cloud providers,

743
00:37:23,990 --> 00:37:26,360
or web hosting companies
more specifically,

744
00:37:26,360 --> 00:37:30,050
you might be able to find a deal,
like $10 per month or $50 per month,

745
00:37:30,050 --> 00:37:33,590
as opposed to $100 or
$200 or more per month.

746
00:37:33,590 --> 00:37:37,070
And you do get what you pay for, because
those fly-by-night operations that

747
00:37:37,070 --> 00:37:41,960
are selling you space and
capacity super cheaply probably

748
00:37:41,960 --> 00:37:44,000
are overselling and over-committing.

749
00:37:44,000 --> 00:37:45,740
So these are the trade-offs, too--

750
00:37:45,740 --> 00:37:48,590
how much money do you want
to save versus how much risk

751
00:37:48,590 --> 00:37:50,300
do you actually want to take on?

752
00:37:50,300 --> 00:37:53,549
Generally, it's safer to go with some
of the bigger fish these days, certainly

753
00:37:53,549 --> 00:37:57,500
when building a business, as you might
on a company like Amazon or Google

754
00:37:57,500 --> 00:38:00,300
or Microsoft or derivatives thereof.

755
00:38:00,300 --> 00:38:02,540
So just to paint a more
concrete technical picture

756
00:38:02,540 --> 00:38:06,534
of what virtualization is, here's a
picture, as you might think of it.

757
00:38:06,534 --> 00:38:08,450
So you have your physical
infrastructure here.

758
00:38:08,450 --> 00:38:12,080
So that's the actual server
from Dell or IBM or whoever.

759
00:38:12,080 --> 00:38:14,870
Then you have the host operating
system, which might be Windows,

760
00:38:14,870 --> 00:38:18,896
but is often Linux or some
variant of Unix instead.

761
00:38:18,896 --> 00:38:20,270
And then you have the hypervisor.

762
00:38:20,270 --> 00:38:22,940
This is the piece of
software that you install

763
00:38:22,940 --> 00:38:27,800
on your server that allows you to run
multiple virtual machines on top of it.

764
00:38:27,800 --> 00:38:29,995
And those virtual machines
can each run any number

765
00:38:29,995 --> 00:38:32,870
of different operating systems
themselves, or even different versions

766
00:38:32,870 --> 00:38:34,130
of operating systems.

767
00:38:34,130 --> 00:38:38,359
And so depicted here up top are
the disparate guest OS operating

768
00:38:38,359 --> 00:38:39,650
systems that might be on there.

769
00:38:39,650 --> 00:38:42,670
Maybe this is Linux and Solaris,
and this is Windows itself,

770
00:38:42,670 --> 00:38:44,170
or any number of other combinations.

771
00:38:44,170 --> 00:38:47,870
Whatever your customers want or whatever
you want to provide or essentially

772
00:38:47,870 --> 00:38:50,810
rent to customers, you can install.

773
00:38:50,810 --> 00:38:52,470
But you do pay a price.

774
00:38:52,470 --> 00:38:54,920
So as beautiful as this
situation is, and as clever

775
00:38:54,920 --> 00:38:59,180
as it is that we're leveraging
these excess resources by slicing up

776
00:38:59,180 --> 00:39:04,190
one server into the illusion of, in this
case, three, or more generally more,

777
00:39:04,190 --> 00:39:05,780
there is some overhead.

778
00:39:05,780 --> 00:39:10,670
Because this hypervisor has to be a
middleman between your guest operating

779
00:39:10,670 --> 00:39:13,562
systems and your host operating
system, the one actually

780
00:39:13,562 --> 00:39:15,020
physically installed on the server.

781
00:39:15,020 --> 00:39:17,600
And any layers of indirection
like this, so to speak,

782
00:39:17,600 --> 00:39:19,380
have got to cost you
some amount of time.

783
00:39:19,380 --> 00:39:21,500
If there's some work being
done here and you only

784
00:39:21,500 --> 00:39:23,780
have a finite number of
resources, the hypervisor

785
00:39:23,780 --> 00:39:27,050
itself is surely consuming
some of your resources.

786
00:39:27,050 --> 00:39:28,970
And gosh, this just
seems really inefficient,

787
00:39:28,970 --> 00:39:32,900
especially if all of your customers
are using the same operating system.

788
00:39:32,900 --> 00:39:38,300
My god, why do you have to have copies
of the same OS multiply installed?

789
00:39:38,300 --> 00:39:42,920
This just doesn't feel like it's
leveraging much economy of scale.

790
00:39:42,920 --> 00:39:46,940
And so it turns out there's a newer
technology that's gaining steam,

791
00:39:46,940 --> 00:39:51,260
and this is known not as virtualization,
per se, but containerization,

792
00:39:51,260 --> 00:39:54,980
the most popular instance of which
is perhaps a company called Docker.

793
00:39:54,980 --> 00:39:57,950
And the world of Docker
is a little shorter.

794
00:39:57,950 --> 00:40:00,585
It's a little smarter about
how resources are shared.

795
00:40:00,585 --> 00:40:02,960
You still have your infrastructure,
your physical server,

796
00:40:02,960 --> 00:40:04,876
and you still have your
host operating system,

797
00:40:04,876 --> 00:40:08,100
whether it's Linux or Unix
or something like that.

798
00:40:08,100 --> 00:40:11,810
But then instead of a hypervisor,
you have the Docker engine,

799
00:40:11,810 --> 00:40:15,680
which is really just an equivalent
of that base layer of software.

800
00:40:15,680 --> 00:40:17,930
But notice what's different.

801
00:40:17,930 --> 00:40:22,387
In this case here, we've
collapsed the previous picture.

802
00:40:22,387 --> 00:40:25,220
In fact, thanks to our friends at
Docker who put this together here,

803
00:40:25,220 --> 00:40:27,800
the guest OS has disappeared.

804
00:40:27,800 --> 00:40:29,940
And you instead have your
different applications

805
00:40:29,940 --> 00:40:31,730
and your different
binaries and libraries,

806
00:40:31,730 --> 00:40:34,820
as this abbreviation means, all
running on the Docker engine.

807
00:40:34,820 --> 00:40:36,180
Now, what does this mean?

808
00:40:36,180 --> 00:40:38,660
This means when running
Docker, you typically

809
00:40:38,660 --> 00:40:40,370
choose your operating system--

810
00:40:40,370 --> 00:40:44,920
for instance, Ubuntu Linux or Debian
Linux or something else altogether--

811
00:40:44,920 --> 00:40:49,400
and then you essentially share
that one operating system

812
00:40:49,400 --> 00:40:52,610
across multiple containers.

813
00:40:52,610 --> 00:40:55,100
Instead of virtual machines,
we now have containers.

814
00:40:55,100 --> 00:40:58,460
So in other words, you ensure
that your different slices all

815
00:40:58,460 --> 00:41:01,850
share some common software--
the kernel, so to speak,

816
00:41:01,850 --> 00:41:05,060
the base core of the operating system.

817
00:41:05,060 --> 00:41:09,080
But then you uniquely layer
on top of that base system,

818
00:41:09,080 --> 00:41:13,520
that base set of default files, whatever
customizations your customers or you

819
00:41:13,520 --> 00:41:16,880
yourself want, but you
share some of the resources.

820
00:41:16,880 --> 00:41:19,700
And long story short, what
this means is that containers

821
00:41:19,700 --> 00:41:21,740
tend to be a little lighter weight.

822
00:41:21,740 --> 00:41:25,490
There's less waste of resources because
there's less overhead of running them,

823
00:41:25,490 --> 00:41:29,720
which is to say that you can generally
start them even more quickly.

824
00:41:29,720 --> 00:41:34,070
And better yet, you can still
isolate your different products

825
00:41:34,070 --> 00:41:37,190
and your different services--
database and web server and email

826
00:41:37,190 --> 00:41:38,900
server and any number
of other features--

827
00:41:38,900 --> 00:41:43,340
all within the illusion of their own
installation, their own operating

828
00:41:43,340 --> 00:41:47,030
system, even though there are
some shared resources here.

829
00:41:47,030 --> 00:41:51,950
So this, too, has been made possible
by the capabilities of modern hardware

830
00:41:51,950 --> 00:41:54,590
and the cleverness, frankly,
of humans in actually

831
00:41:54,590 --> 00:42:01,470
finding solutions or creative uses
for those available resources.

832
00:42:01,470 --> 00:42:05,190
But what other features
or topics come into play

833
00:42:05,190 --> 00:42:07,400
in this world of cloud computing?

834
00:42:07,400 --> 00:42:11,051
We've talked about availability
and caching and costing, really

835
00:42:11,051 --> 00:42:13,050
figuring out where we're
going to actually spend

836
00:42:13,050 --> 00:42:17,430
our money by throwing hardware at
problems and scaling more generally.

837
00:42:17,430 --> 00:42:19,830
But there's also issues
of replication, which

838
00:42:19,830 --> 00:42:22,890
actually do relate to high
availability, so to speak.

839
00:42:22,890 --> 00:42:24,900
But replication refers
to duplication of data,

840
00:42:24,900 --> 00:42:27,360
and really backups more
generally as a topic.

841
00:42:27,360 --> 00:42:30,570
And then there's also some other funky
acronyms that are very much in vogue

842
00:42:30,570 --> 00:42:31,230
these days.

843
00:42:31,230 --> 00:42:33,390
Besides Infrastructure
as a Service, there's

844
00:42:33,390 --> 00:42:40,280
also Platform as a Service, PaaS,
or Software as a Service, SaaS.

845
00:42:40,280 --> 00:42:43,920
Now, SaaS, even if you've
not used it under this name,

846
00:42:43,920 --> 00:42:45,360
odds are you have been using it.

847
00:42:45,360 --> 00:42:50,370
If you do use Gmail or Outlook.com
or any web-based email service,

848
00:42:50,370 --> 00:42:52,740
you are using software as a service.

849
00:42:52,740 --> 00:42:55,572
You don't really know, or need
to care, where in the world

850
00:42:55,572 --> 00:42:58,530
your emails physically live, or how
many servers they're spread across,

851
00:42:58,530 --> 00:43:00,821
or how your data is backed
up, or for that matter, when

852
00:43:00,821 --> 00:43:03,960
you click Send, how the email
even gets from point A to point B.

853
00:43:03,960 --> 00:43:07,710
You are treating Gmail
and Outlook as software

854
00:43:07,710 --> 00:43:12,690
as a service with all of the underlying
implementation details abstracted away.

855
00:43:12,690 --> 00:43:15,870
You just don't know or care
how it's implemented-- well,

856
00:43:15,870 --> 00:43:17,760
at least if everything is working.

857
00:43:17,760 --> 00:43:21,090
You probably do care
if something goes down.

858
00:43:21,090 --> 00:43:24,870
But there's this intermediate
step between this extreme form

859
00:43:24,870 --> 00:43:29,010
of abstraction where all you see
is just the top-level service.

860
00:43:29,010 --> 00:43:32,100
And the lowest level
implementation that we've

861
00:43:32,100 --> 00:43:34,110
discussed, which is
infrastructure as a service,

862
00:43:34,110 --> 00:43:36,690
whereby when using
something like Amazon,

863
00:43:36,690 --> 00:43:39,720
you literally click the button
that says give me a load balancer.

864
00:43:39,720 --> 00:43:42,214
You literally click a button
that says give me two servers.

865
00:43:42,214 --> 00:43:44,130
You literally click a
button that says give me

866
00:43:44,130 --> 00:43:46,560
a firewall or any number
of other features.

867
00:43:46,560 --> 00:43:49,560
So Amazon and Microsoft
and Google, to some extent,

868
00:43:49,560 --> 00:43:52,590
have all implemented
these low-level services

869
00:43:52,590 --> 00:43:55,050
that still require that you
understand the technology,

870
00:43:55,050 --> 00:43:59,250
and you understand networking, and you
understand scaling and availability.

871
00:43:59,250 --> 00:44:03,930
But you can so much more easily and
inexpensively and efficiently--

872
00:44:03,930 --> 00:44:08,960
literally with just a laptop or desktop,
without any data center of your own--

873
00:44:08,960 --> 00:44:12,450
stitch together the topology or
the architecture that you actually

874
00:44:12,450 --> 00:44:14,640
want, albeit in the cloud.

875
00:44:14,640 --> 00:44:17,820
Platform as a service, though,
has arisen as a middle ground

876
00:44:17,820 --> 00:44:20,557
here, whereby you might
have services like Heroku,

877
00:44:20,557 --> 00:44:22,890
which you might have heard
of, which themselves actually

878
00:44:22,890 --> 00:44:28,950
run on infrastructures like Amazon
or Google or Microsoft or the like.

879
00:44:28,950 --> 00:44:32,100
But they provide themselves
a layer of abstraction

880
00:44:32,100 --> 00:44:34,650
that isn't quite as high
level, so to speak, as what

881
00:44:34,650 --> 00:44:36,510
you get from software as a service.

882
00:44:36,510 --> 00:44:40,500
In fact, these platforms as a service
don't provide you with applications.

883
00:44:40,500 --> 00:44:44,350
They just make it easier for you to
run your applications in the cloud.

884
00:44:44,350 --> 00:44:45,550
Now, what does that mean?

885
00:44:45,550 --> 00:44:49,740
Well, it's all fun and exciting
to understand load balancing

886
00:44:49,740 --> 00:44:52,020
and understand networking
and understand the need

887
00:44:52,020 --> 00:44:56,580
for multiple servers and the entire
conversation that we've had thus far.

888
00:44:56,580 --> 00:44:59,390
But at the end of the day,
if I'm a software developer

889
00:44:59,390 --> 00:45:01,380
or I'm trying to build
a business, all I care

890
00:45:01,380 --> 00:45:05,940
about is making my internet
application available to real users.

891
00:45:05,940 --> 00:45:09,390
I really don't care about
how many servers I have,

892
00:45:09,390 --> 00:45:12,802
how many databases I have, how the
load balancers talk to one another.

893
00:45:12,802 --> 00:45:14,760
That's all fine and
intellectually interesting.

894
00:45:14,760 --> 00:45:17,230
But I just want to get real work done.

895
00:45:17,230 --> 00:45:19,140
So I'm willing to pay
a bit more for this.

896
00:45:19,140 --> 00:45:22,590
I'm willing to pay some middleman,
like a Heroku, or any number

897
00:45:22,590 --> 00:45:24,870
of other services, a
platform as a service,

898
00:45:24,870 --> 00:45:27,450
to abstract away those kinds of details.

899
00:45:27,450 --> 00:45:30,180
So I have the wherewithal,
and I have the willingness

900
00:45:30,180 --> 00:45:33,740
to actually say host
this as a web server.

901
00:45:33,740 --> 00:45:34,890
So give me a web server.

902
00:45:34,890 --> 00:45:37,920
I will pay you some number of dollars
per month to give me a web server.

903
00:45:37,920 --> 00:45:41,310
But I want you, Heroku, to deal
with the auto scaling of it.

904
00:45:41,310 --> 00:45:43,330
I don't care how many servers it is.

905
00:45:43,330 --> 00:45:44,910
I don't care how they are connected.

906
00:45:44,910 --> 00:45:46,784
I don't care anything
about these heartbeats.

907
00:45:46,784 --> 00:45:49,880
I just want to have the
illusion, for my own sake,

908
00:45:49,880 --> 00:45:53,880
of just one server that
somehow grows or shrinks

909
00:45:53,880 --> 00:45:56,400
dynamically to handle my customer base.
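
What "grows or shrinks dynamically" might mean can be sketched as a simple rule. This is a rough illustration, not how any particular platform as a service decides; the function name and the figure of 500 requests per second per server are assumptions.

```python
import math


def servers_needed(requests_per_second, per_server_capacity, minimum=1):
    """Return how many servers cover the load, never fewer than `minimum`."""
    needed = math.ceil(requests_per_second / per_server_capacity)
    return max(minimum, needed)


# Hypothetical traffic over a day, assuming each server handles 500 req/s.
for load in (50, 900, 2500, 120):
    count = servers_needed(load, per_server_capacity=500)
    print(load, "req/s ->", count, "server(s)")
```

The customer only ever sees one web server; the provider reruns a rule like this as traffic changes and adds or removes machines behind the scenes.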

910
00:45:56,400 --> 00:45:59,381
Meanwhile, things like load
balancing, I just want my customers

911
00:45:59,381 --> 00:46:00,630
to be able to reach my server.

912
00:46:00,630 --> 00:46:02,046
I don't care how it's implemented.

913
00:46:02,046 --> 00:46:04,470
I don't care how it's made
to be highly available.

914
00:46:04,470 --> 00:46:06,120
I just want that to work.

915
00:46:06,120 --> 00:46:10,050
And so companies like Heroku
provide these platforms

916
00:46:10,050 --> 00:46:12,924
as a service that just make
your life a little bit easier.

917
00:46:12,924 --> 00:46:15,840
And you don't have to think about
or know about or worry about as many

918
00:46:15,840 --> 00:46:16,560
of these details.

919
00:46:16,560 --> 00:46:18,140
Now, to be fair, if
something breaks, you

920
00:46:18,140 --> 00:46:20,140
might not understand
exactly what's going wrong,

921
00:46:20,140 --> 00:46:22,200
and you yourself might
not be able to solve it.

922
00:46:22,200 --> 00:46:27,810
Indeed, you might be entirely at the
mercy of the cloud provider, or the PaaS

923
00:46:27,810 --> 00:46:30,150
provider, to solve the problem for you.

924
00:46:30,150 --> 00:46:32,280
But you're saving time.

925
00:46:32,280 --> 00:46:34,132
You're saving energy
elsewhere by not having

926
00:46:34,132 --> 00:46:36,840
to worry about those lower-level
implementation details, at least

927
00:46:36,840 --> 00:46:37,631
in the common case.

928
00:46:37,631 --> 00:46:42,030
But odds are you're paying a little more
to Heroku than you would to an Amazon

929
00:46:42,030 --> 00:46:46,480
directly because they're providing
you with this value-added service.

930
00:46:46,480 --> 00:46:48,900
So as cryptic as these
acronyms might seem,

931
00:46:48,900 --> 00:46:51,210
they're really just
referring to disparate levels

932
00:46:51,210 --> 00:46:54,300
of abstraction, all of which
somehow relate to the cloud.

933
00:46:54,300 --> 00:46:56,790
But infrastructure as a
service is a virtualization

934
00:46:56,790 --> 00:46:59,880
of these hardware ideas,
the physical cabling

935
00:46:59,880 --> 00:47:01,770
that we drew here on the screen.

936
00:47:01,770 --> 00:47:04,080
Software as a service really
is just that application

937
00:47:04,080 --> 00:47:05,288
that the user interacts with.

938
00:47:05,288 --> 00:47:09,090
And platform as a service
is an intermediate step,

939
00:47:09,090 --> 00:47:12,180
whereby you, in building
your software in the cloud,

940
00:47:12,180 --> 00:47:16,710
can worry a little less about how to
actually make it available to users.

941
00:47:16,710 --> 00:47:19,320
But let's consider one
other challenge now--

942
00:47:19,320 --> 00:47:23,310
that of database replication
since, of course, thus far,

943
00:47:23,310 --> 00:47:26,340
we've been talking about a web server
as though it's the entire picture.

944
00:47:26,340 --> 00:47:28,560
But the reality is
most any business that

945
00:47:28,560 --> 00:47:31,002
has a web-based presence
or a mobile presence

946
00:47:31,002 --> 00:47:32,460
is going to be storing information.

947
00:47:32,460 --> 00:47:35,249
When users register, when
users check something out,

948
00:47:35,249 --> 00:47:38,040
add something to their shopping
cart, so to speak, all of that data

949
00:47:38,040 --> 00:47:40,560
needs to somehow be stored.

950
00:47:40,560 --> 00:47:44,830
So let's consider now what the
world really likely looks like.

951
00:47:44,830 --> 00:47:47,070
So here is my laptop again.

952
00:47:47,070 --> 00:47:51,480
And here is the cloud that's
between me and some service

953
00:47:51,480 --> 00:47:52,860
that I'm interested in.

954
00:47:52,860 --> 00:47:55,695
We'll assume for now that there
is some kind of load balancing.

955
00:47:55,695 --> 00:47:57,570
And I'm just going to
draw it a little bigger

956
00:47:57,570 --> 00:48:00,957
this time to suggest that-- let's
just think of it now as a black box.

957
00:48:00,957 --> 00:48:02,040
And maybe it's one server.

958
00:48:02,040 --> 00:48:02,970
Maybe it's two.

959
00:48:02,970 --> 00:48:03,930
Maybe it's more.

960
00:48:03,930 --> 00:48:07,200
But somehow or other, load
balancing is implemented.

961
00:48:07,200 --> 00:48:09,990
Then I'm going to have
all of my servers here,

962
00:48:09,990 --> 00:48:14,790
which we'll abstract away as maybe
three or more at this point-- one, two,

963
00:48:14,790 --> 00:48:16,770
and then we'll call this n.

964
00:48:16,770 --> 00:48:20,200
But a web server typically does
not do everything these days.

965
00:48:20,200 --> 00:48:22,530
In fact, it's been
trending for some time

966
00:48:22,530 --> 00:48:25,440
to actually have different servers
or different virtual machines,

967
00:48:25,440 --> 00:48:27,960
or even more recently,
different containers

968
00:48:27,960 --> 00:48:30,180
each provide individual services.

969
00:48:30,180 --> 00:48:32,440
Sometimes people call
these micro services

970
00:48:32,440 --> 00:48:36,030
if a container only does one, and
one very narrowly defined thing,

971
00:48:36,030 --> 00:48:39,540
like send emails, or save
information to a database,

972
00:48:39,540 --> 00:48:42,610
or respond to HTTP requests.

973
00:48:42,610 --> 00:48:46,830
So these back end web servers are not
the only types of servers we have.

974
00:48:46,830 --> 00:48:49,780
Odds are we at least have one database.

975
00:48:49,780 --> 00:48:53,160
So let's consider now
the implication of all

976
00:48:53,160 --> 00:48:56,400
of these architectural
decisions we've made thus far

977
00:48:56,400 --> 00:48:59,400
on how we actually store our data.

978
00:48:59,400 --> 00:49:03,780
So in simplest form, our
database might look like this.

979
00:49:03,780 --> 00:49:06,510
And for historical reasons, it's
generally drawn as a cylinder.

980
00:49:06,510 --> 00:49:08,580
And this is our database.

981
00:49:08,580 --> 00:49:12,210
Now, it's immediately
obvious that if all servers--

982
00:49:12,210 --> 00:49:16,620
1, 2, dot, dot, dot, n-- need to
save information or read information

983
00:49:16,620 --> 00:49:20,800
from a database, they've all got to
somehow communicate with that database

984
00:49:20,800 --> 00:49:25,300
so they all have some kind of
connectivity, physically or otherwise.

985
00:49:25,300 --> 00:49:29,850
So this seems fine so long as the
software that's running on servers 1,

986
00:49:29,850 --> 00:49:32,680
2, dot, dot, dot, n-- no matter
what language we're using,

987
00:49:32,680 --> 00:49:36,330
whether it's Java or Python or
PHP or C# or something else--

988
00:49:36,330 --> 00:49:40,740
so long as those servers can talk
to, via the network, this database,

989
00:49:40,740 --> 00:49:41,880
that's great.

990
00:49:41,880 --> 00:49:44,062
They can all save their
data to the same place,

991
00:49:44,062 --> 00:49:46,270
and they can all read their
data from the same place.

992
00:49:46,270 --> 00:49:48,360
So everything stays nicely in sync.

993
00:49:48,360 --> 00:49:51,030
But what's the first problem
that motivated the entirety

994
00:49:51,030 --> 00:49:54,680
of this discussion from the outset?

995
00:49:54,680 --> 00:49:59,440
Well, what if one database
isn't really enough?

996
00:49:59,440 --> 00:50:03,310
Well, we could take the
approach of vertically scaling

997
00:50:03,310 --> 00:50:07,210
our architecture, which is another
piece of jargon in this space.

998
00:50:07,210 --> 00:50:14,530
So vertical scaling means if your
one database isn't quite up to snuff,

999
00:50:14,530 --> 00:50:18,670
and you're running low on disk
space or capacity because numbers

1000
00:50:18,670 --> 00:50:22,360
of requests per second are, of course,
limited, you know what you can do?

1001
00:50:22,360 --> 00:50:30,310
You can go ahead and disconnect this one
and go ahead and put in a bigger one,

1002
00:50:30,310 --> 00:50:33,130
and therefore increase your capacity.

1003
00:50:33,130 --> 00:50:36,640
And vertical scaling means
to really pay more money

1004
00:50:36,640 --> 00:50:39,640
or get something higher
end, a higher, more premium

1005
00:50:39,640 --> 00:50:42,730
model, a more expensive model that's
got more disk space and more RAM

1006
00:50:42,730 --> 00:50:44,750
and a faster CPU or more CPUs.

1007
00:50:44,750 --> 00:50:46,720
So you just throw
hardware at the problem--

1008
00:50:46,720 --> 00:50:50,680
not in the sense of multiple servers,
but just one bigger and better server.

1009
00:50:50,680 --> 00:50:52,117
But what are the challenges here?

1010
00:50:52,117 --> 00:50:53,950
Well, if you've ever
bought a home computer,

1011
00:50:53,950 --> 00:50:57,220
odds are whether it's been on Dell's
site or Microsoft's or Apple's

1012
00:50:57,220 --> 00:51:00,310
or the like, you often have
this good, better, best thing

1013
00:51:00,310 --> 00:51:04,090
where, for the top of the
line laptop or desktop,

1014
00:51:04,090 --> 00:51:06,640
you're going to be
paying through the roof--

1015
00:51:06,640 --> 00:51:08,620
through the nose, so to speak.

1016
00:51:08,620 --> 00:51:11,470
You're going to be paying a premium
for that top of the line model.

1017
00:51:11,470 --> 00:51:14,178
But you might actually be able to
save a decent number of dollars

1018
00:51:14,178 --> 00:51:17,470
by going for the second
best or the third best,

1019
00:51:17,470 --> 00:51:21,340
because the marginal gains
of each additional dollar

1020
00:51:21,340 --> 00:51:22,695
really aren't all that much.

1021
00:51:22,695 --> 00:51:24,820
Because for marketing
reasons, they know that there

1022
00:51:24,820 --> 00:51:26,778
might be some people out
there that will always

1023
00:51:26,778 --> 00:51:28,330
pay top dollar for the fastest one.

1024
00:51:28,330 --> 00:51:30,163
But just because you're
paying twice as much

1025
00:51:30,163 --> 00:51:33,650
doesn't mean the laptop is going
to be twice as good, for instance.

1026
00:51:33,650 --> 00:51:37,120
So this is to say to vertically
scale your database, you might end up

1027
00:51:37,120 --> 00:51:40,810
paying, through the nose, for some
very expensive hardware just

1028
00:51:40,810 --> 00:51:43,820
to eke out some more performance.

1029
00:51:43,820 --> 00:51:45,610
But that's not even the biggest problem.

1030
00:51:45,610 --> 00:51:48,350
The most fundamental problem
is at the end of the day,

1031
00:51:48,350 --> 00:51:53,050
there is a top-of-the-line server for
your database that only can support

1032
00:51:53,050 --> 00:51:56,020
a finite number of database
connections at a time,

1033
00:51:56,020 --> 00:51:58,480
or a finite number of reads
or writes, so to speak,

1034
00:51:58,480 --> 00:52:00,357
saving and reading from the database.

1035
00:52:00,357 --> 00:52:03,190
So at some point or other, it doesn't
matter how much money you have

1036
00:52:03,190 --> 00:52:05,530
or how willing you are to
throw hardware at the problem.

1037
00:52:05,530 --> 00:52:10,070
There exists no server that can handle
more users than you currently have.

1038
00:52:10,070 --> 00:52:14,320
So at some point, you actually
have to put away your wallet

1039
00:52:14,320 --> 00:52:17,680
and put back on the engineering
hat alone and figure out

1040
00:52:17,680 --> 00:52:24,220
how to not vertically scale, but
horizontally scale your architecture.

1041
00:52:24,220 --> 00:52:29,860
And by this, I mean actually introducing
not just one big, fancy server,

1042
00:52:29,860 --> 00:52:33,112
but two or more maybe
smaller, cheaper servers.

1043
00:52:33,112 --> 00:52:35,320
In fact, one of the things
that companies like Google

1044
00:52:35,320 --> 00:52:37,810
were especially good
at early on was using

1045
00:52:37,810 --> 00:52:42,910
off-the-shelf, inexpensive hardware and
building supercomputers out of them,

1046
00:52:42,910 --> 00:52:44,770
but much more economically
than they might

1047
00:52:44,770 --> 00:52:46,450
have had they gone top
of the line everywhere,

1048
00:52:46,450 --> 00:52:48,200
even though that would
mean fewer servers.

1049
00:52:48,200 --> 00:52:50,696
Better to get more cheaper
servers and somehow

1050
00:52:50,696 --> 00:52:53,320
figure out how to interconnect
them and write the software that

1051
00:52:53,320 --> 00:52:56,410
lets them all be useful
simultaneously so that we can instead

1052
00:52:56,410 --> 00:52:59,620
have a picture that looks a
bit more like this, with maybe

1053
00:52:59,620 --> 00:53:03,310
a pair of databases in the picture now.

1054
00:53:03,310 --> 00:53:05,650
Of course, we've now
created that same problem

1055
00:53:05,650 --> 00:53:09,130
that we had earlier about
where does the data go.

1056
00:53:09,130 --> 00:53:10,990
Where does the traffic
or the users flow,

1057
00:53:10,990 --> 00:53:14,960
especially now where we have one
on the left and one on the right?

1058
00:53:14,960 --> 00:53:18,460
So there's a couple of solutions here,
but there are some different problems

1059
00:53:18,460 --> 00:53:19,900
that arise with databases.

1060
00:53:19,900 --> 00:53:27,600
If we very simply put a load balancer in
here, LB, and route traffic uniformly--

1061
00:53:27,600 --> 00:53:30,010
say, to the left or to the right--

1062
00:53:30,010 --> 00:53:32,180
that's probably not the best thing.

1063
00:53:32,180 --> 00:53:34,960
Because then you're going to
end up with a world where you're

1064
00:53:34,960 --> 00:53:39,460
saving some data for a user
here and some data for a user

1065
00:53:39,460 --> 00:53:42,757
here just by chance, because you're
using round robin, so to speak,

1066
00:53:42,757 --> 00:53:45,340
or just some probabilistic
heuristic where some of the traffic

1067
00:53:45,340 --> 00:53:47,140
goes this way, some of
the traffic goes that way.

1068
00:53:47,140 --> 00:53:48,160
And that's not so good.
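
To see why that's not so good, here is a small sketch of round robin applied naively to writes. The two in-memory lists stand in for the left and right databases, and the `save` helper is hypothetical.

```python
from itertools import cycle

# Two in-memory lists stand in for the two databases in the picture.
databases = {"left": [], "right": []}
next_db = cycle(databases)  # alternates "left", "right", "left", ...


def save(user, item):
    """Write to whichever database is next in line, ignoring who the user is."""
    name = next(next_db)
    databases[name].append((user, item))
    return name


for item in ("book", "lamp", "pen"):
    save("User A", item)

print(databases["left"])   # User A's book and pen
print(databases["right"])  # User A's lamp
```

User A's data is now split across two machines, and neither database alone has the whole shopping cart.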

1069
00:53:48,160 --> 00:53:48,670
OK.

1070
00:53:48,670 --> 00:53:54,820
But we could solve that by somehow
making sure that if this user, User A,

1071
00:53:54,820 --> 00:54:00,400
visits my web site, I should always
send him or her to the same database.

1072
00:54:00,400 --> 00:54:02,440
And you can do this in a couple of ways.

1073
00:54:02,440 --> 00:54:04,510
You can enforce some
notion of stickiness,

1074
00:54:04,510 --> 00:54:07,180
so to speak, whereby you
somehow notice that, oh, this is

1075
00:54:07,180 --> 00:54:09,010
User A. We've seen him or her before.

1076
00:54:09,010 --> 00:54:12,130
Let's make sure we send him
to this database on the left

1077
00:54:12,130 --> 00:54:14,020
and not the one on the right.
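
One way to picture stickiness is a lookup table that remembers where each user was first sent. This is a minimal sketch under that assumption; the names are hypothetical, and a real load balancer would more likely rely on a cookie or similar mechanism than on an in-memory dictionary.

```python
from itertools import cycle

DATABASES = ["left", "right"]
_next_db = cycle(DATABASES)
_assigned = {}  # remembers which database each user was first sent to


def database_for(user):
    """Return the user's remembered database, assigning one on first sight."""
    if user not in _assigned:
        _assigned[user] = next(_next_db)
    return _assigned[user]


print(database_for("User A"))  # left
print(database_for("User B"))  # right
print(database_for("User A"))  # left again: we've seen this user before
```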

1078
00:54:14,020 --> 00:54:18,070
Or you can more formally use
a process known as sharding.

1079
00:54:18,070 --> 00:54:20,380
In fact, this is very common
early on in databases,

1080
00:54:20,380 --> 00:54:24,010
and even in websites like Facebook,
where you have so many users

1081
00:54:24,010 --> 00:54:26,830
that you need to start splitting
them across multiple databases.

1082
00:54:26,830 --> 00:54:28,360
But gosh, how to do that?

1083
00:54:28,360 --> 00:54:31,300
Back in the earliest days of
Facebook, what they might have done

1084
00:54:31,300 --> 00:54:35,275
was put all Harvard users on one
database, all MIT users on another,

1085
00:54:35,275 --> 00:54:37,600
all BU users on another, and so forth.

1086
00:54:37,600 --> 00:54:40,210
Because Facebook, as you may
recall, started scaling out

1087
00:54:40,210 --> 00:54:41,620
initially to disparate schools.

1088
00:54:41,620 --> 00:54:44,410
That was a wonderful
opportunity to shard

1089
00:54:44,410 --> 00:54:49,930
their data by putting similar users
in their respective databases.

1090
00:54:49,930 --> 00:54:51,700
And at the time, I
think you couldn't even

1091
00:54:51,700 --> 00:54:54,240
be friends with people in other
schools, at least very early

1092
00:54:54,240 --> 00:54:58,050
on, because those databases,
presumably, were independent,

1093
00:54:58,050 --> 00:55:01,440
or certainly could
have been topologically.

1094
00:55:01,440 --> 00:55:04,530
Or you might do something more
simple that doesn't create

1095
00:55:04,530 --> 00:55:06,420
some problems like isolation there.

1096
00:55:06,420 --> 00:55:10,320
Maybe all of your users whose last
names start with A go on one server,

1097
00:55:10,320 --> 00:55:12,400
and all of your users
whose names start with B

1098
00:55:12,400 --> 00:55:14,170
go on another server, and so forth.

1099
00:55:14,170 --> 00:55:18,330
So you can almost hash your users, to
borrow terminology from hash tables,

1100
00:55:18,330 --> 00:55:20,970
and decide where to put that data.
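
Both ideas, sharding by first letter and hashing users, can be sketched in a few lines. The shard names and both functions are made up for illustration, and three shards is an arbitrary choice. A stable hash from `hashlib` is used because Python's built-in `hash()` for strings can differ from one run to the next.

```python
import hashlib

SHARDS = ["db0", "db1", "db2"]  # hypothetical database servers


def shard_by_letter(last_name):
    """Pick a shard from the first letter: A -> db0, B -> db1, C -> db2, D -> db0, ..."""
    first = last_name.strip().upper()[0]
    return SHARDS[(ord(first) - ord("A")) % len(SHARDS)]


def shard_by_hash(user_id):
    """Pick a shard from a stable hash of the user's ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


print(shard_by_letter("Adams"))  # db0
print(shard_by_letter("Baker"))  # db1
print(shard_by_hash("user-42") == shard_by_hash("user-42"))  # True: same user, same shard
```

Letters are easy to reason about but uneven, since far more last names start with S than with X. Hashing tends to spread users more evenly.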

1101
00:55:20,970 --> 00:55:24,690
Of course, that does not help
with backups or redundancy.

1102
00:55:24,690 --> 00:55:28,470
Because if you're putting all of your
A names here and all of your B names

1103
00:55:28,470 --> 00:55:31,230
here, what happens, god forbid,
if one of the servers goes down?

1104
00:55:31,230 --> 00:55:33,600
You've lost half of your customers.

1105
00:55:33,600 --> 00:55:36,390
So it would seem that no matter
how you balance the load,

1106
00:55:36,390 --> 00:55:39,850
you really want to maintain
duplicates of data.

1107
00:55:39,850 --> 00:55:42,870
And so there are a few different
ways people solve this.

1108
00:55:42,870 --> 00:55:45,510
In fact, let me go
ahead and temporarily go

1109
00:55:45,510 --> 00:55:50,940
back to that first model, where we
had a really fancy, bigger database

1110
00:55:50,940 --> 00:55:53,670
that I'll deliberately
draw as pretty big.

1111
00:55:53,670 --> 00:55:57,450
And this is big in the sense that
it can respond to requests quickly

1112
00:55:57,450 --> 00:55:59,280
and it can store a lot of data.

1113
00:55:59,280 --> 00:56:03,630
This might be generally called our
primary or our master database.

1114
00:56:03,630 --> 00:56:06,420
And it's where our data
goes to live long term.

1115
00:56:06,420 --> 00:56:09,900
It's where data is written to, so to
speak, and could also be read from.

1116
00:56:09,900 --> 00:56:13,380
But if we're going to bump up
against some limit of how much work

1117
00:56:13,380 --> 00:56:15,450
this database can do
at once, it would be

1118
00:56:15,450 --> 00:56:18,960
nice to have some secondary
servers or tertiary servers.

1119
00:56:18,960 --> 00:56:24,240
So a very common paradigm would be to
use this primary database for writes--

1120
00:56:24,240 --> 00:56:25,860
we'll abbreviate it w--

1121
00:56:25,860 --> 00:56:29,790
and then also have maybe a couple
of smaller databases, or even

1122
00:56:29,790 --> 00:56:35,610
the same size databases, that are
meant for reads, abbreviated R.

1123
00:56:35,610 --> 00:56:39,600
And so long as these databases are
somehow talking to one another,

1124
00:56:39,600 --> 00:56:41,400
this topology will just work.

1125
00:56:41,400 --> 00:56:43,410
This is a feature known as replication.

1126
00:56:43,410 --> 00:56:46,650
So long as the databases
are configured in such a way

1127
00:56:46,650 --> 00:56:50,310
that any time data is written to
the primary database or the master

1128
00:56:50,310 --> 00:56:55,440
database, that data gets replicated
to any replicas, as they're called.

1129
00:56:55,440 --> 00:57:02,539
Meanwhile, servers 1, 2, and n should
also be able to talk to these replicas.

1130
00:57:02,539 --> 00:57:05,580
And if your code is smart enough--
and you would have to think about this

1131
00:57:05,580 --> 00:57:10,260
and design this into your codebase--
you could ensure that any time you

1132
00:57:10,260 --> 00:57:15,270
read data from a database, it comes from
one, or really any, of your replicas,

1133
00:57:15,270 --> 00:57:18,240
replicas in the sense that they
are meant to have duplicate data.

1134
00:57:18,240 --> 00:57:22,170
But anytime you write data-- a
SQL INSERT or UPDATE or DELETE,

1135
00:57:22,170 --> 00:57:24,150
as opposed to a SQL SELECT--

1136
00:57:24,150 --> 00:57:28,710
you only send your write operations
to the primary or master database

1137
00:57:28,710 --> 00:57:32,026
and leave it to it to then
replicate it to the read replicas.
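
The read/write split described here might look something like the sketch below in application code. The `Router` class and the server names are hypothetical, and treating every non-SELECT statement as a write is an assumption made to keep the example safe and short.

```python
import itertools

class Router:
    """Sends SELECTs to read replicas and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Cycle through the replicas so reads are spread round-robin.
        self._replicas = itertools.cycle(replicas)

    def target_for(self, sql: str) -> str:
        """Return the name of the server that should run this statement."""
        # Assumes a non-empty statement; the first word is the SQL verb.
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb == "SELECT":
            return next(self._replicas)
        # INSERT, UPDATE, DELETE (and anything unrecognized) go to the primary.
        return self.primary

router = Router("primary", ["replica1", "replica2"])
first_read = router.target_for("SELECT * FROM users")
second_read = router.target_for("select * from users")
a_write = router.target_for("INSERT INTO users (name) VALUES ('Alice')")
```

Here the two reads land on different replicas, while the write goes to the primary, which is then responsible for replicating it.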

1138
00:57:32,026 --> 00:57:33,900
Now, of course, there
are some problems here.

1139
00:57:33,900 --> 00:57:35,070
There's some latency, potentially.

1140
00:57:35,070 --> 00:57:36,320
Maybe it takes a split second.

1141
00:57:36,320 --> 00:57:39,190
Maybe it takes a couple seconds
for that data to replicate.

1142
00:57:39,190 --> 00:57:43,440
So things might not appear to
be updated instantaneously.
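
To make that latency concrete, here is a toy, in-memory sketch of replication lag. The `Primary` and `Replica` classes are hypothetical stand-ins for real database servers, and the explicit `catch_up` call stands in for whatever delay the network and software introduce.

```python
class Primary:
    """Toy primary database: stores data and logs every write in order."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered (key, value) writes, to be replayed by replicas

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

class Replica:
    """Toy read replica: only sees writes after it has caught up."""

    def __init__(self, primary):
        self.primary = primary
        self.data = {}
        self.applied = 0  # how many log entries have been applied so far

    def read(self, key):
        return self.data.get(key)

    def catch_up(self):
        # Replay any writes from the primary's log not yet applied here.
        for key, value in self.primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.primary.log)

primary = Primary()
replica = Replica(primary)
primary.write("name", "Alice")
before = replica.read("name")  # the write has not replicated yet
replica.catch_up()
after = replica.read("name")   # now the replica has the new value
```

Between the write and the catch-up, a reader of the replica sees stale data, which is exactly the window in which a site might appear not to have updated.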

1143
00:57:43,440 --> 00:57:48,900
But you have now a very scalable model
in that if you have the money to spend,

1144
00:57:48,900 --> 00:57:54,000
you can even have more read replicas and
have even more and more read capacity.

1145
00:57:54,000 --> 00:57:58,080
Of course, you're going to eventually
bump up against a limit on your writes,

1146
00:57:58,080 --> 00:58:00,840
at which point we need to
introduce another solution.

1147
00:58:00,840 --> 00:58:03,690
But again, this is a very
incremental approach.

1148
00:58:03,690 --> 00:58:06,890
And we can throw a little bit of
money at the problem each time

1149
00:58:06,890 --> 00:58:09,090
and a little bit of
engineering wherewithal

1150
00:58:09,090 --> 00:58:11,491
in order to at least get us
over that next ledge, which

1151
00:58:11,491 --> 00:58:14,490
is super important, certainly, when
you're first building your business.

1152
00:58:14,490 --> 00:58:17,582
If you don't necessarily have the
resources to go all in on things,

1153
00:58:17,582 --> 00:58:19,290
you at least want to
get over this hurdle

1154
00:58:19,290 --> 00:58:24,340
or at least build in some capacity
for the next load of users.

1155
00:58:24,340 --> 00:58:27,300
So what if we run out
of capacity, though,

1156
00:58:27,300 --> 00:58:31,090
with that writable server,
the master database, so to speak?

1157
00:58:31,090 --> 00:58:33,270
We need to be a little more clever.

1158
00:58:33,270 --> 00:58:37,860
And it turns out we can borrow this
idea of these horizontal arrows

1159
00:58:37,860 --> 00:58:43,110
here to replicate our data, but
for a slightly different purpose.

1160
00:58:43,110 --> 00:58:47,400
We could still have a pretty
souped up writable database.

1161
00:58:47,400 --> 00:58:51,750
But we could have another one, maybe
identical in its specs, writable.

1162
00:58:51,750 --> 00:58:54,960
But somehow, these things need to be
able to synchronize with each other.

1163
00:58:54,960 --> 00:58:57,920
And maybe there's still some
read replicas over here--

1164
00:58:57,920 --> 00:59:01,200
R for read, and another
one over here, R for read.

1165
00:59:01,200 --> 00:59:03,750
And these are all somehow
interconnected as well.

1166
00:59:03,750 --> 00:59:07,860
But you can have what's called
master-master replication, whereby

1167
00:59:07,860 --> 00:59:12,144
your server's code writes
to one of these servers.

1168
00:59:12,144 --> 00:59:13,560
And maybe it's either of them now.

1169
00:59:13,560 --> 00:59:15,840
Maybe the load balancer actually
does send some of the writes

1170
00:59:15,840 --> 00:59:17,423
this way, some of the writes this way.

1171
00:59:17,423 --> 00:59:20,190
But the master databases,
the writable ones now,

1172
00:59:20,190 --> 00:59:24,534
are configured, in software, to
replicate horizontally, so to speak.

1173
00:59:24,534 --> 00:59:26,700
So here too, you might have
a little bit of latency.

1174
00:59:26,700 --> 00:59:28,491
It might take a few
milliseconds or seconds

1175
00:59:28,491 --> 00:59:30,060
for the data to actually replicate.

1176
00:59:30,060 --> 00:59:34,800
But at least now we've doubled
the capacity for our writes

1177
00:59:34,800 --> 00:59:38,560
so as to handle twice as
many writable operations.

1178
00:59:38,560 --> 00:59:41,790
And we can continue to hang more
and more read replicas off of these

1179
00:59:41,790 --> 00:59:46,740
if you want in order to
handle more and more users.
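
A rough sketch of that master-master arrangement follows. The `Node` class is hypothetical, and the version number used to settle conflicting writes (highest version wins) is an assumption added for illustration; the discussion above does not say how real databases resolve such conflicts.

```python
class Node:
    """Toy writable database that replicates its writes to its peers."""

    def __init__(self, name):
        self.name = name
        self.data = {}    # key -> (version, value)
        self.peers = []   # other writable nodes to replicate to
        self.outbox = []  # local writes not yet sent to peers

    def write(self, key, value, version):
        # Accept the write locally and queue it for replication.
        self._apply(key, value, version)
        self.outbox.append((key, value, version))

    def _apply(self, key, value, version):
        # Assumed conflict rule: keep whichever write has the higher version.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def replicate(self):
        # Push queued writes "horizontally" to every peer.
        for peer in self.peers:
            for key, value, version in self.outbox:
                peer._apply(key, value, version)
        self.outbox.clear()

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

a, b = Node("a"), Node("b")
a.peers, b.peers = [b], [a]
a.write("x", 1, version=1)  # some writes go to one master...
b.write("y", 2, version=1)  # ...and some go to the other
a.replicate()
b.replicate()
```

After both nodes replicate, each holds both `x` and `y`, so either one can serve as the source for its own read replicas.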

1180
00:59:46,740 --> 00:59:50,950
And so this is the challenge and,
dare I say, the fun of engineering

1181
00:59:50,950 --> 00:59:54,192
architecturally-- understanding
some of these basic building blocks.

1182
00:59:54,192 --> 00:59:56,650
And even if you might not know
the particular manufacturers

1183
00:59:56,650 --> 00:59:59,140
or how you physically
configure the servers,

1184
00:59:59,140 --> 01:00:02,300
or how in software you configure
these servers, at the end of the day,

1185
01:00:02,300 --> 01:00:05,950
these really are just puzzle pieces
that can somehow be interlocked.

1186
01:00:05,950 --> 01:00:09,280
And these puzzle pieces can be used
to solve more and more interesting

1187
01:00:09,280 --> 01:00:10,090
problems.

1188
01:00:10,090 --> 01:00:15,760
But to our discussion of PaaS and Software
as a Service and Infrastructure

1189
01:00:15,760 --> 01:00:19,490
as a Service, there's also these
different layers of abstraction.

1190
01:00:19,490 --> 01:00:22,810
And so thematic throughout
this in all of our discussions

1191
01:00:22,810 --> 01:00:23,850
has been this layering.

1192
01:00:23,850 --> 01:00:27,802
Indeed, we started, really, down here
with those zeros and ones and bits,

1193
01:00:27,802 --> 01:00:30,010
and very quickly went to
ASCII, and very quickly went

1194
01:00:30,010 --> 01:00:33,010
to colors and images
and videos and so forth.

1195
01:00:33,010 --> 01:00:36,042
Because once you understand some of
those ingredients or puzzle pieces,

1196
01:00:36,042 --> 01:00:37,750
you can build something
more interesting.

1197
01:00:37,750 --> 01:00:39,790
And then you can slap a name on it--

1198
01:00:39,790 --> 01:00:43,392
sometimes cryptic, like
IaaS, or PaaS, or SaaS.

1199
01:00:43,392 --> 01:00:45,100
But at the end of the
day, those are just

1200
01:00:45,100 --> 01:00:48,580
labels that describe, really,
black boxes, inside of which

1201
01:00:48,580 --> 01:00:52,360
is a decent amount of complexity,
a clever amount of engineering,

1202
01:00:52,360 --> 01:00:55,270
but ultimately, a solution to a problem.

1203
01:00:55,270 --> 01:00:58,810
And so in cloud computing, we really
have this catch-all phrase that's

1204
01:00:58,810 --> 01:01:03,430
referring to a whole class of solutions
to problems that ultimately are all

1205
01:01:03,430 --> 01:01:07,090
about getting one's business or
getting one's personal website

1206
01:01:07,090 --> 01:01:11,140
out on the internet for users to
access, whether via laptops or desktops

1207
01:01:11,140 --> 01:01:12,820
or mobile devices and more.

1208
01:01:12,820 --> 01:01:14,710
So at the end of the
day, what is the cloud?

1209
01:01:14,710 --> 01:01:16,240
It's this evolving definition.

1210
01:01:16,240 --> 01:01:21,230
It's this evolving class of services
that just continues to grow.

1211
01:01:21,230 --> 01:01:23,840
But each of those services
is solving a problem.

1212
01:01:23,840 --> 01:01:30,880
Each of those problems derives from
plugging one hole in a leaky hose,

1213
01:01:30,880 --> 01:01:33,580
seeing another one spring up,
and then addressing that one,

1214
01:01:33,580 --> 01:01:36,430
and then layering on top of those
solutions these abstractions,

1215
01:01:36,430 --> 01:01:39,700
and ultimately some marketing
speak, like cloud computing itself,

1216
01:01:39,700 --> 01:01:43,120
so that you can build, out of these
more sophisticated puzzle pieces,

1217
01:01:43,120 --> 01:01:45,640
bigger and better solutions
to actual problems

1218
01:01:45,640 --> 01:01:49,860
you have when you're trying
to build your own site.

1219
01:01:49,860 --> 01:01:50,896