WEBVTT
X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000

00:00:10.360 --> 00:00:12.340
Cloud computing-- it's
this term that rather

00:00:12.340 --> 00:00:14.330
swept onto the scene in recent years.

00:00:14.330 --> 00:00:17.230
And it sounds like it's some
new and trendy technology.

00:00:17.230 --> 00:00:19.769
But in reality, it's really
just a very nice packaging

00:00:19.769 --> 00:00:22.060
up of a whole number of
technologies that have actually

00:00:22.060 --> 00:00:23.620
been with us for some time.

00:00:23.620 --> 00:00:26.710
In fact, cloud computing,
in its simplest form,

00:00:26.710 --> 00:00:29.800
can really be thought
of as just outsourcing

00:00:29.800 --> 00:00:33.250
the hosting of your applications
and really outsourcing

00:00:33.250 --> 00:00:36.790
the hosting of your physical servers
to someone else-- put another way,

00:00:36.790 --> 00:00:41.140
renting space and renting time
on someone else's computers.

00:00:41.140 --> 00:00:45.520
But these days, we just have so much
computational capability-- that is,

00:00:45.520 --> 00:00:50.650
our computers are so fast, our CPUs
are so many, and we have so much RAM--

00:00:50.650 --> 00:00:53.500
that new and fancier
technologies have lent themselves

00:00:53.500 --> 00:00:56.590
to this trend of hosting
all the more software

00:00:56.590 --> 00:01:00.520
and putting all of the more hardware
off-site in the so-called cloud

00:01:00.520 --> 00:01:05.170
so that companies, both big
and small, no longer need

00:01:05.170 --> 00:01:09.010
to host their own physical hardware
or even a whole number of roles

00:01:09.010 --> 00:01:10.390
in their own local companies.

00:01:10.390 --> 00:01:13.120
And so what we'll do now is
dive into cloud computing,

00:01:13.120 --> 00:01:15.280
look at some of the
problems it solves, look

00:01:15.280 --> 00:01:18.190
at some of the opportunities
it affords, but ultimately,

00:01:18.190 --> 00:01:20.890
take a look from the ground up
at what's underneath the hood

00:01:20.890 --> 00:01:23.080
here so that by the end
of this, we have a better

00:01:23.080 --> 00:01:25.900
understanding of what the
cloud is, why it is useful,

00:01:25.900 --> 00:01:28.030
and what it actually is not.

00:01:28.030 --> 00:01:31.490
So with that said, let's
start with a simple scenario.

00:01:31.490 --> 00:01:34.990
Of course, the cloud
perhaps derives its origins

00:01:34.990 --> 00:01:37.630
from how the internet,
for some time, was drawn,

00:01:37.630 --> 00:01:40.510
which was just this big, nebulous
cloud, in that it doesn't really

00:01:40.510 --> 00:01:41.950
matter what's inside that cloud.

00:01:41.950 --> 00:01:46.330
Although at this point, you most surely
appreciate that inside of this cloud

00:01:46.330 --> 00:01:49.450
are things like routers, and
running through those routers

00:01:49.450 --> 00:01:51.937
are packets, both TCP/IP and the like.

00:01:51.937 --> 00:01:53.770
And underneath the hood,
then, of this cloud

00:01:53.770 --> 00:01:57.520
is some transport mechanism that
gets data from point A to point B.

00:01:57.520 --> 00:02:00.550
So what might those point
A's and point B's be?

00:02:00.550 --> 00:02:04.180
Well, if this here is my little,
old laptop, connected somehow

00:02:04.180 --> 00:02:07.510
to the internet here,
and maybe down here there

00:02:07.510 --> 00:02:11.500
is some web server on which lives
a whole bunch of web pages--

00:02:11.500 --> 00:02:12.550
maybe it's my email.

00:02:12.550 --> 00:02:13.840
Maybe it's the day's news.

00:02:13.840 --> 00:02:16.240
Maybe it's some social
media site or the like.

00:02:16.240 --> 00:02:21.190
I, at point A, want to somehow
connect to point B down here.

00:02:21.190 --> 00:02:24.790
Now, it turns out it's not all
that hard to get a website up

00:02:24.790 --> 00:02:26.230
and running on the internet.

00:02:26.230 --> 00:02:28.540
You can, of course, use
any number of languages.

00:02:28.540 --> 00:02:30.760
You can use any number of databases.

00:02:30.760 --> 00:02:34.270
And you can do it with
relatively little experience,

00:02:34.270 --> 00:02:36.320
just getting something on the internet.

00:02:36.320 --> 00:02:39.250
In fact, it's not all that
hard, relatively speaking,

00:02:39.250 --> 00:02:41.740
to get a prototype of
your application or even

00:02:41.740 --> 00:02:44.590
your first version of your
business up and running.

00:02:44.590 --> 00:02:50.200
But things start to get hard quickly,
especially if you have some success.

00:02:50.200 --> 00:02:53.320
Indeed, a good problem to have is
that you have so many customers and so

00:02:53.320 --> 00:02:57.100
many users hitting your
websites that you can't actually

00:02:57.100 --> 00:02:58.450
handle all of the load.

00:02:58.450 --> 00:03:01.365
Now, it's a good problem in the
sense that business is booming.

00:03:01.365 --> 00:03:03.490
But it's, of course, an
actual problem in the sense

00:03:03.490 --> 00:03:06.160
that your customers aren't going
to be able to visit your web site

00:03:06.160 --> 00:03:08.110
and buy whatever it is
you're selling or read

00:03:08.110 --> 00:03:12.520
whatever it is you're posting if your
servers can't actually handle the load.

00:03:12.520 --> 00:03:16.780
And by load, I simply mean the number
of users per minute or per unit of time

00:03:16.780 --> 00:03:19.390
that your website is
actually experiencing.

00:03:19.390 --> 00:03:21.670
And its capacity, meanwhile,
would be the number

00:03:21.670 --> 00:03:23.500
of users it can actually support.

00:03:23.500 --> 00:03:25.760
Now, why are there these
limits in the first place?

00:03:25.760 --> 00:03:28.030
Well, you may recall
that inside of a computer

00:03:28.030 --> 00:03:30.640
is a CPU, the brains of that computer.

00:03:30.640 --> 00:03:33.580
And inside of a computer
is some memory, like RAM.

00:03:33.580 --> 00:03:36.940
And there might be some longer-term
storage, like hard disk space.

00:03:36.940 --> 00:03:41.320
At the end of the day, all of those
resources and more are finite.

00:03:41.320 --> 00:03:44.290
You can only fit so much
physical hardware in a computer.

00:03:44.290 --> 00:03:47.500
Humans have only been able
to pack so many resources

00:03:47.500 --> 00:03:49.894
into the physical space of a computer.

00:03:49.894 --> 00:03:51.310
And then, of course, there's cost.

00:03:51.310 --> 00:03:54.920
You might be able to only afford
so much computing capacity.

00:03:54.920 --> 00:03:58.800
So if a computer can only do
some number of things per second,

00:03:58.800 --> 00:04:02.154
there is surely an upper bound on
how many people can visit your web

00:04:02.154 --> 00:04:05.320
site, how many people can add things
to their shopping cart, how many people

00:04:05.320 --> 00:04:07.330
can check out with their credit card.

00:04:07.330 --> 00:04:10.940
Because you only have, at the end of
the day, a finite number of resources.

00:04:10.940 --> 00:04:13.060
Now, what does that mean in real terms?

00:04:13.060 --> 00:04:16.540
Well, maybe your web server can
handle 100 users per minute.

00:04:16.540 --> 00:04:18.700
Maybe it can handle
1,000 users per minute.

00:04:18.700 --> 00:04:22.240
Maybe it can handle 1,000 users per
second, or even much more than that.

00:04:22.240 --> 00:04:26.887
It really depends on the specifications
of your hardware-- how much RAM,

00:04:26.887 --> 00:04:29.470
how much CPU and so forth that
you actually have-- and it also

00:04:29.470 --> 00:04:33.430
depends, to some extent, on how
well-written your code is and how fast

00:04:33.430 --> 00:04:37.280
or how slow your code, your
software actually runs.

00:04:37.280 --> 00:04:39.700
So these are knobs that
can ultimately be turned.

00:04:39.700 --> 00:04:42.310
And through testing, you can
figure this out in advance

00:04:42.310 --> 00:04:46.470
by simulating traffic in order to
estimate exactly how many users you

00:04:46.470 --> 00:04:49.000
might be able to handle at a time.

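NOTE
A minimal sketch of estimating capacity by simulating traffic, as just described.
The handle_request function is a hypothetical stand-in for real work, and the
number it reports depends entirely on the machine running it.
```python
import time
def handle_request() -> None:
    # Hypothetical stand-in for real work (rendering a page, querying a database).
    sum(range(1000))
def estimate_capacity(num_requests: int = 2000) -> float:
    """Return the measured requests per second for handle_request on this machine."""
    start = time.perf_counter()
    for _ in range(num_requests):
        handle_request()
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return num_requests / elapsed
if __name__ == "__main__":
    print(f"approx. {estimate_capacity():.0f} requests per second")
```
A real load test would send requests over the network from many clients at once.
This only times the handler itself, so treat the result as an upper bound.
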
00:04:49.000 --> 00:04:53.200
Now, the relevance to today is
that the cloud, so to speak,

00:04:53.200 --> 00:04:56.200
allows us to start to solve
some of these problems

00:04:56.200 --> 00:04:59.890
and also allows us to start
abstracting away the solutions to some

00:04:59.890 --> 00:05:00.640
of these problems.

00:05:00.640 --> 00:05:02.360
Well, let's see what
this actually means.

00:05:02.360 --> 00:05:04.449
So at some point or other--

00:05:04.449 --> 00:05:06.490
especially when it's not
just my laptop, but it's

00:05:06.490 --> 00:05:10.160
like 1,000 laptops, or 10,000 laptops
and desktops and phones and more

00:05:10.160 --> 00:05:12.860
that are somehow trying
to access my server here--

00:05:12.860 --> 00:05:17.000
at some point, we hit that upper
limit whereby no more users can

00:05:17.000 --> 00:05:19.200
fit onto my web site per unit of time.

00:05:19.200 --> 00:05:22.070
So what is the symptom that my
users experience at that point

00:05:22.070 --> 00:05:23.660
if I'm over capacity?

00:05:23.660 --> 00:05:26.210
Well, they might see an
error message of some sort.

00:05:26.210 --> 00:05:28.550
They might just
experience a spinning icon

00:05:28.550 --> 00:05:30.662
because the website is
super slow to respond.

00:05:30.662 --> 00:05:33.120
And maybe it does respond, but
maybe it's 10 seconds later.

00:05:33.120 --> 00:05:36.890
So at the end of the day, they either
have a bad experience or no experience

00:05:36.890 --> 00:05:42.480
whatsoever, because my server can only
handle so many requests at a time.

00:05:42.480 --> 00:05:44.570
So what do you do to solve this problem?

00:05:44.570 --> 00:05:48.500
If one server is not enough, maybe
the most intuitive solution is, well,

00:05:48.500 --> 00:05:51.500
if one server is not
giving me enough headroom,

00:05:51.500 --> 00:05:53.490
why don't I just have two servers?

00:05:53.490 --> 00:05:54.890
So let's go ahead and do that.

00:05:54.890 --> 00:05:58.370
Instead of having just one server,
let's go ahead and have two.

00:05:58.370 --> 00:06:01.790
And let me propose that on the second
server, it's the exact same software.

00:06:01.790 --> 00:06:05.150
So whatever code I've written, in
whatever language it's written,

00:06:05.150 --> 00:06:08.600
I just have copies of my web
site on both the original server

00:06:08.600 --> 00:06:10.520
and the second server.

00:06:10.520 --> 00:06:14.270
Now I've solved the
problem in the simple sense

00:06:14.270 --> 00:06:16.010
that I've doubled my capacity.

00:06:16.010 --> 00:06:18.570
If one server can handle
1,000 people per second,

00:06:18.570 --> 00:06:21.714
well, then surely two servers can
handle 2,000 people per second,

00:06:21.714 --> 00:06:22.880
so I've doubled my capacity.

00:06:22.880 --> 00:06:23.870
So that's good.

00:06:23.870 --> 00:06:26.000
I've hopefully solved the problem.

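NOTE
The doubling argument above can be sketched as simple arithmetic. This assumes
identical servers and perfectly even traffic, which real systems only approximate.
The function names are illustrative.
```python
import math
def total_capacity(servers: int, per_server: int) -> int:
    # Idealized: identical servers, traffic split perfectly evenly.
    return servers * per_server
def servers_needed(load: int, per_server: int) -> int:
    # Round up, since you cannot run a fraction of a server.
    return math.ceil(load / per_server)
print(total_capacity(2, 1000))     # 2000
print(servers_needed(2500, 1000))  # 3
```
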
00:06:26.000 --> 00:06:27.929
But it's not quite as simple as that.

00:06:27.929 --> 00:06:30.845
At least pictorially, I'm still
pointing at just one of those servers,

00:06:30.845 --> 00:06:33.890
so we're going to have to clean
up this picture alone and somehow

00:06:33.890 --> 00:06:36.680
figure out how to get users--

00:06:36.680 --> 00:06:38.450
or more generally, traffic--

00:06:38.450 --> 00:06:40.820
to both of these servers.

00:06:40.820 --> 00:06:43.740
I could just naively
draw an arrow like this.

00:06:43.740 --> 00:06:45.380
But what does that actually mean?

00:06:45.380 --> 00:06:47.780
We don't want to abstract
away so much of the detail

00:06:47.780 --> 00:06:50.460
that we're ignoring this problem.

00:06:50.460 --> 00:06:54.860
How do we implement this notion of
choosing between left arrow and right

00:06:54.860 --> 00:06:55.580
arrow?

00:06:55.580 --> 00:06:59.210
Well, let's consider what
our solutions might be.

00:06:59.210 --> 00:07:02.930
If a user, like me on my laptop,
is trying to visit this web site--

00:07:02.930 --> 00:07:06.740
and the web site, ideally, is going
to live at something like example.com,

00:07:06.740 --> 00:07:10.122
or facebook.com, or
gmail.com, or whatever--

00:07:10.122 --> 00:07:12.830
I don't want to have to broadcast
different names for my servers.

00:07:12.830 --> 00:07:14.910
And you might actually
notice this on the internet.

00:07:14.910 --> 00:07:18.160
You might notice, if you start noticing
the URLs of websites you're visiting--

00:07:18.160 --> 00:07:21.290
especially for certain older, stodgier
companies who haven't necessarily

00:07:21.290 --> 00:07:23.240
implemented this in
the most modern way--

00:07:23.240 --> 00:07:27.239
you might find yourself not
just at www.something.com,

00:07:27.239 --> 00:07:29.780
but if you look closely, you
might find yourself occasionally

00:07:29.780 --> 00:07:35.310
at www1.something.com,
www2.something.com,

00:07:35.310 --> 00:07:38.590
or even www13.something.com.

00:07:38.590 --> 00:07:43.670
Which is to say that some companies
appear to solve this problem by just

00:07:43.670 --> 00:07:45.140
giving different names--

00:07:45.140 --> 00:07:48.920
similar names, but different names--
to their two servers, three servers,

00:07:48.920 --> 00:07:51.200
13 servers, or however many they have.

00:07:51.200 --> 00:07:54.770
And then they somehow redirect
users from their main domain

00:07:54.770 --> 00:08:00.440
name, www.something.com, to any one
of those two or three or 13 servers.

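NOTE
A rough sketch of the numbered-hostname approach, assuming the main site simply
picks one of www1 through wwwN at random. The function names are hypothetical.
It also previews the problem with this scheme: once the server count shrinks,
a name someone saved earlier stops existing.
```python
import random
def redirect_target(domain: str, server_count: int, rng: random.Random) -> str:
    # Pick one of www1..wwwN; the user's browser is then redirected there.
    n = rng.randint(1, server_count)
    return f"www{n}.{domain}"
def still_exists(host: str, domain: str, server_count: int) -> bool:
    # True only if the numbered host is among the servers currently running.
    live = {f"www{n}.{domain}" for n in range(1, server_count + 1)}
    return host in live
print(redirect_target("something.com", 13, random.Random()))
print(still_exists("www13.something.com", "something.com", 13))  # True
print(still_exists("www13.something.com", "something.com", 6))   # False
```
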
00:08:00.440 --> 00:08:01.919
But this isn't very elegant.

00:08:01.919 --> 00:08:03.710
The marketing folks
would surely hate this,

00:08:03.710 --> 00:08:06.770
because you're trying to build some
brand recognition around your URL.

00:08:06.770 --> 00:08:10.370
Why would you dirty it by just putting
these arbitrary numbers in the URLs?

00:08:10.370 --> 00:08:13.940
Plus if you fast forward
a bit in this story,

00:08:13.940 --> 00:08:16.610
if, for some reason down
the road, you get fancier,

00:08:16.610 --> 00:08:18.382
bigger servers that
can handle more users,

00:08:18.382 --> 00:08:20.090
and therefore you
don't need 13 of them--

00:08:20.090 --> 00:08:22.220
you can get away with just six of them--

00:08:22.220 --> 00:08:24.950
well, what happens if some of
your customers have bookmarked,

00:08:24.950 --> 00:08:30.320
very reasonably, one of those older
names, like www13.something.com?

00:08:30.320 --> 00:08:33.799
So now when they try to visit that
URL, gosh, they might hit a dead end.

00:08:33.799 --> 00:08:35.632
So you could solve
that in some other way.

00:08:35.632 --> 00:08:38.090
But the point is it would seem
to create a problem quickly,

00:08:38.090 --> 00:08:40.159
and it's just a naming mess.

00:08:40.159 --> 00:08:44.270
Why actually bother having
your users see something

00:08:44.270 --> 00:08:46.020
as messy as these numbered servers?

00:08:46.020 --> 00:08:49.400
It would be nice to do this
a little more transparently.

00:08:49.400 --> 00:08:50.930
So how could we do this?

00:08:50.930 --> 00:08:54.140
Well, let me propose that we
kind of need some middleman here,

00:08:54.140 --> 00:08:57.740
so to speak, whereby traffic comes
from people like me on the internet

00:08:57.740 --> 00:09:00.400
and then either goes to the
left or goes to the right,

00:09:00.400 --> 00:09:02.780
or no matter how many
servers we have, goes

00:09:02.780 --> 00:09:05.750
to one of those actual web servers.

00:09:05.750 --> 00:09:09.500
So how does this middleman-- and
to borrow some past terminology,

00:09:09.500 --> 00:09:12.320
how does this black
box potentially work?

00:09:12.320 --> 00:09:14.030
Well, let's consider
some of the building

00:09:14.030 --> 00:09:18.530
blocks, some of the puzzle pieces we
have technologically at our disposal

00:09:18.530 --> 00:09:19.400
now.

00:09:19.400 --> 00:09:21.770
You may recall that every
server on the internet

00:09:21.770 --> 00:09:25.641
has an IP address, an internet protocol
address, a unique address for it.

00:09:25.641 --> 00:09:27.890
And that's, again, a bit of
a white lie, because there

00:09:27.890 --> 00:09:30.410
are technologies by which
you can have private IP

00:09:30.410 --> 00:09:32.690
addresses that the
outside world doesn't see.

00:09:32.690 --> 00:09:35.780
But let's stipulate,
for today's purposes,

00:09:35.780 --> 00:09:38.810
that every computer on the
internet certainly has an IP

00:09:38.810 --> 00:09:41.550
address, whether public or private.

00:09:41.550 --> 00:09:46.400
So maybe, just maybe, we could
leverage an existing technology--

00:09:46.400 --> 00:09:48.680
DNS, the Domain Name System--

00:09:48.680 --> 00:09:52.880
so that rather than only return
one IP address of a server

00:09:52.880 --> 00:09:57.470
when you look up www.something.com,
we return the IP address

00:09:57.470 --> 00:09:59.610
of the server on the
left some of the time

00:09:59.610 --> 00:10:02.990
or the IP address of the server
on the right some of the time,

00:10:02.990 --> 00:10:07.160
effectively balancing our load,
our traffic across the two servers.

00:10:07.160 --> 00:10:10.060
And in fact, if you
do this 50-50, you can

00:10:10.060 --> 00:10:12.640
take, really, what's called
a round robin approach,

00:10:12.640 --> 00:10:17.270
and ideally uniformly distribute
your traffic across multiple servers.

00:10:17.270 --> 00:10:20.140
And what's nice in this model is
that because you're using DNS,

00:10:20.140 --> 00:10:22.472
the user doesn't really
notice what's going on.

00:10:22.472 --> 00:10:24.430
At the end of the day,
none of us humans really

00:10:24.430 --> 00:10:27.130
care what IP address we're
actually going to if we visit

00:10:27.130 --> 00:10:29.410
Facebook.com or Gmail.com or the like.

00:10:29.410 --> 00:10:33.730
We just care that our computer can find
that server or servers on the internet.

00:10:33.730 --> 00:10:38.020
So via DNS, we could, very
cleverly, via this middleman here,

00:10:38.020 --> 00:10:42.010
which is really just going to be some
third device, some separate server--

00:10:42.010 --> 00:10:45.940
it, as a DNS device, could
just respond to requests

00:10:45.940 --> 00:10:49.630
from customers with either this
IP address or this IP address,

00:10:49.630 --> 00:10:52.840
or any number of different IP addresses.

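NOTE
A minimal sketch of round robin DNS as described above: a toy resolver that
hands out each server's address in turn. The class name and the addresses
(from the documentation range 192.0.2.0/24) are assumptions for illustration.
Real DNS servers often return the whole list in rotating order instead.
```python
from itertools import cycle
class RoundRobinDNS:
    def __init__(self, records: dict[str, list[str]]) -> None:
        # One endless rotation of IP addresses per domain name.
        self._rotations = {name: cycle(ips) for name, ips in records.items()}
    def resolve(self, name: str) -> str:
        # Each lookup returns the next address in the rotation.
        return next(self._rotations[name])
dns = RoundRobinDNS({"www.something.com": ["192.0.2.10", "192.0.2.20"]})
print([dns.resolve("www.something.com") for _ in range(4)])
# ['192.0.2.10', '192.0.2.20', '192.0.2.10', '192.0.2.20']
```
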
00:10:52.840 --> 00:10:56.130
So does this solve the problem?

00:10:56.130 --> 00:10:58.020
Again, most everything
in computer science

00:10:58.020 --> 00:11:01.200
would seem to be a tradeoff
at the end of the day.

00:11:01.200 --> 00:11:03.690
And this seems almost too
good to be true, perhaps.

00:11:03.690 --> 00:11:04.560
It's so simple.

00:11:04.560 --> 00:11:06.270
It leverages an existing technology.

00:11:06.270 --> 00:11:07.750
It just works.

00:11:07.750 --> 00:11:10.440
So what prices might we pay?

00:11:10.440 --> 00:11:14.325
Well, DNS, it turns out,
gets cached quite a bit.

00:11:14.325 --> 00:11:15.450
And what does caching mean?

00:11:15.450 --> 00:11:18.626
Caching something means
keeping some past answer--

00:11:18.626 --> 00:11:20.750
or more generally, piece
of information-- around so

00:11:20.750 --> 00:11:26.350
that you can access it more quickly the
second and the third time and beyond.

00:11:26.350 --> 00:11:30.360
And so computers today,
Macs and PCs, as well as

00:11:30.360 --> 00:11:33.480
servers on the internet, other
DNS servers on the internet,

00:11:33.480 --> 00:11:37.170
for performance reasons, will
often remember the responses

00:11:37.170 --> 00:11:38.910
that they get from DNS servers.

00:11:38.910 --> 00:11:42.690
For instance, if, on my Mac, I
visit Facebook.com, hypothetically

00:11:42.690 --> 00:11:47.370
a lot of times during the day, it's kind
of stupid if my laptop, again and again

00:11:47.370 --> 00:11:49.620
and again and again,
asks some DNS server

00:11:49.620 --> 00:11:52.110
for Facebook.com's IP
address if it already

00:11:52.110 --> 00:11:55.020
asked that same question an hour
ago-- or more realistically,

00:11:55.020 --> 00:11:57.390
two minutes ago, or something like that.

00:11:57.390 --> 00:12:01.590
It would be smarter if my operating
system-- or even my browser, Chrome

00:12:01.590 --> 00:12:03.540
or Firefox or whatever
I'm using-- actually

00:12:03.540 --> 00:12:08.210
remembers that answer for me so that
my computer can just pull up that web

00:12:08.210 --> 00:12:13.740
site faster by skipping a step, by
not wasting time asking a server again

00:12:13.740 --> 00:12:15.750
for the IP address of a server.

00:12:15.750 --> 00:12:19.860
And after all, IP addresses, it turns
out, generally don't change that often.

00:12:19.860 --> 00:12:23.880
It's certainly possible for a company
or a university or even a home user

00:12:23.880 --> 00:12:25.840
to change their computer's IP addresses.

00:12:25.840 --> 00:12:28.350
But the reality is it doesn't
change all that often.

00:12:28.350 --> 00:12:30.490
The common case is to
have the same IP address

00:12:30.490 --> 00:12:33.960
now as you might an hour from now,
or even a day or a week or a month

00:12:33.960 --> 00:12:34.860
from now.

00:12:34.860 --> 00:12:37.530
But the key thing is that it can change.

00:12:37.530 --> 00:12:41.600
And especially if you're worried about
customers-- not just some personal web

00:12:41.600 --> 00:12:43.260
site, but you might lose business.

00:12:43.260 --> 00:12:45.960
You might lose orders if users
can't visit your website.

00:12:45.960 --> 00:12:49.470
Anything that puts your
server's uptime, so to speak--

00:12:49.470 --> 00:12:51.480
being accessible on
the internet-- at risk

00:12:51.480 --> 00:12:53.790
probably is worthy of
some consideration.

00:12:53.790 --> 00:12:57.884
So let me propose, then, that just one
of these servers goes offline somehow.

00:12:57.884 --> 00:12:58.800
Maybe it's deliberate.

00:12:58.800 --> 00:13:00.330
You need to do some service for it.

00:13:00.330 --> 00:13:03.270
Or maybe it crashed in some way,
or it got unplugged somehow,

00:13:03.270 --> 00:13:07.530
or something went wrong such that
now, one or more of your servers,

00:13:07.530 --> 00:13:11.940
across which you've been load balancing,
no longer can talk to the internet.

00:13:11.940 --> 00:13:12.960
What might happen?

00:13:12.960 --> 00:13:15.510
Well, if some customer's
Mac, like my own,

00:13:15.510 --> 00:13:20.310
has remembered or cached that
particular server's IP address,

00:13:20.310 --> 00:13:21.870
that is not a good situation.

00:13:21.870 --> 00:13:24.150
Because your Mac or PC
or whatever is going

00:13:24.150 --> 00:13:27.450
to now try to revisit your
web site again and again

00:13:27.450 --> 00:13:33.660
and again at that old cached IP address
that apparently can be a dead end.

00:13:33.660 --> 00:13:38.370
And so even though you still have
servers that could potentially

00:13:38.370 --> 00:13:41.550
handle that customer's
request, that customer's order,

00:13:41.550 --> 00:13:44.520
that customer's desire
to check out, he or she

00:13:44.520 --> 00:13:46.650
really is still not
going to be able to visit

00:13:46.650 --> 00:13:49.290
the website unless that cache expires.

00:13:49.290 --> 00:13:52.230
Maybe they reboot their computer
so that the cache forcibly expires.

00:13:52.230 --> 00:13:55.140
Maybe they just wait some amount
of time so that that IP address

00:13:55.140 --> 00:13:57.510
is forgotten by the browser
or by the operating system

00:13:57.510 --> 00:14:01.950
or by some other DNS server
until the new, available IP

00:14:01.950 --> 00:14:03.690
addresses are picked up instead.

00:14:03.690 --> 00:14:04.830
But there is that risk.

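NOTE
The caching risk above can be sketched with a toy resolver that remembers
answers for a fixed time to live. The class, the 300-second TTL, and the
addresses are assumptions. Time is passed in explicitly to keep the example
deterministic.
```python
class CachingResolver:
    def __init__(self, lookup, ttl_seconds: float) -> None:
        self._lookup = lookup      # function that asks the real DNS server
        self._ttl = ttl_seconds
        self._cache = {}           # name -> (ip, expiry time)
    def resolve(self, name: str, now: float) -> str:
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]        # cached answer, possibly stale
        ip = self._lookup(name)
        self._cache[name] = (ip, now + self._ttl)
        return ip
answers = iter(["192.0.2.10", "192.0.2.20"])
resolver = CachingResolver(lambda name: next(answers), ttl_seconds=300)
print(resolver.resolve("www.something.com", now=0))    # 192.0.2.10
print(resolver.resolve("www.something.com", now=60))   # 192.0.2.10 (cached, even if that server is down)
print(resolver.resolve("www.something.com", now=301))  # 192.0.2.20 (cache expired, fresh lookup)
```
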
00:14:04.830 --> 00:14:07.470
And I would argue that this
risk is even higher especially

00:14:07.470 --> 00:14:11.910
for companies that might be considering
moving their infrastructure from one

00:14:11.910 --> 00:14:13.200
service to another.

00:14:13.200 --> 00:14:16.890
If you're deliberately going to move
your servers from one IP address

00:14:16.890 --> 00:14:20.010
to another, as might happen if you
change cloud providers, so to speak--

00:14:20.010 --> 00:14:21.240
more on those in a minute--

00:14:21.240 --> 00:14:24.790
really, if you change the companies
that you're using to host your servers,

00:14:24.790 --> 00:14:26.460
your IP addresses will change.

00:14:26.460 --> 00:14:29.310
And you certainly don't want to
incur a huge amount of downtime

00:14:29.310 --> 00:14:30.550
in a situation like that.

00:14:30.550 --> 00:14:32.130
So there are these tradeoffs.

00:14:32.130 --> 00:14:35.040
Easy solution, technologically
pretty inexpensive to do.

00:14:35.040 --> 00:14:37.840
It just works using existing technology.

00:14:37.840 --> 00:14:40.960
But you open yourself up to this risk.

00:14:40.960 --> 00:14:42.300
So let's address that.

00:14:42.300 --> 00:14:44.837
Putting back on the old
proverbial engineering hat,

00:14:44.837 --> 00:14:46.170
let's try to solve this problem.

00:14:46.170 --> 00:14:48.870
It seems that giving a unique
IP address to this server

00:14:48.870 --> 00:14:52.110
and to this server, and any number
of other servers that are back there,

00:14:52.110 --> 00:14:56.400
might not be the smartest idea
insofar as those IPs can get cached.

00:14:56.400 --> 00:15:01.120
So what if we use DNS as follows?

00:15:01.120 --> 00:15:05.220
When my laptop or anyone else's requests
the IP address for www.something.com,

00:15:05.220 --> 00:15:08.860
why don't we return the IP
address of this device here--

00:15:08.860 --> 00:15:11.680
this load balancer, as
we'll start calling it,

00:15:11.680 --> 00:15:15.360
where a load balancer is
usually just a physical device,

00:15:15.360 --> 00:15:18.840
or multiple physical devices, whose
purpose in life is to balance load?

00:15:18.840 --> 00:15:22.140
Packets come in, and similar
in spirit to a router,

00:15:22.140 --> 00:15:25.860
they do route information to the left,
to the right, or some other direction.

00:15:25.860 --> 00:15:29.910
But their overarching purpose isn't just
to get data from point A to point B,

00:15:29.910 --> 00:15:32.970
but to somehow intelligently
balance that traffic

00:15:32.970 --> 00:15:37.860
over multiple possible destinations
for point B, identical servers

00:15:37.860 --> 00:15:39.390
in the case of our story here.

00:15:39.390 --> 00:15:42.810
So what if, instead, we addressed
this problem of potential downtime

00:15:42.810 --> 00:15:46.830
by returning the IP address
of the load balancer,

00:15:46.830 --> 00:15:49.560
and then, by nature of
private IP addresses

00:15:49.560 --> 00:15:52.110
or some other mechanism
that the end user does not

00:15:52.110 --> 00:15:56.920
need to know or care about, this load
balancer somehow routes the traffic

00:15:56.920 --> 00:16:00.110
to either the first device
or the second device,

00:16:00.110 --> 00:16:03.640
LB here being our load balancer?

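NOTE
A minimal sketch of the arrangement just described: DNS returns only the load
balancer's address, and the load balancer forwards each request to a private
backend. The class, the addresses, and the simple rotation policy are all
assumptions. Real load balancers may weigh servers by how busy they are.
```python
from itertools import cycle
class LoadBalancer:
    def __init__(self, public_ip: str, backends: list[str]) -> None:
        self.public_ip = public_ip        # the only address DNS hands out
        self._backends = cycle(backends)  # private addresses users never see
    def route(self, request: str) -> tuple[str, str]:
        # Forward the request to the next backend in the rotation.
        return next(self._backends), request
lb = LoadBalancer("203.0.113.5", ["10.0.0.1", "10.0.0.2"])
dns = {"www.something.com": lb.public_ip}
for path in ["/", "/cart", "/checkout"]:
    backend, _ = lb.route(path)
    print(dns["www.something.com"], "forwards", path, "to", backend)
```
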
00:16:03.640 --> 00:16:05.860
So we seem to have
solved this problem.

00:16:05.860 --> 00:16:08.920
Insofar as we have now
configured our DNS servers

00:16:08.920 --> 00:16:12.250
to return the IP address
of the load balancer,

00:16:12.250 --> 00:16:15.760
there's no problem of downtime
as we described a moment ago.

00:16:15.760 --> 00:16:20.890
Because if Server 1 goes offline
for whatever reason, no big deal.

00:16:20.890 --> 00:16:25.510
The load balancer should hopefully
just notice that and subsequently start

00:16:25.510 --> 00:16:29.710
proactively routing all incoming data
that reaches its IP address to Server 2

00:16:29.710 --> 00:16:30.940
and not Server 1.

00:16:30.940 --> 00:16:32.819
Now, how does the load balancer know?

00:16:32.819 --> 00:16:34.360
Well, either a human could intervene.

00:16:34.360 --> 00:16:37.240
Maybe someone gets a late night
call or text or page saying,

00:16:37.240 --> 00:16:39.490
uh oh, Server 1 is down,
you better do something.

00:16:39.490 --> 00:16:42.070
And then he or she can manually
configure the load balancer

00:16:42.070 --> 00:16:44.590
to no longer send any
traffic to Server 1.

00:16:44.590 --> 00:16:47.620
That seems kind of stupid in an age
of automation and smart software.

00:16:47.620 --> 00:16:48.730
Maybe we can do better.

00:16:48.730 --> 00:16:49.870
And indeed, we can.

00:16:49.870 --> 00:16:52.630
A technique that's
often used by servers is

00:16:52.630 --> 00:16:55.570
something modeled from
the human world to use

00:16:55.570 --> 00:16:58.690
what you might describe as
heartbeats to actually configure

00:16:58.690 --> 00:17:02.350
the load balancer and Servers
1 and 2 to operate as follows.

00:17:02.350 --> 00:17:05.740
Maybe every second, every half a
second, maybe every five seconds

00:17:05.740 --> 00:17:10.359
you configure Server 1 and Server 2
to send some kind of heartbeat message

00:17:10.359 --> 00:17:11.750
to the load balancer.

00:17:11.750 --> 00:17:14.770
This is just a TCP/IP packet,
some kind of network packet

00:17:14.770 --> 00:17:17.650
that's the equivalent
of saying I'm alive.

00:17:17.650 --> 00:17:18.339
I'm alive.

00:17:18.339 --> 00:17:23.770
Or more goofily, like boom, boom, boom,
boom, ergo the heartbeat metaphor.

00:17:23.770 --> 00:17:26.680
But the point is that 1 and 2,
and any number of other servers,

00:17:26.680 --> 00:17:29.710
should be configured to
just constantly reassure

00:17:29.710 --> 00:17:31.800
the load balancer that they are alive.

00:17:31.800 --> 00:17:32.770
They are accessible.

00:17:32.770 --> 00:17:34.780
They are ready to receive traffic.

00:17:34.780 --> 00:17:36.809
And the load balancer, similarly--

00:17:36.809 --> 00:17:39.100
and you might see where this
is going-- can very simply

00:17:39.100 --> 00:17:42.100
be configured to listen
for that heartbeat.

00:17:42.100 --> 00:17:46.060
And if it ever doesn't hear a
heartbeat from Server 1 or Server 2,

00:17:46.060 --> 00:17:48.790
it should just assume
that something is wrong.

00:17:48.790 --> 00:17:50.030
The server has died.

00:17:50.030 --> 00:17:50.830
It's gone offline.

00:17:50.830 --> 00:17:52.390
Something bad has happened.

00:17:52.390 --> 00:17:55.090
So the load balancer
subsequently should simply not

00:17:55.090 --> 00:17:58.030
route any traffic to
that particular server

00:17:58.030 --> 00:18:00.490
until some human or
some automated process

00:18:00.490 --> 00:18:04.300
brings the server back alive, so to
speak, and the heartbeat resumes.

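NOTE
The heartbeat scheme above can be sketched as follows, assuming the load
balancer records when it last heard from each server and ignores any server
that has been silent longer than a timeout. The class name and the 3-second
timeout are illustrative, and time is passed in explicitly.
```python
class HeartbeatMonitor:
    def __init__(self, servers: list[str], timeout: float) -> None:
        self._timeout = timeout
        # None means no heartbeat has been heard from that server yet.
        self._last_seen = {server: None for server in servers}
    def heartbeat(self, server: str, now: float) -> None:
        # Called whenever a server's "I'm alive" message arrives.
        self._last_seen[server] = now
    def healthy(self, now: float) -> list[str]:
        # Only servers heard from within the timeout should receive traffic.
        return [s for s, seen in self._last_seen.items()
                if seen is not None and now - seen <= self._timeout]
monitor = HeartbeatMonitor(["server1", "server2"], timeout=3.0)
monitor.heartbeat("server1", now=0.0)
monitor.heartbeat("server2", now=0.0)
monitor.heartbeat("server2", now=4.0)   # server1 has gone quiet
print(monitor.healthy(now=5.0))         # ['server2']
monitor.heartbeat("server1", now=6.0)   # heartbeat resumes
print(monitor.healthy(now=6.5))         # ['server1', 'server2']
```
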
00:18:04.300 --> 00:18:06.880
Now, of course, this problem
doesn't go away permanently.

00:18:06.880 --> 00:18:09.630
If Servers 1 and 2 both stop
emitting a heartbeat,

00:18:09.630 --> 00:18:11.447
we really have no capacity for users.

00:18:11.447 --> 00:18:13.030
But that would be an extreme scenario.

00:18:13.030 --> 00:18:17.714
Hopefully just one or a few of
our servers go offline in that way.

00:18:17.714 --> 00:18:20.380
So we can configure our servers
for these heartbeats, which is--

00:18:20.380 --> 00:18:21.190
think about it--

00:18:21.190 --> 00:18:25.100
a very simple physiologically-inspired
solution to a problem.

00:18:25.100 --> 00:18:27.730
And even if it's not obvious
how you'd implement it in code,

00:18:27.730 --> 00:18:30.370
it really is just an algorithm,
a simple set of instructions

00:18:30.370 --> 00:18:33.700
with which we can solve this problem.

00:18:33.700 --> 00:18:36.690
And yet, dammit, we've
introduced a new problem.

00:18:36.690 --> 00:18:40.050
And so this really is
the old leaky hose,

00:18:40.050 --> 00:18:43.150
where just as we've plugged
one leak or solved one problem,

00:18:43.150 --> 00:18:46.420
another one has sprung up
somewhere else along the line.

00:18:46.420 --> 00:18:48.850
So what's the problem now?

00:18:48.850 --> 00:18:49.870
What's the problem now?

00:18:49.870 --> 00:18:54.520
The whole motivation of introducing
Server Number 2, in addition

00:18:54.520 --> 00:18:58.930
to Server Number 1, was to make
sure that we have enough capacity,

00:18:58.930 --> 00:19:03.070
and better yet, to make sure that if
Server 1 or Server 2 goes offline,

00:19:03.070 --> 00:19:05.650
the other one can hopefully
pick up the load unless it's

00:19:05.650 --> 00:19:09.320
a super busy time with lots and
lots of users visiting all at once.

00:19:09.320 --> 00:19:12.070
So in fact, the general
idea at play here

00:19:12.070 --> 00:19:20.770
is high availability, ensuring
that if one server goes down,

00:19:20.770 --> 00:19:22.930
you have other servers
that can pick up the load.

00:19:22.930 --> 00:19:26.419
Being highly available means you
can be tolerant to issues like that.

00:19:26.419 --> 00:19:28.210
And then load balancing,
of course, is just

00:19:28.210 --> 00:19:31.390
the mere process of splitting the
load across those two endpoints.

00:19:31.390 --> 00:19:34.390
But we have introduced another problem.

00:19:34.390 --> 00:19:44.710
This might be abbreviated SPOF, or more
explicitly, Single Point Of Failure.

00:19:44.710 --> 00:19:48.040
Just as I've solved one problem
by introducing this load balancer,

00:19:48.040 --> 00:19:50.800
so have I introduced a new
problem, which is this.

00:19:50.800 --> 00:19:52.870
There is now, as you
might infer from the name

00:19:52.870 --> 00:19:54.730
alone, a single point of failure.

00:19:54.730 --> 00:19:59.050
It's fine that I can now tolerate
Server 1 or Server 2 going down,

00:19:59.050 --> 00:20:01.920
but what can I not tolerate, clearly?

00:20:01.920 --> 00:20:04.430
What if the load balancer goes down?

00:20:04.430 --> 00:20:06.160
So this is a very real concern.

00:20:06.160 --> 00:20:09.071
Maybe the load balancer
itself gets overloaded.

00:20:09.071 --> 00:20:11.320
Maybe the load balancer
itself has some kind of issue.

00:20:11.320 --> 00:20:13.270
And if the load balancer
goes down, it doesn't

00:20:13.270 --> 00:20:16.120
matter how many web
servers I have down here,

00:20:16.120 --> 00:20:19.900
or how much money I've spent down
here to ensure my high availability.

00:20:19.900 --> 00:20:25.100
My server is offline if this single
point of failure indeed fails.

00:20:25.100 --> 00:20:27.640
Now, you'd like to think
that the load balancer--

00:20:27.640 --> 00:20:30.070
especially since it only
has one job in life--

00:20:30.070 --> 00:20:33.270
can at least handle more traffic
than any individual server.

00:20:33.270 --> 00:20:36.490
Indeed, clearly, it must be
the case that the load balancer

00:20:36.490 --> 00:20:39.490
is fast enough and capable
enough to handle twice as

00:20:39.490 --> 00:20:41.840
much traffic as any individual server.

00:20:41.840 --> 00:20:46.999
But that's generally accepted as
feasible insofar as your website,

00:20:46.999 --> 00:20:48.790
your real intellectual
property, is probably

00:20:48.790 --> 00:20:50.590
doing a lot of work--
talking to a database,

00:20:50.590 --> 00:20:53.506
writing out files, downloading things,
or any number of other features

00:20:53.506 --> 00:20:56.800
that just take more effort than
just routing data from one server

00:20:56.800 --> 00:20:58.990
to another as a load balancer does.

00:20:58.990 --> 00:21:00.970
But it doesn't matter
how performant it is.

00:21:00.970 --> 00:21:04.240
If the load balancer breaks,
goes offline for some reason,

00:21:04.240 --> 00:21:08.120
your entire infrastructure
is inaccessible.

00:21:08.120 --> 00:21:09.980
So how do we solve this?

00:21:09.980 --> 00:21:13.510
How do we go about
architecting a solution to this?

00:21:13.510 --> 00:21:15.910
Well, how did we address
this issue earlier?

00:21:15.910 --> 00:21:20.260
We addressed the issue of insufficient
capacity or potential downtime

00:21:20.260 --> 00:21:22.960
by just throwing
hardware at the problem.

00:21:22.960 --> 00:21:25.940
And so maybe we could
do that same thing here.

00:21:25.940 --> 00:21:29.260
Maybe we could just introduce
a second load balancer.

00:21:29.260 --> 00:21:31.540
I'll call this LB as well.

00:21:31.540 --> 00:21:33.940
And now we somehow have to--

00:21:33.940 --> 00:21:39.640
I feel like we're just endlessly going
to be adding more and more rectangles

00:21:39.640 --> 00:21:40.540
to the picture.

00:21:40.540 --> 00:21:46.480
But somehow, we need to be able to load
balance across now two servers and two

00:21:46.480 --> 00:21:47.980
load balancers.

00:21:47.980 --> 00:21:48.860
So how do we do this?

00:21:48.860 --> 00:21:52.660
Well, let me clean this up so that we
have a bit more room to play with here

00:21:52.660 --> 00:21:57.260
and consider how a pair of load
balancers might actually work.

00:21:57.260 --> 00:22:01.510
So if my first server is here
and my second server is here,

00:22:01.510 --> 00:22:07.720
and I'm proposing now to have two load
balancers-- one here and one here--

00:22:07.720 --> 00:22:12.460
surely, both of these have to
be able to talk to both servers.

00:22:12.460 --> 00:22:15.100
So we already have this necessity.

00:22:15.100 --> 00:22:18.820
And somehow, traffic has
to come from the internet

00:22:18.820 --> 00:22:23.497
into this set of load balancers,
but probably only to one,

00:22:23.497 --> 00:22:25.330
because we don't want
to solve this with DNS

00:22:25.330 --> 00:22:27.370
and just have two IP
addresses out there.

00:22:27.370 --> 00:22:30.160
Because if one breaks, we
can recreate the same problem

00:22:30.160 --> 00:22:32.090
as before if we're not careful.

00:22:32.090 --> 00:22:33.140
So what if we do this?

00:22:33.140 --> 00:22:37.390
What if we use this building block
of heartbeats in another way as well?

00:22:37.390 --> 00:22:40.600
What if we ensure that
our load balancers--

00:22:40.600 --> 00:22:45.740
plural-- have just one IP
address, which a moment ago seemed

00:22:45.740 --> 00:22:47.240
to create a single point of failure?

00:22:47.240 --> 00:22:48.590
But what if we do this?

00:22:48.590 --> 00:22:52.330
What if we also allow the
load balancers to talk to,

00:22:52.330 --> 00:22:57.940
to communicate over a network with each
other so that one of the load balancers

00:22:57.940 --> 00:23:00.940
is constantly saying to
the other, I'm alive.

00:23:00.940 --> 00:23:02.020
I'm alive.

00:23:02.020 --> 00:23:03.290
I'm alive.

00:23:03.290 --> 00:23:06.310
And so what the load balancers
could be configured to do

00:23:06.310 --> 00:23:10.400
is that only one of them operates
at any given point in time.

00:23:10.400 --> 00:23:14.830
But if the other server,
the other load balancer,

00:23:14.830 --> 00:23:19.330
no longer hears from that primary load
balancer because of the heartbeats

00:23:19.330 --> 00:23:21.790
that are ideally both being
emitted in both directions

00:23:21.790 --> 00:23:25.150
so that they can both be
assured of the other's uptime--

00:23:25.150 --> 00:23:29.110
if the secondary load balancer stops
hearing the primary load balancer,

00:23:29.110 --> 00:23:32.560
the secondary load balancer
can just presumptuously

00:23:32.560 --> 00:23:37.050
reconfigure itself to take on
that one and only IP address,

00:23:37.050 --> 00:23:39.760
effectively assuming that the
first load balancer is not going

00:23:39.760 --> 00:23:41.740
to be responding to any traffic anyway.

00:23:41.740 --> 00:23:46.120
And the second load balancer can
simply take on the entire load itself.
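
NOTE
A rough sketch of the failover logic described here, assuming a pair of load
balancers that share one public IP address and exchange heartbeats. The class,
its fields, and the example address (from a documentation-only range) are
hypothetical. A real deployment would also have to guard against both nodes
claiming the address at once, which this sketch ignores.
```python
class LoadBalancerNode:
    """One of a pair of load balancers sharing a single public IP (illustrative model)."""
    def __init__(self, name, shared_ip, active=False, timeout=3.0):
        self.name = name
        self.shared_ip = shared_ip
        self.holds_ip = active  # only the node holding the IP answers traffic
        self.timeout = timeout  # assumed: seconds of silence before the peer is presumed dead
        self.last_heard_peer = 0.0
    def hear_peer(self, now):
        """Called whenever the peer's heartbeat message arrives."""
        self.last_heard_peer = now
    def check_peer(self, now):
        """Take over the shared IP if the peer has been silent too long."""
        if not self.holds_ip and now - self.last_heard_peer > self.timeout:
            self.holds_ip = True
        return self.holds_ip
secondary = LoadBalancerNode("lb2", "203.0.113.10")
secondary.hear_peer(now=1.0)
```
As long as heartbeats keep arriving, check_peer() leaves the secondary passive.
After more than `timeout` seconds of silence, it claims the one shared address.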

00:23:46.120 --> 00:23:49.750
But the key difference now
in this particular solution

00:23:49.750 --> 00:23:53.920
is that there's only one IP address that
describes this whole architecture, only

00:23:53.920 --> 00:23:56.740
one IP address between
the two load balancers

00:23:56.740 --> 00:24:01.210
so we don't risk those potential dead
ends that we had a little bit ago

00:24:01.210 --> 00:24:03.710
with our back end servers.

00:24:03.710 --> 00:24:08.884
So now it's starting to get more
robust, more highly available.

00:24:08.884 --> 00:24:09.800
So that's pretty good.

00:24:09.800 --> 00:24:11.800
We've solved most of these problems.

00:24:11.800 --> 00:24:17.590
We've generously, though, swept one
problem underneath the rug, whereby

00:24:17.590 --> 00:24:20.217
every time I draw another rectangle--

00:24:20.217 --> 00:24:22.300
not just the first time,
but now the second time--

00:24:22.300 --> 00:24:26.200
and add some interconnectivity,
somehow, among them, someone

00:24:26.200 --> 00:24:27.850
somewhere is spending some money.

00:24:27.850 --> 00:24:30.340
And indeed, I am solving
these problems thus far

00:24:30.340 --> 00:24:33.800
by throwing money at the problem,
and frankly introducing complexity.

00:24:33.800 --> 00:24:36.400
Already look at how many
arrows or edges there

00:24:36.400 --> 00:24:40.060
are now, which might simply refer
to physical wires, which is fine.

00:24:40.060 --> 00:24:43.990
But there's also a logical
configuration that's now necessary.

00:24:43.990 --> 00:24:47.530
And God forbid we have a third load
balancer for extra high availability

00:24:47.530 --> 00:24:49.420
or any number of servers here--

00:24:49.420 --> 00:24:52.510
13 or 20 or 100 or 1,000 servers.

00:24:52.510 --> 00:24:54.910
It's a lot of cross-connections--
not just physically,

00:24:54.910 --> 00:24:58.120
but logically in terms of
the requisite configuration.
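
NOTE
To make the growth in cross-connections concrete, here is a small sketch that
counts links, assuming every load balancer connects to every server and every
pair of load balancers exchanges heartbeats. The function name and that wiring
assumption are illustrative, not something the lecture specifies.
```python
def count_links(load_balancers, servers):
    """Links needed when each load balancer reaches each server and each other load balancer."""
    lb_to_server = load_balancers * servers
    lb_to_lb = load_balancers * (load_balancers - 1) // 2
    return lb_to_server + lb_to_lb
```
Under these assumptions, two load balancers and two servers need 5 links,
while three load balancers and 1,000 servers need 3,003.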

00:24:58.120 --> 00:25:01.540
So this complexity does add up.

00:25:01.540 --> 00:25:04.690
And the cost certainly adds up.

00:25:04.690 --> 00:25:07.240
And now, once upon a
time-- and not all that

00:25:07.240 --> 00:25:11.750
long ago-- if a company wanted to
architect this kind of solution,

00:25:11.750 --> 00:25:14.590
you would literally
buy two load balancers,

00:25:14.590 --> 00:25:17.290
and you would buy two
or more web servers,

00:25:17.290 --> 00:25:19.690
and you would buy the
requisite physical ethernet

00:25:19.690 --> 00:25:21.070
cables to interconnect the two.

00:25:21.070 --> 00:25:23.320
And you'd probably buy a
whole bunch of other hardware

00:25:23.320 --> 00:25:26.279
that we've not even talked about,
like firewalls and switches and more.

00:25:26.279 --> 00:25:28.361
But you would physically
buy all of this hardware.

00:25:28.361 --> 00:25:30.520
You would physically
connect all of this hardware

00:25:30.520 --> 00:25:35.710
and configure it to implement
these several kinds of features.

00:25:35.710 --> 00:25:38.410
But the catch is that the
more and more hardware

00:25:38.410 --> 00:25:42.310
you buy, just probabilistically,
the more and more you

00:25:42.310 --> 00:25:44.320
invite some kind of failure.

00:25:44.320 --> 00:25:46.000
Maybe it's some stupid human error.

00:25:46.000 --> 00:25:49.310
But more realistically, one of
your hard drives is going to fail.

00:25:49.310 --> 00:25:52.960
And hard drives are typically rated for
the enterprise in terms of Mean Time

00:25:52.960 --> 00:25:57.230
Between Failure, MTBF,
which generally means

00:25:57.230 --> 00:26:01.100
how long should you expect a hard drive
to work on average before it fails.

00:26:01.100 --> 00:26:01.600
It breaks.

00:26:01.600 --> 00:26:02.900
It just stops working.

00:26:02.900 --> 00:26:05.500
So if you have a whole bunch
of servers, each of which

00:26:05.500 --> 00:26:08.390
has a whole bunch of hard
drives, at some point,

00:26:08.390 --> 00:26:11.862
combinatorially, one or more of
those drives is just going to fail,

00:26:11.862 --> 00:26:13.820
which is to say you're
going to have a problem,

00:26:13.820 --> 00:26:15.850
and you're going to
have to fix it yourself.
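
NOTE
A back-of-the-envelope sketch of why more drives means more failures. It
assumes drives fail independently and that each drive's lifetime follows an
exponential distribution with the rated MTBF. Both are simplifying
assumptions, and the function name and example numbers are hypothetical.
```python
import math
def p_any_failure(drives, hours, mtbf_hours):
    """Probability that at least one of `drives` fails within `hours`."""
    p_one_survives = math.exp(-hours / mtbf_hours)
    return 1.0 - p_one_survives ** drives
one_drive = p_any_failure(1, 8760, 1_000_000)       # one drive over one year
hundred_drives = p_any_failure(100, 8760, 1_000_000)  # a hundred drives over one year
```
With a rated MTBF of one million hours, a single drive is very unlikely to
fail in a year under this model, but with a hundred drives the chance that at
least one fails is better than even.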

00:26:15.850 --> 00:26:19.940
At some point, too, you're going
to run out of physical space.

00:26:19.940 --> 00:26:22.850
In fact, perhaps one of the
most constraining resources,

00:26:22.850 --> 00:26:25.530
especially for startups, is
the physical space itself.

00:26:25.530 --> 00:26:28.780
You probably don't want to start housing
your servers in your physical office,

00:26:28.780 --> 00:26:32.530
because you need a special room for
it, typically, with enough cooling,

00:26:32.530 --> 00:26:36.750
with enough access, with enough
electricity, and enough humans

00:26:36.750 --> 00:26:37.750
to actually maintain it.

00:26:37.750 --> 00:26:41.345
Or you graduate from your own office
space and go to a data center,

00:26:41.345 --> 00:26:44.230
a co-location facility,
whereby you maybe

00:26:44.230 --> 00:26:47.500
rent space in a physical
cage with a locking door,

00:26:47.500 --> 00:26:49.690
inside of which you
put racks of servers,

00:26:49.690 --> 00:26:54.100
just racked up on big metal poles,
and you pack as many servers in there

00:26:54.100 --> 00:26:54.970
as you can.

00:26:54.970 --> 00:26:57.610
But at some point, you're
going to be bumping up

00:26:57.610 --> 00:27:03.340
against other constrained resources--
physical space, actual power capacity,

00:27:03.340 --> 00:27:07.220
cooling, as well as the
humans to actually run this.

00:27:07.220 --> 00:27:10.720
And so very quickly
does operations, ops,

00:27:10.720 --> 00:27:14.890
so to speak, become an increasing
cost and an increasing challenge.

00:27:14.890 --> 00:27:19.130
And one of the most alluring
features of the cloud, so to speak,

00:27:19.130 --> 00:27:23.350
is that you can move all
of these details off-site.

00:27:23.350 --> 00:27:28.710
And you can abstract many of these,
let's say, implementation details

00:27:28.710 --> 00:27:32.770
away whereby you yourself don't have
to worry about the physical wires.

00:27:32.770 --> 00:27:35.260
You don't have to worry about
the make and model of servers

00:27:35.260 --> 00:27:36.051
that you're buying.

00:27:36.051 --> 00:27:39.700
You don't have to worry about
things actually breaking,

00:27:39.700 --> 00:27:42.790
because someone else will
deal with that for you.

00:27:42.790 --> 00:27:46.000
But you have to still understand
the topology and the architecture

00:27:46.000 --> 00:27:51.080
and the features that you want to
implement so that you can actually

00:27:51.080 --> 00:27:53.130
configure them in the cloud.

00:27:53.130 --> 00:27:55.830
So what do you actually
get from cloud providers?

00:27:55.830 --> 00:27:57.830
There's any number of
them out there these days.

00:27:57.830 --> 00:28:01.760
But perhaps three of the biggest
are Amazon, Google, and Microsoft,

00:28:01.760 --> 00:28:05.087
all of whom offer, these days,
very similar palettes of options.

00:28:05.087 --> 00:28:06.920
And it's outright
overwhelming, if you visit

00:28:06.920 --> 00:28:10.290
each of their web sites, just how
many cloud products they offer.

00:28:10.290 --> 00:28:13.610
But they would generally offer
a number of standard products

00:28:13.610 --> 00:28:16.740
in the cloud-- for instance,
a virtualized server.

00:28:16.740 --> 00:28:19.430
So you don't have to physically
buy a server these days

00:28:19.430 --> 00:28:22.970
and plug it into your own ethernet
connection, your own internet

00:28:22.970 --> 00:28:24.470
connection in your own office.

00:28:24.470 --> 00:28:27.320
You can instead
essentially rent a server

00:28:27.320 --> 00:28:29.550
in the cloud, which is to
say that Amazon, Google,

00:28:29.550 --> 00:28:31.520
Microsoft, or any number
of other companies

00:28:31.520 --> 00:28:34.850
will host that server
physically for you,

00:28:34.850 --> 00:28:37.520
and they will take care of the
issues of power and cooling.

00:28:37.520 --> 00:28:40.061
And if a hard drive fails,
they will go remove the old one

00:28:40.061 --> 00:28:41.060
and plug in the new one.

00:28:41.060 --> 00:28:43.610
And ideally, they will provide
you with backup services.

00:28:43.610 --> 00:28:46.970
But more sophisticated
than that, they can also

00:28:46.970 --> 00:28:52.280
help us recreate, in software,
this kind of topology.

00:28:52.280 --> 00:28:56.750
In other words, even without having
a human physically wire together

00:28:56.750 --> 00:28:59.390
this kind of graph, so to speak,
that we've been building up

00:28:59.390 --> 00:29:02.600
here logically, thanks
to software these days,

00:29:02.600 --> 00:29:06.260
you can implement this whole paradigm--

00:29:06.260 --> 00:29:08.720
not with physical cables,
not with physical devices,

00:29:08.720 --> 00:29:10.610
but with software virtually.

00:29:10.610 --> 00:29:11.460
What does that mean?

00:29:11.460 --> 00:29:13.640
It means that humans, over
the past several years,

00:29:13.640 --> 00:29:17.810
have been writing software that mimics
the behavior of physical servers.

00:29:17.810 --> 00:29:21.290
Humans have been writing software
that mimics the behavior of a router.

00:29:21.290 --> 00:29:26.060
Humans have been writing software that
mimics the behavior of a load balancer.

00:29:26.060 --> 00:29:30.470
And in mimicking that behavior,
really, we're just building,

00:29:30.470 --> 00:29:34.940
in software, what historically
might have been implemented entirely

00:29:34.940 --> 00:29:35.682
in hardware.

00:29:35.682 --> 00:29:37.640
And even that's a bit of
an oversimplification.

00:29:37.640 --> 00:29:40.077
Because even when something
is bought as hardware,

00:29:40.077 --> 00:29:43.160
there is, of course, software running
on that hardware that actually makes

00:29:43.160 --> 00:29:44.360
it do something.

00:29:44.360 --> 00:29:46.730
But they're no longer dedicated devices.

00:29:46.730 --> 00:29:50.750
You can use generic commodity
PC server hardware, really,

00:29:50.750 --> 00:29:54.740
and transform that hardware into
a certain role, a back end web

00:29:54.740 --> 00:29:58.550
server, a back end database, a
load balancer, a router, a switch,

00:29:58.550 --> 00:30:00.180
any number of other things.

00:30:00.180 --> 00:30:02.930
And so what you're getting from
companies like Amazon and Google

00:30:02.930 --> 00:30:06.230
and Microsoft and more is
the ability to build up

00:30:06.230 --> 00:30:09.050
your infrastructure in software.

00:30:09.050 --> 00:30:16.190
In fact, the buzzword here, the acronym,
is IaaS, Infrastructure as a Service.

00:30:16.190 --> 00:30:19.910
So you sign up for an account on any
of those companies' cloud services web

00:30:19.910 --> 00:30:23.030
sites, and you put in your credit
card information or your invoicing

00:30:23.030 --> 00:30:26.780
information, and you literally, via
a command line tool-- so a keyboard,

00:30:26.780 --> 00:30:30.380
or via a nice, web-based
graphical user interface, GUI--

00:30:30.380 --> 00:30:34.907
point and click and say, give
me two servers and one load balancer.

00:30:34.907 --> 00:30:36.740
Or if you have enough
money in the bank, you

00:30:36.740 --> 00:30:40.040
say give me two servers
and two load balancers

00:30:40.040 --> 00:30:41.990
configured for high availability.

00:30:41.990 --> 00:30:44.360
Or better yet, you
don't say any of that.

00:30:44.360 --> 00:30:48.410
You just tell the provider, give me a
web server and give me a load balancer,

00:30:48.410 --> 00:30:52.710
and you deal with the process of
scaling those things as needed.

00:30:52.710 --> 00:30:56.360
In fact, a buzzword du jour is auto
scaling, which refers to a feature,

00:30:56.360 --> 00:30:59.720
implemented in software,
whereby if a cloud

00:30:59.720 --> 00:31:03.740
provider notices that your servers
are getting a lot of traffic--

00:31:03.740 --> 00:31:05.880
business is good, or
it's the holiday season,

00:31:05.880 --> 00:31:10.220
and you are bumping up against
just how many users your one or two

00:31:10.220 --> 00:31:12.230
or three or more servers can handle--

00:31:12.230 --> 00:31:17.030
auto-scaling is a feature that will
enable the cloud provider to just turn

00:31:17.030 --> 00:31:21.320
on, virtually, more servers for you
so that you go from two to three

00:31:21.320 --> 00:31:21.980
automatically.

00:31:21.980 --> 00:31:25.770
You can be happily asleep
in the middle of the night,

00:31:25.770 --> 00:31:28.780
and even though your traffic
is peaking, it doesn't matter.

00:31:28.780 --> 00:31:30.830
Your architecture is
going to auto scale.

00:31:30.830 --> 00:31:33.320
And better yet--
especially financially--

00:31:33.320 --> 00:31:37.820
if the cloud provider notices, maybe 12
hours later-- oh, all of your customers

00:31:37.820 --> 00:31:41.270
have gone to sleep, we don't really
need all of this excess capacity.

00:31:41.270 --> 00:31:43.400
Or maybe the holidays
are now in the past.

00:31:43.400 --> 00:31:45.200
You really don't need
this excess capacity.

00:31:45.200 --> 00:31:50.090
Auto scaling also dictates that those
servers can be virtually turned off.

00:31:50.090 --> 00:31:51.435
So you're no longer using them.

00:31:51.435 --> 00:31:53.060
You're no longer load balancing to them.

00:31:53.060 --> 00:31:56.310
And most importantly, you're
no longer paying for them.
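
NOTE
A simplified sketch of the decision an auto-scaling feature might make,
assuming the provider measures incoming requests per second and knows roughly
how many requests one server can handle. The function, its parameters, and the
limits are hypothetical. Real providers expose their own policies and metrics.
```python
import math
def desired_servers(requests_per_sec, capacity_per_server, minimum=1, maximum=10):
    """How many servers to run for the current load, clamped to [minimum, maximum]."""
    needed = math.ceil(requests_per_sec / capacity_per_server)
    return max(minimum, min(maximum, needed))
```
At peak, a load of 250 requests per second against servers that each handle
100 calls for three servers. When traffic drops to 20, the count falls back to
the configured minimum, and the servers that were turned off stop costing money.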

00:31:56.310 --> 00:32:00.010
So this is a really, really
nice value add at this point.

00:32:00.010 --> 00:32:02.780
There's no human crawling around
on the floor rewiring things

00:32:02.780 --> 00:32:04.130
and plugging in new servers.

00:32:04.130 --> 00:32:07.460
There's no finance person having to
approve the PO to actually order more

00:32:07.460 --> 00:32:09.220
servers just to increase your capacity.

00:32:09.220 --> 00:32:13.310
And most importantly, there is
no latency between the time when

00:32:13.310 --> 00:32:16.580
you notice, oh, my god, we're
getting really successful

00:32:16.580 --> 00:32:18.527
and can't handle our load-- uh oh.

00:32:18.527 --> 00:32:20.360
It's going to be a two,
three-week lead time

00:32:20.360 --> 00:32:22.460
before we can even get
more servers in.

00:32:22.460 --> 00:32:25.910
Thanks to cloud computing, you can
literally log in to Amazon's, Google's,

00:32:25.910 --> 00:32:28.210
Microsoft's web site
and, click, click, click,

00:32:28.210 --> 00:32:32.360
have more server capacity
within seconds, within minutes,

00:32:32.360 --> 00:32:37.790
far faster than the physical
world traditionally allowed.

00:32:37.790 --> 00:32:40.460
So those are just some
of the features now

00:32:40.460 --> 00:32:45.320
that we gain from outsourcing
to the so-called cloud.

00:32:45.320 --> 00:32:48.510
So where does some of
this capability come from?

00:32:48.510 --> 00:32:51.960
Well, it turns out that
over the past many years,

00:32:51.960 --> 00:32:54.230
humans have been getting
better and better and better

00:32:54.230 --> 00:32:58.460
at packing more physical hardware
into the same form factor,

00:32:58.460 --> 00:32:59.720
into the same physical space.

00:32:59.720 --> 00:33:02.420
So at the level of CPUs,
the brains of a computer,

00:33:02.420 --> 00:33:05.810
we humans have gotten much better at
packing more and more transistors,

00:33:05.810 --> 00:33:07.520
for instance, onto a CPU.

00:33:07.520 --> 00:33:11.070
And transistors are the little switches
that can turn things on and off--

00:33:11.070 --> 00:33:12.670
0 and 1, 1 and 0.

00:33:12.670 --> 00:33:14.570
So you can store more
information and you

00:33:14.570 --> 00:33:17.260
can do more with that
information more quickly.

00:33:17.260 --> 00:33:19.820
CPUs today also have
more cores, which you

00:33:19.820 --> 00:33:23.180
can think of as mini CPUs
inside of the main CPU,

00:33:23.180 --> 00:33:25.550
so that a computer with
multiple cores can literally

00:33:25.550 --> 00:33:28.280
do multiple things at a time.

00:33:28.280 --> 00:33:32.060
But the funny thing is that we
humans, over the past decade or two,

00:33:32.060 --> 00:33:35.150
really haven't been getting
fundamentally faster at life.

00:33:35.150 --> 00:33:38.090
At the end of the day, I can
only check my email so quickly.

00:33:38.090 --> 00:33:39.990
I can only post on Facebook so quickly.

00:33:39.990 --> 00:33:43.670
I can only check out
from Amazon so quickly.

00:33:43.670 --> 00:33:47.784
Because we humans have, of course,
a finite speed to ourselves.

00:33:47.784 --> 00:33:49.950
We're not just getting--
we're not doubling in speed

00:33:49.950 --> 00:33:52.790
a la Moore's law every year or two.

00:33:52.790 --> 00:33:57.140
So we have, it would seem, a lot of
excess computing capacity these days.

00:33:57.140 --> 00:34:00.080
Computers are getting so darn
fast, we don't necessarily

00:34:00.080 --> 00:34:03.830
know what to do with all of these
CPU cycles and with all of the RAM

00:34:03.830 --> 00:34:06.860
that we can fit into the same
physical box at half the price

00:34:06.860 --> 00:34:08.989
that it cost us last year.

00:34:08.989 --> 00:34:12.469
And so manufacturers
and companies realized

00:34:12.469 --> 00:34:17.179
that we could actually build a
business on this increased capacity.

00:34:17.179 --> 00:34:23.277
We can implement the computer
equivalent of timesharing, so to speak,

00:34:23.277 --> 00:34:25.610
which has long been with us
in the history of computing.

00:34:25.610 --> 00:34:27.620
But we can do this on a
much more massive scale

00:34:27.620 --> 00:34:33.679
now by taking one physical server
that has maybe two CPUs, or 16 CPUs,

00:34:33.679 --> 00:34:38.570
or 64 CPUs, and maybe gigabytes--

00:34:38.570 --> 00:34:41.090
tens of gigabytes or hundreds
of gigabytes of RAM--

00:34:41.090 --> 00:34:45.110
all inside of the same physical device,
plug it in to an internet connection,

00:34:45.110 --> 00:34:50.810
and then run special software on that
one server that creates the illusion

00:34:50.810 --> 00:34:54.920
that there's multiple servers
living inside of that box.

00:34:54.920 --> 00:34:58.850
And this virtualization
software is implemented

00:34:58.850 --> 00:35:02.480
by way of software called a virtual
machine monitor, or VMM,

00:35:02.480 --> 00:35:04.430
or another word might be hypervisor.

00:35:04.430 --> 00:35:07.190
There's different ways to describe
essentially the same thing.

00:35:07.190 --> 00:35:11.390
But a virtual machine
is a piece of software

00:35:11.390 --> 00:35:15.590
running on a computer inside of which
is running some other operating system,

00:35:15.590 --> 00:35:16.370
typically.

00:35:16.370 --> 00:35:19.070
So you might have one
server running Windows.

00:35:19.070 --> 00:35:24.620
But inside of that server are multiple
virtual machines, each of which

00:35:24.620 --> 00:35:25.880
itself is running Windows.

00:35:25.880 --> 00:35:29.509
So you might be able to chop up one
computer into 10, or even into 100.

00:35:29.509 --> 00:35:31.550
Or perhaps more commonly,
you might have a server

00:35:31.550 --> 00:35:34.340
running Linux or some
Unix-based operating system,

00:35:34.340 --> 00:35:35.930
also with virtual machines on it.

00:35:35.930 --> 00:35:37.721
But those virtual
machines might be running

00:35:37.721 --> 00:35:42.797
Linux themselves, or Unix, or Windows,
or any number of versions of Windows.

00:35:42.797 --> 00:35:43.880
And so this is the beauty.

00:35:43.880 --> 00:35:48.080
When you have so much
excess capacity and so many

00:35:48.080 --> 00:35:50.150
available CPU cycles
and so much RAM, you

00:35:50.150 --> 00:35:56.490
can slice that up and then sell portions
of the server's capacity to customers.

00:35:56.490 --> 00:36:01.310
And if you're really clever, you might
look at your customers' usage patterns

00:36:01.310 --> 00:36:05.810
and realize that, you know
what, it's not necessarily

00:36:05.810 --> 00:36:11.360
as simple as just taking my server and
dividing it up into n different slices,

00:36:11.360 --> 00:36:13.700
where n is a generic
variable for number,

00:36:13.700 --> 00:36:17.824
and then selling it or renting that
space, really, to n customers.

00:36:17.824 --> 00:36:18.740
Because you know what?

00:36:18.740 --> 00:36:21.800
Some of those customers might have some
booming businesses, which is great.

00:36:21.800 --> 00:36:24.110
But some of those customers
might not have many users.

00:36:24.110 --> 00:36:26.129
Maybe it's a few dozen.

00:36:26.129 --> 00:36:27.170
Maybe it's a few hundred.

00:36:27.170 --> 00:36:29.280
But it's really a drop in the bucket.

00:36:29.280 --> 00:36:34.580
So instead of selling my computing
resources to just n customers,

00:36:34.580 --> 00:36:37.730
maybe I'll sell it to twice as
many customers or three times

00:36:37.730 --> 00:36:41.690
as many customers, and essentially
over-sell my server's capacity,

00:36:41.690 --> 00:36:44.480
but expect that on
average, this is just going

00:36:44.480 --> 00:36:47.570
to work out because some customers
will be using a lot of those cycles

00:36:47.570 --> 00:36:49.460
because business is
good, and some won't be,

00:36:49.460 --> 00:36:51.501
because they just
don't have many customers,

00:36:51.501 --> 00:36:55.580
or really, it's a personal website
that doesn't get much usage anyway.
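
NOTE
A toy model of the overselling bet described here. It assumes every customer
is sold the same number of capacity units, that a fixed fraction of customers
are busy at any moment, and that idle customers use a small share of what they
were sold. All names and numbers are hypothetical.
```python
def is_oversold(capacity_units, customers, units_sold_each):
    """True if the provider has sold more capacity than the server physically has."""
    return customers * units_sold_each > capacity_units
def expected_demand(customers, units_sold_each, busy_fraction, idle_usage=0.05):
    """Average capacity in use if only `busy_fraction` of customers are busy at once."""
    busy = customers * busy_fraction
    idle = customers - busy
    return busy * units_sold_each + idle * units_sold_each * idle_usage
```
For a server with 100 units of capacity sold three times over, to 30 customers
at 10 units each, the server is oversold. If only a fifth of those customers
are busy at once, expected demand still fits within the 100 units. The bet
fails when too many customers get busy at the same time.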

00:36:55.580 --> 00:36:58.490
And so for some time,
there has, of course,

00:36:58.490 --> 00:37:01.700
been this risk, when you sign up
for a web hosting company or a cloud

00:37:01.700 --> 00:37:05.450
provider, that your web site actually
might get really slow for reasons

00:37:05.450 --> 00:37:07.020
outside of your control.

00:37:07.020 --> 00:37:11.780
If you are co-located on a server that
some other booming business is on,

00:37:11.780 --> 00:37:17.120
your users might actually suffer if
your web host has oversold itself.

00:37:17.120 --> 00:37:19.249
And so in fact, this is
one of those situations

00:37:19.249 --> 00:37:20.540
where you get what you pay for.

00:37:20.540 --> 00:37:23.990
If you're googling around and
finding various cloud providers,

00:37:23.990 --> 00:37:26.360
or web hosting companies
more specifically,

00:37:26.360 --> 00:37:30.050
you might be able to find a deal,
like $10 per month or $50 per month,

00:37:30.050 --> 00:37:33.590
as opposed to $100 or
$200 or more per month.

00:37:33.590 --> 00:37:37.070
And you do get what you pay for, because
those fly-by-night operations that

00:37:37.070 --> 00:37:41.960
are selling you space and
capacity super cheaply probably

00:37:41.960 --> 00:37:44.000
are overselling and over-committing.

00:37:44.000 --> 00:37:45.740
So these are the trade-offs, too--

00:37:45.740 --> 00:37:48.590
how much money do you want
to save versus how much risk

00:37:48.590 --> 00:37:50.300
do you actually want to take on?

00:37:50.300 --> 00:37:53.549
Generally, it's safer to go with some
of the bigger fish these days, certainly

00:37:53.549 --> 00:37:57.500
when building a business, as you might
on a company like Amazon or Google

00:37:57.500 --> 00:38:00.300
or Microsoft or derivatives thereof.

00:38:00.300 --> 00:38:02.540
So just to paint a more
concrete technical picture

00:38:02.540 --> 00:38:06.534
of what virtualization is, here's a
picture, as you might think of it.

00:38:06.534 --> 00:38:08.450
So you have your physical
infrastructure here.

00:38:08.450 --> 00:38:12.080
So that's the actual server
from Dell or IBM or whoever.

00:38:12.080 --> 00:38:14.870
Then you have the host operating
system, which might be Windows,

00:38:14.870 --> 00:38:18.896
but is often Linux or some
variant of Unix instead.

00:38:18.896 --> 00:38:20.270
And then you have the hypervisor.

00:38:20.270 --> 00:38:22.940
This is the piece of
software that you install

00:38:22.940 --> 00:38:27.800
on your server that allows you to run
multiple virtual machines on top of it.

00:38:27.800 --> 00:38:29.995
And those virtual machines
can each run any number

00:38:29.995 --> 00:38:32.870
of different operating systems
themselves, or even different versions

00:38:32.870 --> 00:38:34.130
of operating systems.

00:38:34.130 --> 00:38:38.359
And so depicted here up top are
the disparate guest operating

00:38:38.359 --> 00:38:39.650
systems that might be on there.

00:38:39.650 --> 00:38:42.670
Maybe this is Linux and Solaris,
and this is Windows itself,

00:38:42.670 --> 00:38:44.170
or any number of other combinations.

00:38:44.170 --> 00:38:47.870
Whatever your customers want or whatever
you want to provide or essentially

00:38:47.870 --> 00:38:50.810
rent to customers, you can install.

00:38:50.810 --> 00:38:52.470
But you do pay a price.

00:38:52.470 --> 00:38:54.920
So as beautiful as this
situation is, and as clever

00:38:54.920 --> 00:38:59.180
as it is that we're leveraging
these excess resources by slicing up

00:38:59.180 --> 00:39:04.190
one server into the illusion of, in this
case, three, or more generally more,

00:39:04.190 --> 00:39:05.780
there is some overhead.

00:39:05.780 --> 00:39:10.670
Because this hypervisor has to be a
middleman between your guest operating

00:39:10.670 --> 00:39:13.562
systems and your host operating
system, the one actually

00:39:13.562 --> 00:39:15.020
physically installed on the server.

00:39:15.020 --> 00:39:17.600
And any layers of indirection
like this, so to speak,

00:39:17.600 --> 00:39:19.380
have got to cost you
some amount of time.

00:39:19.380 --> 00:39:21.500
If there's some work being
done here and you only

00:39:21.500 --> 00:39:23.780
have a finite number of
resources, the hypervisor

00:39:23.780 --> 00:39:27.050
itself is surely consuming
some of your resources.

00:39:27.050 --> 00:39:28.970
And gosh, this just
seems really inefficient,

00:39:28.970 --> 00:39:32.900
especially if all of your customers
are using the same operating system.

00:39:32.900 --> 00:39:38.300
My god, why do you have to have copies
of the same OS multiply installed?

00:39:38.300 --> 00:39:42.920
This just doesn't feel like it's
leveraging much economy of scale.

00:39:42.920 --> 00:39:46.940
And so it turns out there's a newer
technology that's gaining steam,

00:39:46.940 --> 00:39:51.260
and this is known not as virtualization,
per se, but containerization,

00:39:51.260 --> 00:39:54.980
the most popular instance of which
is perhaps a company called Docker.

00:39:54.980 --> 00:39:57.950
And the world of Docker
is a little shorter.

00:39:57.950 --> 00:40:00.585
It's a little smarter about
how resources are shared.

00:40:00.585 --> 00:40:02.960
You still have your infrastructure,
your physical server,

00:40:02.960 --> 00:40:04.876
and you still have your
host operating system,

00:40:04.876 --> 00:40:08.100
whether it's Linux or Unix
or something like that.

00:40:08.100 --> 00:40:11.810
But then instead of a hypervisor,
you have the Docker engine,

00:40:11.810 --> 00:40:15.680
which is really just an equivalent
of that base layer of software.

00:40:15.680 --> 00:40:17.930
But notice what's different.

00:40:17.930 --> 00:40:22.387
In this case here, we've
collapsed the previous picture.

00:40:22.387 --> 00:40:25.220
In fact, thanks to our friends at
Docker who put this together here,

00:40:25.220 --> 00:40:27.800
the guest OS has disappeared.

00:40:27.800 --> 00:40:29.940
And you instead have your
different applications

00:40:29.940 --> 00:40:31.730
and your different
binaries and libraries,

00:40:31.730 --> 00:40:34.820
as this abbreviation means, all
running on the Docker engine.

00:40:34.820 --> 00:40:36.180
Now, what does this mean?

00:40:36.180 --> 00:40:38.660
This means when running
Docker, you typically

00:40:38.660 --> 00:40:40.370
choose your operating system--

00:40:40.370 --> 00:40:44.920
for instance, Ubuntu Linux or Debian
Linux or something else altogether--

00:40:44.920 --> 00:40:49.400
and then you essentially share
that one operating system

00:40:49.400 --> 00:40:52.610
across multiple containers.

00:40:52.610 --> 00:40:55.100
Instead of virtual machines,
we now have containers.

00:40:55.100 --> 00:40:58.460
So in other words, you ensure
that your different slices all

00:40:58.460 --> 00:41:01.850
share some common software--
the kernel, so to speak,

00:41:01.850 --> 00:41:05.060
the base core of the operating system.

00:41:05.060 --> 00:41:09.080
But then you uniquely layer
on top of that base system,

00:41:09.080 --> 00:41:13.520
that base set of default files, whatever
customizations your customers or you

00:41:13.520 --> 00:41:16.880
yourself want, but you
share some of the resources.

00:41:16.880 --> 00:41:19.700
And long story short, what
this means is that containers

00:41:19.700 --> 00:41:21.740
tend to be a little lighter weight.

00:41:21.740 --> 00:41:25.490
There's less waste of resources because
there's less overhead of running them,

00:41:25.490 --> 00:41:29.720
which is to say that you can generally
start them even more quickly.

00:41:29.720 --> 00:41:34.070
And better yet, you can still
isolate your different products

00:41:34.070 --> 00:41:37.190
and your different services--
database and web server and email

00:41:37.190 --> 00:41:38.900
server and any number
of other features--

00:41:38.900 --> 00:41:43.340
all within the illusion of their own
installation, their own operating

00:41:43.340 --> 00:41:47.030
system, even though there are
some shared resources here.
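
NOTE
Editor's sketch of the overhead argument above, in Python. Every size is a
made-up illustrative number (not a measurement), and both function names are
hypothetical. It only shows why repeating a guest OS per slice costs more than
sharing one kernel across containers.
```python
# Hypothetical sizes in MB; real numbers vary widely by OS and workload.
def vm_footprint_mb(n_apps: int, guest_os_mb: int, app_mb: int, hypervisor_mb: int) -> int:
    """Each virtual machine carries its own full copy of a guest OS."""
    return hypervisor_mb + n_apps * (guest_os_mb + app_mb)
def container_footprint_mb(n_apps: int, app_mb: int, engine_mb: int) -> int:
    """Containers share the host kernel, so only the app layer repeats."""
    return engine_mb + n_apps * app_mb
if __name__ == "__main__":
    print(vm_footprint_mb(3, guest_os_mb=1024, app_mb=200, hypervisor_mb=256))
    print(container_footprint_mb(3, app_mb=200, engine_mb=256))
```
With these assumed inputs the three-VM layout comes to 3928 MB and the
three-container layout to 856 MB; the gap is the repeated guest OS.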

00:41:47.030 --> 00:41:51.950
So this, too, has been made possible
by the capabilities of modern hardware

00:41:51.950 --> 00:41:54.590
and the cleverness, frankly,
of humans in actually

00:41:54.590 --> 00:42:01.470
finding solutions or creative uses
for those available resources.

00:42:01.470 --> 00:42:05.190
But what other features
or topics come into play

00:42:05.190 --> 00:42:07.400
in this world of cloud computing?

00:42:07.400 --> 00:42:11.051
We've talked about availability
and caching and costing, really

00:42:11.051 --> 00:42:13.050
figuring out where we're
going to actually spend

00:42:13.050 --> 00:42:17.430
our money by throwing hardware at
problems and scaling more generally.

00:42:17.430 --> 00:42:19.830
But there's also issues
of replication, which

00:42:19.830 --> 00:42:22.890
actually do relate to high
availability, so to speak.

00:42:22.890 --> 00:42:24.900
But replication refers
to duplication of data,

00:42:24.900 --> 00:42:27.360
and really backups more
generally as a topic.

00:42:27.360 --> 00:42:30.570
And then there's also some other funky
acronyms that are very much in vogue

00:42:30.570 --> 00:42:31.230
these days.

00:42:31.230 --> 00:42:33.390
Besides Infrastructure
as a Service, there's

00:42:33.390 --> 00:42:40.280
also Platform as a Service, PaaS,
or Software as a Service, SaaS.

00:42:40.280 --> 00:42:43.920
Now, SaaS, even if you've
not used it under this name,

00:42:43.920 --> 00:42:45.360
odds are you have been using it.

00:42:45.360 --> 00:42:50.370
If you do use Gmail or Outlook.com
or any web-based email service,

00:42:50.370 --> 00:42:52.740
you are using software as a service.

00:42:52.740 --> 00:42:55.572
You don't really know, or need
to care, where in the world

00:42:55.572 --> 00:42:58.530
your emails physically live, or how
many servers they're spread across,

00:42:58.530 --> 00:43:00.821
or how your data is backed
up, or for that matter, when

00:43:00.821 --> 00:43:03.960
you click Send, how the email
even gets from point A to point B.

00:43:03.960 --> 00:43:07.710
You are treating Gmail
and Outlook as software

00:43:07.710 --> 00:43:12.690
as a service with all of the underlying
implementation details abstracted away.

00:43:12.690 --> 00:43:15.870
You just don't know or care
how it's implemented-- well,

00:43:15.870 --> 00:43:17.760
at least if everything is working.

00:43:17.760 --> 00:43:21.090
You probably do care
if something goes down.

00:43:21.090 --> 00:43:24.870
But there's this intermediate
step between this extreme form

00:43:24.870 --> 00:43:29.010
of abstraction where all you see
is just the top-level service.

00:43:29.010 --> 00:43:32.100
And the lowest level
implementation that we've

00:43:32.100 --> 00:43:34.110
discussed, which is
infrastructure as a service,

00:43:34.110 --> 00:43:36.690
whereby when using
something like Amazon,

00:43:36.690 --> 00:43:39.720
you literally click the button
that says give me a load balancer.

00:43:39.720 --> 00:43:42.214
You literally click a button
that says give me two servers.

00:43:42.214 --> 00:43:44.130
You literally click a
button that says give me

00:43:44.130 --> 00:43:46.560
a firewall or any number
of other features.

00:43:46.560 --> 00:43:49.560
So Amazon and Microsoft
and Google, to some extent,

00:43:49.560 --> 00:43:52.590
have all implemented
these low-level services

00:43:52.590 --> 00:43:55.050
that still require that you
understand the technology,

00:43:55.050 --> 00:43:59.250
and you understand networking, and you
understand scaling and availability.

00:43:59.250 --> 00:44:03.930
But you can so much more easily and
inexpensively and efficiently--

00:44:03.930 --> 00:44:08.960
literally with just a laptop or desktop,
without any data center of your own--

00:44:08.960 --> 00:44:12.450
stitch together the topology or
the architecture that you actually

00:44:12.450 --> 00:44:14.640
want, albeit in the cloud.

00:44:14.640 --> 00:44:17.820
Platform as a service, though,
has arisen as a middle ground

00:44:17.820 --> 00:44:20.557
here, whereby you might
have services like Heroku,

00:44:20.557 --> 00:44:22.890
which you might have heard
of, which themselves actually

00:44:22.890 --> 00:44:28.950
run on infrastructures like Amazon
or Google or Microsoft or the like.

00:44:28.950 --> 00:44:32.100
But they provide themselves
a layer of abstraction

00:44:32.100 --> 00:44:34.650
that isn't quite as high
level, so to speak, as what

00:44:34.650 --> 00:44:36.510
you get from software as a service.

00:44:36.510 --> 00:44:40.500
In fact, these platforms as a service
don't provide you with applications.

00:44:40.500 --> 00:44:44.350
They just make it easier for you to
run your applications in the cloud.

00:44:44.350 --> 00:44:45.550
Now, what does that mean?

00:44:45.550 --> 00:44:49.740
Well, it's all fun and exciting
to understand load balancing

00:44:49.740 --> 00:44:52.020
and understand networking
and understand the need

00:44:52.020 --> 00:44:56.580
for multiple servers and the entire
conversation that we've had thus far.

00:44:56.580 --> 00:44:59.390
But at the end of the day,
if I'm a software developer

00:44:59.390 --> 00:45:01.380
or I'm trying to build
a business, all I care

00:45:01.380 --> 00:45:05.940
about is making my internet
application available to real users.

00:45:05.940 --> 00:45:09.390
I really don't care about
how many servers I have,

00:45:09.390 --> 00:45:12.802
how many databases I have, how the
load balancers talk to one another.

00:45:12.802 --> 00:45:14.760
That's all fine and
intellectually interesting.

00:45:14.760 --> 00:45:17.230
But I just want to get real work done.

00:45:17.230 --> 00:45:19.140
So I'm willing to pay
a bit more for this.

00:45:19.140 --> 00:45:22.590
I'm willing to pay some middleman,
like a Heroku, or any number

00:45:22.590 --> 00:45:24.870
of other services, a
platform as a service,

00:45:24.870 --> 00:45:27.450
to abstract away those kinds of details.

00:45:27.450 --> 00:45:30.180
So I have the wherewithal,
and I have the willingness

00:45:30.180 --> 00:45:33.740
to actually say host
this as a web server.

00:45:33.740 --> 00:45:34.890
So give me a web server.

00:45:34.890 --> 00:45:37.920
I will pay you some number of dollars
per month to give me a web server.

00:45:37.920 --> 00:45:41.310
But I want you, Heroku, to deal
with the auto scaling of it.

00:45:41.310 --> 00:45:43.330
I don't care how many servers it is.

00:45:43.330 --> 00:45:44.910
I don't care how they are connected.

00:45:44.910 --> 00:45:46.784
I don't care anything
about these heartbeats.

00:45:46.784 --> 00:45:49.880
I just want to have the
illusion, for my own sake,

00:45:49.880 --> 00:45:53.880
of just one server that
somehow grows or shrinks

00:45:53.880 --> 00:45:56.400
dynamically to handle my customer base.
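
NOTE
Editor's sketch of the kind of decision an auto-scaling platform might make on
your behalf, in Python. This is not how any particular provider implements it;
the function name, the per-server capacity, and the minimum of one server are
all assumptions for illustration.
```python
import math
def servers_needed(requests_per_sec: float, capacity_per_server: float, min_servers: int = 1) -> int:
    """Smallest server count that covers the load, never below min_servers."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    return max(min_servers, math.ceil(requests_per_sec / capacity_per_server))
if __name__ == "__main__":
    # Load grows, then shrinks; the server count follows it.
    for load in (50, 950, 400, 0):
        print(load, servers_needed(load, capacity_per_server=200))
```
The customer only ever sees "one server"; the count behind it changes as the
load does.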

00:45:56.400 --> 00:45:59.381
Meanwhile, things like load
balancing, I just want my customers

00:45:59.381 --> 00:46:00.630
to be able to reach my server.

00:46:00.630 --> 00:46:02.046
I don't care how it's implemented.

00:46:02.046 --> 00:46:04.470
I don't care how it's made
to be highly available.

00:46:04.470 --> 00:46:06.120
I just want that to work.

00:46:06.120 --> 00:46:10.050
And so companies like Heroku
provide these platforms

00:46:10.050 --> 00:46:12.924
as a service that just make
your life a little bit easier.

00:46:12.924 --> 00:46:15.840
And you don't have to think about
or know about or worry about as many

00:46:15.840 --> 00:46:16.560
of these details.

00:46:16.560 --> 00:46:18.140
Now, to be fair, if
something breaks, you

00:46:18.140 --> 00:46:20.140
might not understand
exactly what's going wrong,

00:46:20.140 --> 00:46:22.200
and you yourself might
not be able to solve it.

00:46:22.200 --> 00:46:27.810
Indeed, you might be entirely at the
mercy of the cloud provider, or the PaaS

00:46:27.810 --> 00:46:30.150
provider, to solve the problem for you.

00:46:30.150 --> 00:46:32.280
But you're saving time.

00:46:32.280 --> 00:46:34.132
You're saving energy
elsewhere by not having

00:46:34.132 --> 00:46:36.840
to worry about those lower-level
implementation details, at least

00:46:36.840 --> 00:46:37.631
in the common case.

00:46:37.631 --> 00:46:42.030
But odds are you're paying a little more
to Heroku than you would to an Amazon

00:46:42.030 --> 00:46:46.480
directly because they're providing
you with this value-added service.

00:46:46.480 --> 00:46:48.900
So as cryptic as these
acronyms may seem,

00:46:48.900 --> 00:46:51.210
they're really just
referring to disparate levels

00:46:51.210 --> 00:46:54.300
of abstraction, all of which
somehow relate to the cloud.

00:46:54.300 --> 00:46:56.790
But infrastructure as a
service is a virtualization

00:46:56.790 --> 00:46:59.880
of these hardware ideas,
the physical cabling

00:46:59.880 --> 00:47:01.770
that we drew here on the screen.

00:47:01.770 --> 00:47:04.080
Software as a service really
is just that application

00:47:04.080 --> 00:47:05.288
that the user interacts with.

00:47:05.288 --> 00:47:09.090
And platform as a service
is an intermediate step,

00:47:09.090 --> 00:47:12.180
whereby you, in building
your software in the cloud,

00:47:12.180 --> 00:47:16.710
can worry a little bit less about how to
actually make it available to users.

00:47:16.710 --> 00:47:19.320
But let's consider one
other challenge now--

00:47:19.320 --> 00:47:23.310
that of database replication
since, of course, thus far,

00:47:23.310 --> 00:47:26.340
we've been talking about a web server
as though it's the entire picture.

00:47:26.340 --> 00:47:28.560
But the reality is
most any business that

00:47:28.560 --> 00:47:31.002
has a web-based presence
or a mobile presence

00:47:31.002 --> 00:47:32.460
is going to be storing information.

00:47:32.460 --> 00:47:35.249
When users register, when
users check something out,

00:47:35.249 --> 00:47:38.040
add something to their shopping
cart, so to speak, all of that data

00:47:38.040 --> 00:47:40.560
needs to somehow be stored.

00:47:40.560 --> 00:47:44.830
So let's consider now what the
world really likely looks like.

00:47:44.830 --> 00:47:47.070
So here is my laptop again.

00:47:47.070 --> 00:47:51.480
And here is the cloud that's
between me and some service

00:47:51.480 --> 00:47:52.860
that I'm interested in.

00:47:52.860 --> 00:47:55.695
We'll assume for now that there
is some kind of load balancing.

00:47:55.695 --> 00:47:57.570
And I'm just going to
draw it a little bigger

00:47:57.570 --> 00:48:00.957
this time to suggest that-- let's
just think of it now as a black box.

00:48:00.957 --> 00:48:02.040
And maybe it's one server.

00:48:02.040 --> 00:48:02.970
Maybe it's two.

00:48:02.970 --> 00:48:03.930
Maybe it's more.

00:48:03.930 --> 00:48:07.200
But somehow or other, load
balancing is implemented.

00:48:07.200 --> 00:48:09.990
Then I'm going to have
all of my servers here,

00:48:09.990 --> 00:48:14.790
which we'll abstract away as maybe
three or more at this point-- one, two,

00:48:14.790 --> 00:48:16.770
and then we'll call this n.

00:48:16.770 --> 00:48:20.200
But a web server typically does
not do everything these days.

00:48:20.200 --> 00:48:22.530
In fact, it's been
trending for some time

00:48:22.530 --> 00:48:25.440
to actually have different servers
or different virtual machines,

00:48:25.440 --> 00:48:27.960
or even more recently,
different containers,

00:48:27.960 --> 00:48:30.180
each providing individual services.

00:48:30.180 --> 00:48:32.440
Sometimes people call
these microservices

00:48:32.440 --> 00:48:36.030
if a container only does one, and
one very narrowly defined thing,

00:48:36.030 --> 00:48:39.540
like send emails, or save
information to a database,

00:48:39.540 --> 00:48:42.610
or respond to HTTP requests.

00:48:42.610 --> 00:48:46.830
So these back-end web servers are not
the only types of servers we have.

00:48:46.830 --> 00:48:49.780
Odds are we at least have one database.

00:48:49.780 --> 00:48:53.160
So let's consider now
the implication of all

00:48:53.160 --> 00:48:56.400
of these architectural
decisions we've made thus far

00:48:56.400 --> 00:48:59.400
on how we actually store our data.

00:48:59.400 --> 00:49:03.780
So in simplest form, our
database might look like this.

00:49:03.780 --> 00:49:06.510
And for historical reasons, it's
generally drawn as a cylinder.

00:49:06.510 --> 00:49:08.580
And this is our database.

00:49:08.580 --> 00:49:12.210
Now, it's immediately
obvious that if all servers--

00:49:12.210 --> 00:49:16.620
1, 2, dot, dot, dot, n-- need to
save information or read information

00:49:16.620 --> 00:49:20.800
from a database, they've all got to
somehow communicate with that database

00:49:20.800 --> 00:49:25.300
so they all have some kind of
connectivity, physically or otherwise.

00:49:25.300 --> 00:49:29.850
So this seems fine so long as the
software that's running on servers 1,

00:49:29.850 --> 00:49:32.680
2, dot, dot, dot, n-- no matter
what language we're using,

00:49:32.680 --> 00:49:36.330
whether it's Java or Python or
PHP or C# or something else--

00:49:36.330 --> 00:49:40.740
so long as those servers can talk
to, via the network, this database,

00:49:40.740 --> 00:49:41.880
that's great.

00:49:41.880 --> 00:49:44.062
They can all save their
data to the same place,

00:49:44.062 --> 00:49:46.270
and they can all read their
data from the same place.

00:49:46.270 --> 00:49:48.360
So everything stays nicely in sync.

00:49:48.360 --> 00:49:51.030
But what's the first problem
that motivated the entirety

00:49:51.030 --> 00:49:54.680
of this discussion from the outset?

00:49:54.680 --> 00:49:59.440
Well, what if one database
isn't really enough?

00:49:59.440 --> 00:50:03.310
Well, we could take the
approach of vertically scaling

00:50:03.310 --> 00:50:07.210
our architecture, which is another
piece of jargon in this space.

00:50:07.210 --> 00:50:14.530
So vertical scaling means if your
one database isn't quite up to snuff,

00:50:14.530 --> 00:50:18.670
and you're running low on disk
space or capacity because the number

00:50:18.670 --> 00:50:22.360
of requests per second is, of course,
limited, you know what you can do?

00:50:22.360 --> 00:50:30.310
You can go ahead and disconnect this one
and go ahead and put in a bigger one,

00:50:30.310 --> 00:50:33.130
and therefore increase your capacity.

00:50:33.130 --> 00:50:36.640
And vertical scaling means
to really pay more money

00:50:36.640 --> 00:50:39.640
or get something higher
end, a higher, more premium

00:50:39.640 --> 00:50:42.730
model, a more expensive model that's
got more disk space and more RAM

00:50:42.730 --> 00:50:44.750
and a faster CPU or more CPUs.

00:50:44.750 --> 00:50:46.720
So you just throw
hardware at the problem--

00:50:46.720 --> 00:50:50.680
not in the sense of multiple servers,
but just one bigger and better server.

00:50:50.680 --> 00:50:52.117
But what are the challenges here?

00:50:52.117 --> 00:50:53.950
Well, if you've ever
bought a home computer,

00:50:53.950 --> 00:50:57.220
odds are whether it's been on Dell's
site or Microsoft's or Apple's

00:50:57.220 --> 00:51:00.310
or the like, you often have
this good, better, best thing

00:51:00.310 --> 00:51:04.090
where, for the top of the
line laptop or desktop,

00:51:04.090 --> 00:51:06.640
you're going to be
paying through the roof--

00:51:06.640 --> 00:51:08.620
through the nose, so to speak.

00:51:08.620 --> 00:51:11.470
You're going to be paying a premium
for that top of the line model.

00:51:11.470 --> 00:51:14.178
But you might actually be able to
save a decent number of dollars

00:51:14.178 --> 00:51:17.470
by going for the second
best or the third best,

00:51:17.470 --> 00:51:21.340
because the marginal gains
of each additional dollar

00:51:21.340 --> 00:51:22.695
really aren't all that much.

00:51:22.695 --> 00:51:24.820
Because for marketing
reasons, they know that there

00:51:24.820 --> 00:51:26.778
might be some people out
there that will always

00:51:26.778 --> 00:51:28.330
pay top dollar for the fastest one.

00:51:28.330 --> 00:51:30.163
But just because you're
paying twice as much

00:51:30.163 --> 00:51:33.650
doesn't mean the laptop is going
to be twice as good, for instance.

00:51:33.650 --> 00:51:37.120
So this is to say to vertically
scale your database, you might end up

00:51:37.120 --> 00:51:40.810
paying, through the nose, for some
very expensive hardware just

00:51:40.810 --> 00:51:43.820
to eke out some more performance.

00:51:43.820 --> 00:51:45.610
But that's not even the biggest problem.

00:51:45.610 --> 00:51:48.350
The most fundamental problem
is at the end of the day,

00:51:48.350 --> 00:51:53.050
there is a top-of-the-line server for
your database that only can support

00:51:53.050 --> 00:51:56.020
a finite number of database
connections at a time,

00:51:56.020 --> 00:51:58.480
or a finite number of reads
or writes, so to speak,

00:51:58.480 --> 00:52:00.357
saving and reading from the database.

00:52:00.357 --> 00:52:03.190
So at some point or other, it doesn't
matter how much money you have

00:52:03.190 --> 00:52:05.530
or how willing you are to
throw hardware at the problem.

00:52:05.530 --> 00:52:10.070
There exists no single server that can handle
as many users as you currently have.

00:52:10.070 --> 00:52:14.320
So at some point, you actually
have to put away your wallet

00:52:14.320 --> 00:52:17.680
and put back on the engineering
hat alone and figure out

00:52:17.680 --> 00:52:24.220
how to not vertically scale, but
horizontally scale your architecture.

00:52:24.220 --> 00:52:29.860
And by this, I mean actually introducing
not just one big, fancy server,

00:52:29.860 --> 00:52:33.112
but two or more maybe
smaller, cheaper servers.

00:52:33.112 --> 00:52:35.320
In fact, one of the things
that companies like Google

00:52:35.320 --> 00:52:37.810
were especially good
at early on was using

00:52:37.810 --> 00:52:42.910
off-the-shelf, inexpensive hardware and
building supercomputers out of them,

00:52:42.910 --> 00:52:44.770
but much more economically
than they might

00:52:44.770 --> 00:52:46.450
have had they gone top
of the line everywhere,

00:52:46.450 --> 00:52:48.200
even though that would
mean fewer servers.

00:52:48.200 --> 00:52:50.696
Better to get more cheaper
servers and somehow

00:52:50.696 --> 00:52:53.320
figure out how to interconnect
them and write the software that

00:52:53.320 --> 00:52:56.410
lets them all be useful
simultaneously so that we can instead

00:52:56.410 --> 00:52:59.620
have a picture that looks a
bit more like this, with maybe

00:52:59.620 --> 00:53:03.310
a pair of databases in the picture now.

00:53:03.310 --> 00:53:05.650
Of course, we've now
created that same problem

00:53:05.650 --> 00:53:09.130
that we had earlier about
where does the data go.

00:53:09.130 --> 00:53:10.990
Where does the traffic
or the users flow,

00:53:10.990 --> 00:53:14.960
especially now where we have one
on the left and one on the right?

00:53:14.960 --> 00:53:18.460
So there's a couple of solutions here,
but there are some different problems

00:53:18.460 --> 00:53:19.900
that arise with databases.

00:53:19.900 --> 00:53:27.600
If we very simply put a load balancer in
here, LB, and route traffic uniformly--

00:53:27.600 --> 00:53:30.010
say, to the left or to the right--

00:53:30.010 --> 00:53:32.180
that's probably not the best thing.

00:53:32.180 --> 00:53:34.960
Because then you're going to
end up with a world where you're

00:53:34.960 --> 00:53:39.460
saving some data for a user
here and some data for a user

00:53:39.460 --> 00:53:42.757
here just by chance, because you're
using round robin, so to speak,

00:53:42.757 --> 00:53:45.340
or just some probabilistic
heuristic where some of the traffic

00:53:45.340 --> 00:53:47.140
goes this way, some of
the traffic goes that way.

00:53:47.140 --> 00:53:48.160
And that's not so good.
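
NOTE
Editor's sketch of why plain round robin is a poor fit for databases, in
Python. The database names "left" and "right" follow the drawing described
above; the function name and the sample writes are hypothetical.
```python
from itertools import cycle
def round_robin_placements(writes: list[str], databases: list[str]) -> list[tuple[str, str]]:
    """Pair each write with the next database in strict rotation."""
    rotation = cycle(databases)
    return [(write, next(rotation)) for write in writes]
placements = round_robin_placements(
    ["A: profile", "A: photo", "A: post"], ["left", "right"]
)
if __name__ == "__main__":
    for write, database in placements:
        print(write, "stored on", database)
```
All three writes belong to User A, yet they end up split across both
databases, so neither one holds the user's complete data.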

00:53:48.160 --> 00:53:48.670
OK.

00:53:48.670 --> 00:53:54.820
But we could solve that by somehow
making sure that if this user, User A,

00:53:54.820 --> 00:54:00.400
visits my web site, I should always
send him or her to the same database.

00:54:00.400 --> 00:54:02.440
And you can do this in a couple of ways.

00:54:02.440 --> 00:54:04.510
You can enforce some
notion of stickiness,

00:54:04.510 --> 00:54:07.180
so to speak, whereby you
somehow notice that, oh, this is

00:54:07.180 --> 00:54:09.010
User A. We've seen him or her before.

00:54:09.010 --> 00:54:12.130
Let's make sure we send him
to this database on the left

00:54:12.130 --> 00:54:14.020
and not the one on the right.
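
NOTE
Editor's sketch of "stickiness", in Python: remember where each user was first
sent and keep sending them there. The class name and the fewest-users rule for
placing newcomers are assumptions; real load balancers use various mechanisms
that this does not try to reproduce.
```python
class StickyRouter:
    """Remembers which database each user was first sent to."""
    def __init__(self, databases):
        self.databases = list(databases)
        self.assigned = {}  # user id mapped to database name
    def route(self, user_id):
        if user_id not in self.assigned:
            # New user: pick the database with the fewest assigned users so far.
            counts = {db: 0 for db in self.databases}
            for db in self.assigned.values():
                counts[db] += 1
            self.assigned[user_id] = min(self.databases, key=lambda db: counts[db])
        return self.assigned[user_id]
if __name__ == "__main__":
    router = StickyRouter(["left", "right"])
    for user in ("A", "B", "A", "A"):
        print(user, "goes to", router.route(user))
```
User A lands on the same database on every visit, no matter how many other
users arrive in between.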

00:54:14.020 --> 00:54:18.070
Or you can more formally use
a process known as sharding.

00:54:18.070 --> 00:54:20.380
In fact, this is very common
early on in databases,

00:54:20.380 --> 00:54:24.010
and even in websites like Facebook,
where you have so many users

00:54:24.010 --> 00:54:26.830
that you need to start splitting
them across multiple databases.

00:54:26.830 --> 00:54:28.360
But gosh, how to do that?

00:54:28.360 --> 00:54:31.300
Back in the earliest days of
Facebook, what they might have done

00:54:31.300 --> 00:54:35.275
was put all Harvard users on one
database, all MIT users on another,

00:54:35.275 --> 00:54:37.600
all BU users on another, and so forth.

00:54:37.600 --> 00:54:40.210
Because Facebook, as you may
recall, started scaling out

00:54:40.210 --> 00:54:41.620
initially to disparate schools.

00:54:41.620 --> 00:54:44.410
That was a wonderful
opportunity to shard

00:54:44.410 --> 00:54:49.930
their data by putting similar users
in their respective databases.

00:54:49.930 --> 00:54:51.700
And at the time, I
think you couldn't even

00:54:51.700 --> 00:54:54.240
be friends with people in other
schools, at least very early

00:54:54.240 --> 00:54:58.050
on, because those databases,
presumably, were independent,

00:54:58.050 --> 00:55:01.440
or certainly could
have been topologically.

00:55:01.440 --> 00:55:04.530
Or you might do something more
simple that doesn't create

00:55:04.530 --> 00:55:06.420
some problems like isolation there.

00:55:06.420 --> 00:55:10.320
Maybe all of your users whose last
names start with A go on one server,
name start with A go on one server,

00:55:10.320 --> 00:55:12.400
and all of your users
whose names start with B

00:55:12.400 --> 00:55:14.170
go on another server, and so forth.

00:55:14.170 --> 00:55:18.330
So you can almost hash your users, to
borrow some terminology from hash tables,

00:55:18.330 --> 00:55:20.970
and decide where to put that data.
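
NOTE
Editor's sketch of two ways to pick a shard, in Python: by the first letter of
a last name, as described above, and by hashing a whole identifier. Function
names, the default of 26 shards, and the use of SHA-256 are assumptions chosen
for illustration, not anything a specific site is known to use.
```python
import hashlib
def shard_by_initial(last_name: str, n_shards: int = 26) -> int:
    """A names land on shard 0, B names on shard 1, and so on."""
    name = last_name.strip().upper()
    if not name or not ("A" <= name[0] <= "Z"):
        raise ValueError("expected a name starting with a letter A-Z")
    return (ord(name[0]) - ord("A")) % n_shards
def shard_by_hash(user_id: str, n_shards: int) -> int:
    """Hash the whole identifier so the same user always maps to the same shard."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards
if __name__ == "__main__":
    print(shard_by_initial("Adams"), shard_by_initial("Baker"))
    print(shard_by_hash("user-42", 4))
```
Sharding by initial is easy to reason about but uneven, since some letters are
far more common than others; hashing tends to spread users more evenly at the
cost of being less human-readable.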

00:55:20.970 --> 00:55:24.690
Of course, that does not help
with backups or redundancy.

00:55:24.690 --> 00:55:28.470
Because if you're putting all of your
A names here and all of your B names

00:55:28.470 --> 00:55:31.230
here, what happens, god forbid,
if one of the servers goes down?

00:55:31.230 --> 00:55:33.600
You've lost half of your customers.

00:55:33.600 --> 00:55:36.390
So it would seem that no matter
how you balance the load,

00:55:36.390 --> 00:55:39.850
you really want to maintain
duplicates of data.

00:55:39.850 --> 00:55:42.870
And so there's a few different
ways people solve this.

00:55:42.870 --> 00:55:45.510
In fact, let me go
ahead and temporarily go

00:55:45.510 --> 00:55:50.940
back to that first model, where we
had a really fancy, bigger database

00:55:50.940 --> 00:55:53.670
that I'll deliberately
draw as pretty big.

00:55:53.670 --> 00:55:57.450
And this is big in the sense that
it can respond to requests quickly

00:55:57.450 --> 00:55:59.280
and it can store a lot of data.

00:55:59.280 --> 00:56:03.630
This might be generally called our
primary or our master database.

00:56:03.630 --> 00:56:06.420
And it's where our data
goes to live long term.

00:56:06.420 --> 00:56:09.900
It's where data is written to, so to
speak, and could also be read from.

00:56:09.900 --> 00:56:13.380
But if we're going to bump up
against some limit of how much work

00:56:13.380 --> 00:56:15.450
this database can do
at once, it would be

00:56:15.450 --> 00:56:18.960
nice to have some secondary
servers or tertiary servers.

00:56:18.960 --> 00:56:24.240
So a very common paradigm would be to
use this primary database for writes--

00:56:24.240 --> 00:56:25.860
we'll abbreviate it W--

00:56:25.860 --> 00:56:29.790
and then also have maybe a couple
of smaller databases, or even

00:56:29.790 --> 00:56:35.610
the same size databases, that are
meant for reads, abbreviated R.

00:56:35.610 --> 00:56:39.600
And so long as these databases are
somehow talking to one another,

00:56:39.600 --> 00:56:41.400
this topology will just work.

00:56:41.400 --> 00:56:43.410
This is a feature known as replication.

00:56:43.410 --> 00:56:46.650
So long as the databases
are configured in such a way

00:56:46.650 --> 00:56:50.310
that any time data is written to
the primary database or the master

00:56:50.310 --> 00:56:55.440
database, that data gets replicated
to any replicas, as they're called.

00:56:55.440 --> 00:57:02.539
Meanwhile, servers 1, 2, and n should
also be able to talk to these replicas.

00:57:02.539 --> 00:57:05.580
And if your code is smart enough--
and you would have to think about this

00:57:05.580 --> 00:57:10.260
and design this into your codebase--
you could ensure that any time you

00:57:10.260 --> 00:57:15.270
read data from a database, it comes from
one, or really any, of your replicas,

00:57:15.270 --> 00:57:18.240
replicas in the sense that they
are meant to have duplicate data.

00:57:18.240 --> 00:57:22.170
But anytime you write data-- a
SQL INSERT or UPDATE or DELETE,

00:57:22.170 --> 00:57:24.150
as opposed to a SQL SELECT--

00:57:24.150 --> 00:57:28.710
you only send your write operations
to the primary or master database

00:57:28.710 --> 00:57:32.026
and leave it to it to then
replicate it to the read replicas.
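
NOTE
Editor's sketch of the read/write split just described, in Python. The
function name and the server names are hypothetical, and inspecting only the
first SQL keyword is a simplification: SELECT goes to a replica, and anything
else goes to the primary to be safe.
```python
import random
def choose_database(sql: str, primary: str, replicas: list[str]) -> str:
    """Send SELECTs to some replica; send every other statement to the primary."""
    words = sql.split()
    verb = words[0].upper() if words else ""
    if verb == "SELECT" and replicas:
        return random.choice(replicas)
    return primary
if __name__ == "__main__":
    replicas = ["replica-1", "replica-2"]
    for statement in ("SELECT * FROM users", "INSERT INTO users VALUES (1)",
                      "UPDATE users SET name = 'x'", "DELETE FROM users"):
        print(statement, "goes to", choose_database(statement, "primary", replicas))
```
This is the logic you would have to design into your own codebase (or get from
a library or proxy) so that application servers 1, 2, and n pick the right
database for each query.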

00:57:32.026 --> 00:57:33.900
Now, of course, there
are some problems here.

00:57:33.900 --> 00:57:35.070
There's some latency, potentially.

00:57:35.070 --> 00:57:36.320
Maybe it takes a split second.

00:57:36.320 --> 00:57:39.190
Maybe it takes a couple seconds
for that data to replicate.

00:57:39.190 --> 00:57:43.440
So things might not appear to
be updated instantaneously.
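
NOTE
Editor's sketch of replication lag, in Python. It is a toy in-memory model
with hypothetical class names, not a real database: writes sit in a queue on
the primary until replicate() is called, which stands in for the split second
or couple of seconds mentioned above.
```python
class Replica:
    def __init__(self):
        self.data = {}
    def read(self, key):
        return self.data.get(key)
class Primary:
    def __init__(self):
        self.data = {}
        self.pending = []  # writes not yet shipped to replicas
        self.replicas = []
    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))
    def replicate(self):
        """Ship queued writes to every replica, in the order they arrived."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()
if __name__ == "__main__":
    primary, replica = Primary(), Replica()
    primary.replicas.append(replica)
    primary.write("cart", "book")
    print(replica.read("cart"))  # not replicated yet
    primary.replicate()
    print(replica.read("cart"))  # now visible on the replica
```
Between write() and replicate(), a read from the replica returns stale data,
which is exactly the "not updated instantaneously" effect.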

00:57:43.440 --> 00:57:48.900
But you have now a very scalable model
in that if you have the money to spend,

00:57:48.900 --> 00:57:54.000
you can even have more read replicas and
have even more and more read capacity.

00:57:54.000 --> 00:57:58.080
Of course, you're going to eventually
bump up against a limit on your writes,

00:57:58.080 --> 00:58:00.840
at which point we need to
introduce another solution.

00:58:00.840 --> 00:58:03.690
But again, this is a very
incremental approach.

00:58:03.690 --> 00:58:06.890
And we can throw a little bit of
money at the problem each time

00:58:06.890 --> 00:58:09.090
and a little bit of
engineering wherewithal

00:58:09.090 --> 00:58:11.491
in order to at least get us
over that next ledge, which

00:58:11.491 --> 00:58:14.490
is super important, certainly, when
you're first building your business.

00:58:14.490 --> 00:58:17.582
If you don't necessarily have the
resources to go all in on things,

00:58:17.582 --> 00:58:19.290
you at least want to
get over this hurdle

00:58:19.290 --> 00:58:24.340
or at least build in some capacity
for the next load of users.

00:58:24.340 --> 00:58:27.300
So what if we run out
of capacity, though,

00:58:27.300 --> 00:58:31.090
with that writable server,
the master database, so to speak?

00:58:31.090 --> 00:58:33.270
We need to be a little more clever.

00:58:33.270 --> 00:58:37.860
And it turns out we can borrow this
idea of these horizontal arrows

00:58:37.860 --> 00:58:43.110
here to replicate our data, but
for a slightly different purpose.

00:58:43.110 --> 00:58:47.400
We could still have a pretty
souped up writable database.

00:58:47.400 --> 00:58:51.750
But we could have another one, maybe
identical in its specs, writable.

00:58:51.750 --> 00:58:54.960
But somehow, these things need to be
able to synchronize with themselves.

00:58:54.960 --> 00:58:57.920
And maybe there's still some
read replicas over here--

00:58:57.920 --> 00:59:01.200
R for read, and another
one over here, R for read.

00:59:01.200 --> 00:59:03.750
And these are all somehow
interconnected as well.

00:59:03.750 --> 00:59:07.860
But you can have what's called
master-master replication, whereby

00:59:07.860 --> 00:59:12.144
your server's code writes
to one of these servers.

00:59:12.144 --> 00:59:13.560
And maybe it's either of them now.

00:59:13.560 --> 00:59:15.840
Maybe the load balancer actually
does send some of the writes

00:59:15.840 --> 00:59:17.423
this way, some of the writes this way.

00:59:17.423 --> 00:59:20.190
But the master databases,
the writable ones now,

00:59:20.190 --> 00:59:24.534
are configured, in software, to
replicate horizontally, so to speak.

00:59:24.534 --> 00:59:26.700
So here too, you might have
a little bit of latency.

00:59:26.700 --> 00:59:28.491
It might take a few
milliseconds or seconds

00:59:28.491 --> 00:59:30.060
for the data to actually replicate.

00:59:30.060 --> 00:59:34.800
But at least now we've doubled
the capacity for our writes

00:59:34.800 --> 00:59:38.560
so as to handle twice as
many writable operations.
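
NOTE
A rough sketch of spreading writes across two writable servers, under stated
assumptions: the Primary class is hypothetical, replication here is an
immediate in-memory copy, and the latency and conflicting-write problems that
real master-master setups must handle are ignored.
```python
import itertools
class Primary:
    """A writable server that copies each write to its peer (illustrative only)."""
    def __init__(self, name):
        self.name = name
        self.rows = []
        self.peer = None
        self.accepted = 0
    def write(self, row):
        # Count the write as accepted here, then replicate it to the peer.
        self.accepted += 1
        self.rows.append(row)
        if self.peer is not None:
            self.peer.rows.append(row)
a, b = Primary("A"), Primary("B")
a.peer, b.peer = b, a
# Alternate incoming writes between the two primaries, as a load balancer might.
targets = itertools.cycle([a, b])
for row in ["w1", "w2", "w3", "w4"]:
    next(targets).write(row)
print(a.accepted, b.accepted)  # 2 2
print(a.rows == b.rows)  # True
```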

00:59:38.560 --> 00:59:41.790
And we can continue to hang more
and more read replicas off of these

00:59:41.790 --> 00:59:46.740
if you want in order to
handle more and more users.

00:59:46.740 --> 00:59:50.950
And so this is the challenge and,
dare say, the fun of engineering

00:59:50.950 --> 00:59:54.192
architecturally-- understanding
some of these basic building blocks.

00:59:54.192 --> 00:59:56.650
And even if you might not know
the particular manufacturers

00:59:56.650 --> 00:59:59.140
or how you physically
configure the servers,

00:59:59.140 --> 01:00:02.300
or how in software you configure
these servers, at the end of the day,

01:00:02.300 --> 01:00:05.950
these really are just puzzle pieces
that can somehow be interlocked.

01:00:05.950 --> 01:00:09.280
And these puzzle pieces can be used
to solve more and more interesting

01:00:09.280 --> 01:00:10.090
problems.

01:00:10.090 --> 01:00:15.760
But to our discussion of PaaS and Software
as a Service and Infrastructure

01:00:15.760 --> 01:00:19.490
as a Service, there's also these
different layers of abstraction.

01:00:19.490 --> 01:00:22.810
And so thematic throughout
this in all of our discussions

01:00:22.810 --> 01:00:23.850
has been this layering.

01:00:23.850 --> 01:00:27.802
Indeed, we started, really, down here
with those zeros and ones and bits,

01:00:27.802 --> 01:00:30.010
and very quickly went to
ASCII, and very quickly went

01:00:30.010 --> 01:00:33.010
to colors and images
and videos and so forth.

01:00:33.010 --> 01:00:36.042
Because once you understand some of
those ingredients or puzzle pieces,

01:00:36.042 --> 01:00:37.750
can you build something
more interesting.

01:00:37.750 --> 01:00:39.790
And then can you slap a name on it--

01:00:39.790 --> 01:00:43.392
sometimes cryptic, like
IaaS, or PaaS, or SaaS.

01:00:43.392 --> 01:00:45.100
But at the end of the
day, those are just

01:00:45.100 --> 01:00:48.580
labels that describe, really,
black boxes, inside of which

01:00:48.580 --> 01:00:52.360
is a decent amount of complexity,
a clever amount of engineering,

01:00:52.360 --> 01:00:55.270
but ultimately, a solution to a problem.

01:00:55.270 --> 01:00:58.810
And so in cloud computing, do we really
have this catch-all phrase that's

01:00:58.810 --> 01:01:03.430
referring to a whole class of solutions
to problems that ultimately are all

01:01:03.430 --> 01:01:07.090
about getting one's business or
getting one's personal website

01:01:07.090 --> 01:01:11.140
out on the internet for users to
access, whether via laptops or desktops

01:01:11.140 --> 01:01:12.820
or mobile devices and more.

01:01:12.820 --> 01:01:14.710
So at the end of the
day, what is the cloud?

01:01:14.710 --> 01:01:16.240
It's this evolving definition.

01:01:16.240 --> 01:01:21.230
It's this evolving class of services
that just continues to grow.

01:01:21.230 --> 01:01:23.840
But each of those services
is solving a problem.

01:01:23.840 --> 01:01:30.880
Each of those problems derives from
plugging one hole in a leaky hose,

01:01:30.880 --> 01:01:33.580
seeing another one spring up,
and then addressing that one,

01:01:33.580 --> 01:01:36.430
and then layering on top of those
solutions these abstractions,

01:01:36.430 --> 01:01:39.700
and ultimately some marketing
speak, like cloud computing itself,

01:01:39.700 --> 01:01:43.120
so that you can build, out of these
more sophisticated puzzle pieces,

01:01:43.120 --> 01:01:45.640
bigger and better solutions
to actual problems

01:01:45.640 --> 01:01:49.860
you have when you're trying
to build your own site.