1
00:00:00,000 --> 00:00:03,920
[MUSIC PLAYING]

2
00:00:03,920 --> 00:00:15,537


3
00:00:15,537 --> 00:00:18,870
BRIAN YU: Welcome back, everyone, to Web
Programming with Python and JavaScript.

4
00:00:18,870 --> 00:00:21,940
And today we're going to look at
things from a different perspective.

5
00:00:21,940 --> 00:00:25,110
So we've spent the past several weeks
working on designing and building

6
00:00:25,110 --> 00:00:27,990
and programming web applications
using Python and JavaScript.

7
00:00:27,990 --> 00:00:30,630
We've talked about using
frameworks like Flask and Django

8
00:00:30,630 --> 00:00:33,660
in order to actually write the code
that will run on our web servers

9
00:00:33,660 --> 00:00:35,940
and then writing JavaScript code
that runs on the client side

10
00:00:35,940 --> 00:00:39,000
inside of a user's browser in order
to allow for additional functionality

11
00:00:39,000 --> 00:00:39,750
to happen.

12
00:00:39,750 --> 00:00:43,200
But the focus today is going to be
less about actually writing the web

13
00:00:43,200 --> 00:00:46,350
applications but what happens after
you've written those web applications.

14
00:00:46,350 --> 00:00:48,090
And you want to take
your web application

15
00:00:48,090 --> 00:00:49,140
and deploy it to the internet.

16
00:00:49,140 --> 00:00:51,140
After you've written it,
after you've tested it,

17
00:00:51,140 --> 00:00:54,750
what concerns have to be considered
when we think about taking that web

18
00:00:54,750 --> 00:00:56,970
application and then deploying it?

19
00:00:56,970 --> 00:00:59,970
And the main focus of today is
going to be all about scalability,

20
00:00:59,970 --> 00:01:03,810
this idea that a web application might
work well if just a couple of users

21
00:01:03,810 --> 00:01:04,455
are using it.

22
00:01:04,455 --> 00:01:07,080
But what happens when the web
application starts to get popular

23
00:01:07,080 --> 00:01:08,400
as more people start to use it?

24
00:01:08,400 --> 00:01:12,270
And as your application starts to have
to deal with multiple different people

25
00:01:12,270 --> 00:01:15,270
potentially accessing
your data at the same time

26
00:01:15,270 --> 00:01:17,970
and trying to use your
application at the same time, what

27
00:01:17,970 --> 00:01:21,630
sorts of considerations do you need
to take into account when that starts

28
00:01:21,630 --> 00:01:22,740
to happen?

29
00:01:22,740 --> 00:01:24,900
And so we can begin
with a simple picture.

30
00:01:24,900 --> 00:01:28,380
We might imagine that this diagram
here represents your web server.

31
00:01:28,380 --> 00:01:30,450
And when a user comes
along, that user is

32
00:01:30,450 --> 00:01:32,580
going to be connecting to
that web server somehow.

33
00:01:32,580 --> 00:01:34,371
They're going to connect
to the web server.

34
00:01:34,371 --> 00:01:37,320
Your server, which might be running
Flask or Django or some other web

35
00:01:37,320 --> 00:01:40,080
framework, is going to need
to process that request,

36
00:01:40,080 --> 00:01:43,020
figure out what sort of response
to present back to the user,

37
00:01:43,020 --> 00:01:45,210
and then deliver that
response back to the user.

38
00:01:45,210 --> 00:01:51,270
But a server can only do
finitely many things per second.

39
00:01:51,270 --> 00:01:54,030
How do we typically measure
how many things a server

40
00:01:54,030 --> 00:01:55,620
can do in a given amount of time?

41
00:01:55,620 --> 00:01:57,540
Any idea what the metric
for that usually is?

42
00:01:57,540 --> 00:02:00,797


43
00:02:00,797 --> 00:02:03,130
So the standard metric for
that is a unit of measurement

44
00:02:03,130 --> 00:02:06,070
called hertz, which represents
the number of calculations

45
00:02:06,070 --> 00:02:07,900
that a computer can
do in a given second.

46
00:02:07,900 --> 00:02:09,941
Or more commonly, as we
hear nowadays, gigahertz,

47
00:02:09,941 --> 00:02:12,400
or billions of computations,
which are very simple

48
00:02:12,400 --> 00:02:16,840
computations like adding two numbers
together or checking whether or not

49
00:02:16,840 --> 00:02:17,980
a number is equal to zero.

50
00:02:17,980 --> 00:02:21,190
Simple calculations like that
amassed over billions and billions

51
00:02:21,190 --> 00:02:23,050
of computations is
generally the way we'll

52
00:02:23,050 --> 00:02:25,870
measure how many things a server
can do at the same time in a given

53
00:02:25,870 --> 00:02:28,690
amount of time, like a period
of one second, for instance.

54
00:02:28,690 --> 00:02:32,414
And so a given server can only
do finitely many number of things

55
00:02:32,414 --> 00:02:34,330
in a given amount of
time-- in a given second,

56
00:02:34,330 --> 00:02:36,250
for instance-- which
means that there are only

57
00:02:36,250 --> 00:02:39,190
a finitely many number of users
that a server could potentially

58
00:02:39,190 --> 00:02:40,780
respond to in a given second.

59
00:02:40,780 --> 00:02:45,280
And so if a server can only respond
to 100 users in a given second, what

60
00:02:45,280 --> 00:02:48,400
happens when user number
101 comes along and tries

61
00:02:48,400 --> 00:02:50,670
to make a request to the
server in that same second?

62
00:02:50,670 --> 00:02:52,420
How is the server going
to deal with that?

63
00:02:52,420 --> 00:02:54,970
And these issues surrounding
scalability are the issues

64
00:02:54,970 --> 00:02:56,050
that we're going to be exploring today.

65
00:02:56,050 --> 00:02:59,216
What happens when it's not just one
user trying to connect to our server

66
00:02:59,216 --> 00:03:02,560
but potentially many users that are
all trying to connect to our server

67
00:03:02,560 --> 00:03:03,890
at the same time?

68
00:03:03,890 --> 00:03:06,670
So what are some ideas for how we
might deal with this situation?

69
00:03:06,670 --> 00:03:09,520
Our server is a finite
machine that can only

70
00:03:09,520 --> 00:03:11,432
deal with so many users per second.

71
00:03:11,432 --> 00:03:13,390
And suddenly we find that
our web application's

72
00:03:13,390 --> 00:03:16,210
gotten popular enough that
we have more than that number

73
00:03:16,210 --> 00:03:19,000
of users trying to access our
application at the same time.

74
00:03:19,000 --> 00:03:23,160
What might we want to do about that?

75
00:03:23,160 --> 00:03:27,516
AUDIENCE: You could add more memory
and resources to the actual server

76
00:03:27,516 --> 00:03:29,814
to make it a beefier kind of machine.

77
00:03:29,814 --> 00:03:30,730
BRIAN YU: Yeah, great.

78
00:03:30,730 --> 00:03:32,970
We could add more resources
to the server we have.

79
00:03:32,970 --> 00:03:34,660
We can add more memory to the server.

80
00:03:34,660 --> 00:03:37,860
We can, in other words, try and
make the server faster, increase

81
00:03:37,860 --> 00:03:40,169
the processing power of that server.

82
00:03:40,169 --> 00:03:42,460
And so this is something we
might-- well, first of all,

83
00:03:42,460 --> 00:03:44,710
before we get there, I'll talk
a little bit about benchmarking.

84
00:03:44,710 --> 00:03:46,543
So benchmarking is
something you'll probably

85
00:03:46,543 --> 00:03:49,020
want to do first, this process
of figuring out just how

86
00:03:49,020 --> 00:03:51,150
much your server can actually handle.

87
00:03:51,150 --> 00:03:54,180
Your server has a maximum
capacity, but you might not

88
00:03:54,180 --> 00:03:57,180
know upfront just what
that capacity is, just

89
00:03:57,180 --> 00:03:58,920
how many users your server can handle.

90
00:03:58,920 --> 00:04:02,891
And it's probably not a good idea to
go about waiting until that server hits

91
00:04:02,891 --> 00:04:05,640
the capacity, until you've reached
the point where your server can

92
00:04:05,640 --> 00:04:08,223
no longer handle any more users,
before you realize, oh, yeah,

93
00:04:08,223 --> 00:04:09,540
that's what the capacity is.

94
00:04:09,540 --> 00:04:11,456
And so benchmarking is
something you'll likely

95
00:04:11,456 --> 00:04:15,210
want to do first in order to load test
or stress test, as it's often called--

96
00:04:15,210 --> 00:04:18,351
testing those servers in order to
make sure you know what the limit is.
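
[CODE SKETCH] A rough illustration of the load-testing idea just described, not a production benchmark. It assumes a hypothetical stand-in handler, handle_request, that sleeps briefly instead of doing real work, and it measures how many simulated requests complete per second at different levels of concurrency. The function names and the numbers are assumptions made for illustration; a real load test would send actual HTTP requests to the deployed server.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(work_seconds=0.01):
    """Hypothetical stand-in for a real request handler (not from the lecture)."""
    time.sleep(work_seconds)  # pretend the server is doing some work
    return 200


def measure_throughput(total_requests=200, concurrency=20):
    """Send total_requests simulated requests and return requests completed per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        statuses = list(pool.map(lambda _: handle_request(), range(total_requests)))
    elapsed = time.perf_counter() - start
    # Every simulated request should have succeeded.
    assert all(status == 200 for status in statuses)
    return total_requests / elapsed


if __name__ == "__main__":
    # Raise the number of simultaneous clients and watch how throughput changes.
    for concurrency in (1, 5, 20):
        rate = measure_throughput(concurrency=concurrency)
        print(f"{concurrency:>2} concurrent clients: {rate:.0f} requests/second")
```

Against a real server, the measured rate would stop improving, or response times would climb, once the server nears its capacity. That is the limit benchmarking is meant to find.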

97
00:04:18,351 --> 00:04:20,100
And once you know that,
then you can start

98
00:04:20,100 --> 00:04:22,812
to think about what to do if you
were to ever exceed that limit.

99
00:04:22,812 --> 00:04:25,020
And so the idea that was
brought up here is something

100
00:04:25,020 --> 00:04:26,561
that we might call vertical scaling--

101
00:04:26,561 --> 00:04:30,060
this idea that if our server
as it is now isn't good enough,

102
00:04:30,060 --> 00:04:31,920
isn't performant enough
in order to handle

103
00:04:31,920 --> 00:04:36,120
all of the users that might be coming
in order to use our web application,

104
00:04:36,120 --> 00:04:39,180
then what we might want to do is
scale that server up and make it

105
00:04:39,180 --> 00:04:43,254
a larger server, for instance, that is
able to have more processing capacity,

106
00:04:43,254 --> 00:04:45,420
that's able to operate
faster, that has more memory,

107
00:04:45,420 --> 00:04:48,607
for instance, that can then allow it
to handle that additional capacity.

108
00:04:48,607 --> 00:04:51,690
So you might imagine that if this is
our server and this is its connection

109
00:04:51,690 --> 00:04:54,356
and we realize that more and more
connections are going to start

110
00:04:54,356 --> 00:04:55,830
coming in, what do we need to do?

111
00:04:55,830 --> 00:04:58,710
We can vertically scale that
server, make it more performant

112
00:04:58,710 --> 00:05:01,680
by adding more memory, for instance,
to that server in order to allow

113
00:05:01,680 --> 00:05:03,970
it to respond to that sort of thing.

114
00:05:03,970 --> 00:05:07,130
So what are the drawbacks or
limitations of vertical scaling?

115
00:05:07,130 --> 00:05:08,880
Where might we go wrong
with this process,

116
00:05:08,880 --> 00:05:12,150
or why is it not a perfect
solution to all of our problems

117
00:05:12,150 --> 00:05:15,984
when it comes to scalability?

118
00:05:15,984 --> 00:05:20,269
AUDIENCE: Well, it's not, I mean,
I guess, for lack of a better word,

119
00:05:20,269 --> 00:05:22,810
it's not very scalable because
you just have this one machine

120
00:05:22,810 --> 00:05:24,770
that you're trying to
make bigger and bigger.

121
00:05:24,770 --> 00:05:30,255
At some point, it's probably going to
get really expensive or impossible.

122
00:05:30,255 --> 00:05:31,130
BRIAN YU: Yeah, sure.

123
00:05:31,130 --> 00:05:33,756
So this is maybe not as
scalable as we would like.

124
00:05:33,756 --> 00:05:35,630
The idea might be that
eventually we're going

125
00:05:35,630 --> 00:05:38,090
to hit a point where it's going
to be impossible to just keep

126
00:05:38,090 --> 00:05:41,131
getting a bigger and bigger server
because with a single server, wherever

127
00:05:41,131 --> 00:05:44,480
we're getting the server from, there's
probably a maximum processing power

128
00:05:44,480 --> 00:05:46,130
they can put inside of a single server.

129
00:05:46,130 --> 00:05:49,280
And so we're eventually going to hit
some sort of limit on vertical scaling

130
00:05:49,280 --> 00:05:51,770
where our servers can only
get so powerful inside of just

131
00:05:51,770 --> 00:05:53,010
a single server.

132
00:05:53,010 --> 00:05:54,379
So what might we do then?

133
00:05:54,379 --> 00:05:56,670
How can we-- if we still need
to scale our application,

134
00:05:56,670 --> 00:05:58,640
still need to deal with
more users that are all

135
00:05:58,640 --> 00:06:00,800
trying to access the
application at the same time,

136
00:06:00,800 --> 00:06:03,560
and we can't just keep
growing this one server, what

137
00:06:03,560 --> 00:06:05,757
do we do in that situation?

138
00:06:05,757 --> 00:06:07,130
AUDIENCE: Get another server.

139
00:06:07,130 --> 00:06:07,970
BRIAN YU: Get another server.

140
00:06:07,970 --> 00:06:08,520
Great.

141
00:06:08,520 --> 00:06:10,269
And so if this is
called vertical scaling,

142
00:06:10,269 --> 00:06:13,610
this idea of taking our existing
server and adding more processing

143
00:06:13,610 --> 00:06:16,730
power to it in order to make it more
performant, then adding more servers

144
00:06:16,730 --> 00:06:18,440
we might call horizontal scaling.

145
00:06:18,440 --> 00:06:21,770
The idea there being that if we
have a single server previously

146
00:06:21,770 --> 00:06:24,970
and now we want to be able to handle
more load coming from more places,

147
00:06:24,970 --> 00:06:26,540
then instead of just
having one server, maybe we

148
00:06:26,540 --> 00:06:29,180
think about splitting this up
now into two different servers

149
00:06:29,180 --> 00:06:32,960
where each of the servers is able to
handle users, able to process requests,

150
00:06:32,960 --> 00:06:38,100
and deal with users that are
coming in in that sense as well.

151
00:06:38,100 --> 00:06:42,080
But what problems might arise
now, now that we have two servers

152
00:06:42,080 --> 00:06:45,095
that we're trying to run
our web application on?

153
00:06:45,095 --> 00:06:47,945
AUDIENCE: You still want
one database, so if they're

154
00:06:47,945 --> 00:06:50,320
trying to write to the same
database, something like that.

155
00:06:50,320 --> 00:06:51,932
There might be a race condition.

156
00:06:51,932 --> 00:06:54,890
BRIAN YU: Great. So one potential
concern is what happens with the data.

157
00:06:54,890 --> 00:06:56,389
We have a database somewhere, right?

158
00:06:56,389 --> 00:06:59,120
We might be using a PostgreSQL
database, like we did in project one,

159
00:06:59,120 --> 00:07:01,449
for instance, where
both of these servers

160
00:07:01,449 --> 00:07:02,990
need to somehow access that database.

161
00:07:02,990 --> 00:07:05,516
And maybe they're accessing
the database at the same time,

162
00:07:05,516 --> 00:07:07,140
and concerns might arise there as well.

163
00:07:07,140 --> 00:07:11,510
And we'll talk about how to deal with
scaling our databases later on today as

164
00:07:11,510 --> 00:07:12,232
well.

165
00:07:12,232 --> 00:07:13,190
What else might happen?

166
00:07:13,190 --> 00:07:16,837
That's certainly something
that might come up.

167
00:07:16,837 --> 00:07:18,920
What initial challenge
might come up if a user now

168
00:07:18,920 --> 00:07:22,406
tries to access my web application?

169
00:07:22,406 --> 00:07:24,576
AUDIENCE: They don't know
which server to go to.

170
00:07:24,576 --> 00:07:25,270
BRIAN YU: Great.

171
00:07:25,270 --> 00:07:27,650
The user doesn't really
know which server to go to.

172
00:07:27,650 --> 00:07:30,020
We somehow need to have
some way of figuring out

173
00:07:30,020 --> 00:07:32,720
if a user comes in, do we send
them to this server over here

174
00:07:32,720 --> 00:07:34,230
or this server over there.

175
00:07:34,230 --> 00:07:36,025
So how do we address that problem?

176
00:07:36,025 --> 00:07:38,900
And so oftentimes this is addressed
through another piece of hardware

177
00:07:38,900 --> 00:07:43,364
that sits in between the user and the
server, which we call a load balancer.

178
00:07:43,364 --> 00:07:46,280
And the load balancer's job is
effectively to solve that very problem,

179
00:07:46,280 --> 00:07:47,900
to wait for a user to come in.

180
00:07:47,900 --> 00:07:50,540
And the load balancer simply
is going to try and detect,

181
00:07:50,540 --> 00:07:52,454
when the user comes
in, what should happen.

182
00:07:52,454 --> 00:07:54,620
Should we send the user to
this server, or should we

183
00:07:54,620 --> 00:07:56,030
send the user to that server?

184
00:07:56,030 --> 00:07:58,030
And the load balancer needs
to make those decisions.

185
00:07:58,030 --> 00:08:00,710
So when the user comes in, they
send them to either one server

186
00:08:00,710 --> 00:08:02,741
or to another server.

187
00:08:02,741 --> 00:08:04,990
So how might a load balancer
make those decisions now?

188
00:08:04,990 --> 00:08:06,830
So somehow the load
balancer needs to decide.

189
00:08:06,830 --> 00:08:07,413
User comes in.

190
00:08:07,413 --> 00:08:10,190
Do we send them to server A or server B?

191
00:08:10,190 --> 00:08:14,000
What strategies or algorithms
might a load balancer

192
00:08:14,000 --> 00:08:17,877
want to employ in order to determine
which of the different servers

193
00:08:17,877 --> 00:08:18,710
to send the user to?

194
00:08:18,710 --> 00:08:20,876
Maybe it's going to be only
two, as in this diagram.

195
00:08:20,876 --> 00:08:23,390
But maybe in the case of an
even larger web application,

196
00:08:23,390 --> 00:08:24,930
we have scaled it up to
more than two servers.

197
00:08:24,930 --> 00:08:27,110
There are three, four,
five, or even more servers

198
00:08:27,110 --> 00:08:31,089
that the load balancer needs to decide
which one should the user go to.

199
00:08:31,089 --> 00:08:32,630
And there are many potential answers.

200
00:08:32,630 --> 00:08:36,449
But what are some possibilities for what
the load balancer could be doing here?

201
00:08:36,449 --> 00:08:38,270
AUDIENCE: I've heard of round robining.

202
00:08:38,270 --> 00:08:38,870
BRIAN YU: Round robining.

203
00:08:38,870 --> 00:08:39,230
Great.

204
00:08:39,230 --> 00:08:41,960
So that if I have five different
servers, we take the first user,

205
00:08:41,960 --> 00:08:42,830
send them to server one.

206
00:08:42,830 --> 00:08:44,780
Take the second user,
send them to server two.

207
00:08:44,780 --> 00:08:46,640
Third user goes to server
three, then four, then five.

208
00:08:46,640 --> 00:08:48,620
And when the next one comes in,
we can send them back to one,

209
00:08:48,620 --> 00:08:51,860
just sort of alternating and circling
between all of the possible servers

210
00:08:51,860 --> 00:08:52,720
that we have.

211
00:08:52,720 --> 00:08:53,960
It's certainly an option.

212
00:08:53,960 --> 00:08:57,510
Other choices that we might have?

213
00:08:57,510 --> 00:08:58,010
Yeah.

214
00:08:58,010 --> 00:09:00,010
AUDIENCE: Probably communicate
with the server, see if it's busy.

215
00:09:00,010 --> 00:09:03,510
And then if it's busy, then don't send
anything to it, something like that.

216
00:09:03,510 --> 00:09:05,218
BRIAN YU: Yeah, so we
can try potentially

217
00:09:05,218 --> 00:09:06,930
communicating with the servers maybe.

218
00:09:06,930 --> 00:09:09,766
Server number one has a lot
of users on it right now,

219
00:09:09,766 --> 00:09:11,640
and server number two
doesn't have very many.

220
00:09:11,640 --> 00:09:14,310
If we could somehow get
the servers to tell us

221
00:09:14,310 --> 00:09:16,620
what the current load is,
how many users are currently

222
00:09:16,620 --> 00:09:19,652
using either one of those servers,
then maybe our load balancer can

223
00:09:19,652 --> 00:09:21,360
be intelligent about
that and figure out,

224
00:09:21,360 --> 00:09:24,930
well, if server number two doesn't have
very many users using it right now,

225
00:09:24,930 --> 00:09:28,116
then we may as well direct more
traffic there in order to help to,

226
00:09:28,116 --> 00:09:29,990
as the load balancer
name might imply, trying

227
00:09:29,990 --> 00:09:32,580
to balance out the load on
each of these two servers

228
00:09:32,580 --> 00:09:36,510
so that no one is facing a lot
more in terms of resource usage

229
00:09:36,510 --> 00:09:38,550
than the other server is.

230
00:09:38,550 --> 00:09:40,440
Other ideas that we
might throw out there?

231
00:09:40,440 --> 00:09:43,390


232
00:09:43,390 --> 00:09:43,890
OK.

233
00:09:43,890 --> 00:09:45,780
So those are some of
the basic strategies

234
00:09:45,780 --> 00:09:47,820
that might come into play when
we think about load balancing.

235
00:09:47,820 --> 00:09:49,964
One very simple option
might just be random choice

236
00:09:49,964 --> 00:09:52,630
where just, when the user comes
in, you effectively flip a coin.

237
00:09:52,630 --> 00:09:54,500
If it's heads, send them
to server A. If it's tails,

238
00:09:54,500 --> 00:09:57,583
send them to server B, where we just
try to randomly and evenly distribute

239
00:09:57,583 --> 00:09:58,110
people.

240
00:09:58,110 --> 00:10:01,440
Round robin is certainly an option,
where you circle amongst the servers

241
00:10:01,440 --> 00:10:02,400
that you do have.

242
00:10:02,400 --> 00:10:04,650
And then you have this
idea of fewest connections,

243
00:10:04,650 --> 00:10:08,610
where you check the servers and figure
out which one has the least load

244
00:10:08,610 --> 00:10:11,790
and try to send the user that comes
in to the server that has the least

245
00:10:11,790 --> 00:10:13,650
load at that particular time.
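
[CODE SKETCH] The three strategies just listed (random choice, round robin, and fewest connections) could be sketched roughly as below. The class and method names are assumptions made for illustration, and the servers are just labels such as "A" and "B". A real load balancer would also need health checks and concurrency control.

```python
import itertools
import random


class RandomChoice:
    """Pick a server at random for every request (the coin flip)."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self):
        return random.choice(self.servers)


class RoundRobin:
    """Cycle through the servers in order: 1, 2, 3, ..., then back to 1."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class FewestConnections:
    """Send each request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to whichever server was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a connection to the server closes.
        self.active[server] -= 1


if __name__ == "__main__":
    balancer = RoundRobin(["A", "B", "C"])
    print([balancer.pick() for _ in range(5)])  # ['A', 'B', 'C', 'A', 'B']
```

Note that FewestConnections has to track state for every server, while the other two need almost none.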

246
00:10:13,650 --> 00:10:16,560
And what might be some of
the drawbacks or benefits

247
00:10:16,560 --> 00:10:18,270
of these compared to each other?

248
00:10:18,270 --> 00:10:20,100
If fewest connections
seems to make sense,

249
00:10:20,100 --> 00:10:23,670
where if server A is
less busy than server B,

250
00:10:23,670 --> 00:10:26,657
then it makes sense to send the
user to server A, why might we--

251
00:10:26,657 --> 00:10:28,740
what might be a drawback
of that approach compared

252
00:10:28,740 --> 00:10:31,553
to a random choice or a
round robin-like approach?

253
00:10:31,553 --> 00:10:34,178
What are the trade-offs that we
face when making that decision?

254
00:10:34,178 --> 00:10:36,668
AUDIENCE: It can depend on
what people are actually doing.

255
00:10:36,668 --> 00:10:39,158
So even though there may be
few connections on one server,

256
00:10:39,158 --> 00:10:42,644
there may be seven
people that are actually

257
00:10:42,644 --> 00:10:45,305
using a lot of the server's
resources for something.

258
00:10:45,305 --> 00:10:45,930
BRIAN YU: Sure.

259
00:10:45,930 --> 00:10:48,600
The number of users that are
using a particular server

260
00:10:48,600 --> 00:10:51,510
might not be a perfect
proxy for how much load

261
00:10:51,510 --> 00:10:52,920
that server is actually facing.

262
00:10:52,920 --> 00:10:55,110
Because if there are a
hundred users on server one

263
00:10:55,110 --> 00:10:57,450
but they're really just looking
at a couple static pages

264
00:10:57,450 --> 00:11:00,150
and aren't doing anything very
computationally intensive,

265
00:11:00,150 --> 00:11:02,339
but people on server B,
there are fewer of them

266
00:11:02,339 --> 00:11:04,380
but they're really doing
more work, then maybe we

267
00:11:04,380 --> 00:11:07,120
would prefer to send someone to
server A instead, for instance.

268
00:11:07,120 --> 00:11:09,720
So number of users or
number of connections

269
00:11:09,720 --> 00:11:13,920
might not be the perfect way of
measuring how much activity is going on

270
00:11:13,920 --> 00:11:14,710
in the servers.

271
00:11:14,710 --> 00:11:17,520
And you can imagine that we might
try and make our load balancing

272
00:11:17,520 --> 00:11:20,780
algorithms more sophisticated or more
complex by trying to figure out, well,

273
00:11:20,780 --> 00:11:22,830
really just how much is
the load on each of these

274
00:11:22,830 --> 00:11:25,120
and figure out what would
really make more sense.

275
00:11:25,120 --> 00:11:27,990
But then what sorts of issues
start to come up there?

276
00:11:27,990 --> 00:11:31,350
What's the trade-off that we face there?

277
00:11:31,350 --> 00:11:31,850
Yeah.

278
00:11:31,850 --> 00:11:35,944
AUDIENCE: Well, now load balancing's
going to become expensive [INAUDIBLE].

279
00:11:35,944 --> 00:11:36,610
BRIAN YU: Great.

280
00:11:36,610 --> 00:11:38,740
Now load balancing starts
to become more expensive.

281
00:11:38,740 --> 00:11:41,698
But if we want the user to be able
to get a fast response from server A

282
00:11:41,698 --> 00:11:44,270
or server B, we've now introduced
this intermediary piece

283
00:11:44,270 --> 00:11:46,190
of hardware, this load
balancer, that's going

284
00:11:46,190 --> 00:11:49,315
to have to spend time calculating and
processing which of these two servers

285
00:11:49,315 --> 00:11:52,119
is actually going to be the
better server to send the user to.

286
00:11:52,119 --> 00:11:53,660
And it's going to take time, latency.

287
00:11:53,660 --> 00:11:56,720
It's going to take some computational
power in order to figure out

288
00:11:56,720 --> 00:11:58,249
where to ultimately send that user.

289
00:11:58,249 --> 00:12:00,290
And so there's definitely
that trade-off as well,

290
00:12:00,290 --> 00:12:03,290
whereas in a random choice,
a round robin type model,

291
00:12:03,290 --> 00:12:05,480
we can save a lot of
that computational energy

292
00:12:05,480 --> 00:12:07,610
by not worrying about
which of these servers

293
00:12:07,610 --> 00:12:09,800
is more busy or less
busy at any given time

294
00:12:09,800 --> 00:12:12,470
and just send the user
to a particular server

295
00:12:12,470 --> 00:12:14,810
without needing to do
those sorts of computation.

296
00:12:14,810 --> 00:12:18,890
And so in practice, there's no one
best solution to these problems.

297
00:12:18,890 --> 00:12:21,790
But it's good to be thinking about
different ways in which your load

298
00:12:21,790 --> 00:12:24,650
balancer might be operating in
order to think about what algorithm

299
00:12:24,650 --> 00:12:28,369
you might want to use depending on the
specific needs of your web application.

300
00:12:28,369 --> 00:12:30,410
But in general, when we
deal with load balancing,

301
00:12:30,410 --> 00:12:34,190
if we think of this idea of user
tries to access your website,

302
00:12:34,190 --> 00:12:37,220
with every request, that request first goes to the load balancer
first goes to the load balancer

303
00:12:37,220 --> 00:12:38,930
before it goes to the web server.

304
00:12:38,930 --> 00:12:41,420
And at the load balancer
stage, the load balancer

305
00:12:41,420 --> 00:12:44,750
makes a decision about
send the user to server A

306
00:12:44,750 --> 00:12:47,420
or send the user to
server B. What problems

307
00:12:47,420 --> 00:12:48,831
might occur with just that model?

308
00:12:48,831 --> 00:12:51,830
Even if you don't worry about which
specific algorithm the load balancer

309
00:12:51,830 --> 00:12:55,790
is using to determine where to send the
user each time, what could go wrong?

310
00:12:55,790 --> 00:13:00,640


311
00:13:00,640 --> 00:13:03,615
AUDIENCE: Some users might be
doing more than others on a server.

312
00:13:03,615 --> 00:13:04,240
BRIAN YU: Sure.

313
00:13:04,240 --> 00:13:07,073
So certainly some users might be
doing more than others on a server.

314
00:13:07,073 --> 00:13:10,940
And in particular, when we think about
what users are doing on a server,

315
00:13:10,940 --> 00:13:16,400
the user is oftentimes not just going
to one page and letting it be at that.

316
00:13:16,400 --> 00:13:18,920
A user might be trying to
access a page more than one time

317
00:13:18,920 --> 00:13:22,380
or going to multiple different pages on
the same web application, for instance.

318
00:13:22,380 --> 00:13:27,404
You might imagine on an e-commerce site
like eBay or Amazon, for instance,

319
00:13:27,404 --> 00:13:29,570
a user might be adding
things to their shopping cart

320
00:13:29,570 --> 00:13:32,060
and looking at other pages and adding
new things to their shopping cart

321
00:13:32,060 --> 00:13:34,518
and interacting with a web page
in multiple different ways,

322
00:13:34,518 --> 00:13:35,870
making multiple requests.

323
00:13:35,870 --> 00:13:39,132
And what could go wrong now is
every time a user makes a request,

324
00:13:39,132 --> 00:13:41,840
the load balancer is making a new
decision about send to server A

325
00:13:41,840 --> 00:13:43,088
or send to server B.

326
00:13:43,088 --> 00:13:44,796
AUDIENCE: Yeah, that
would be really bad.

327
00:13:44,796 --> 00:13:48,787
So the load balancer would have to
have some kind of session awareness,

328
00:13:48,787 --> 00:13:49,287
I guess.

329
00:13:49,287 --> 00:13:49,787
Right?

330
00:13:49,787 --> 00:13:53,854
So it send somebody in one server and
it just keep sending that same person.

331
00:13:53,854 --> 00:13:54,520
BRIAN YU: Right.

332
00:13:54,520 --> 00:13:57,478
So a problem might occur where
without-- with just some basic algorithm

333
00:13:57,478 --> 00:13:59,719
like this where every
request we make a decision,

334
00:13:59,719 --> 00:14:01,510
we don't have any sort
of session awareness

335
00:14:01,510 --> 00:14:05,060
that if a user comes into the web
application and is sent to server A,

336
00:14:05,060 --> 00:14:08,440
and we now store the contents of
their shopping cart on server A.

337
00:14:08,440 --> 00:14:10,030
And the user clicks on another page.

338
00:14:10,030 --> 00:14:12,280
And then the load balancer
this time-- either because

339
00:14:12,280 --> 00:14:13,900
of a random choice, a
round robin, or because

340
00:14:13,900 --> 00:14:15,850
a different server now has the
fewest connections--

341
00:14:15,850 --> 00:14:18,430
decides to send that
user to server B instead.

342
00:14:18,430 --> 00:14:21,130
That new server might not have--
doesn't have the same session

343
00:14:21,130 --> 00:14:22,900
data that this original server did.

344
00:14:22,900 --> 00:14:26,140
And so maybe now the user's shopping
cart's totally empty, for instance.
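
[CODE SKETCH] A small illustration of the problem just described, assuming each server keeps session data only in its own memory and the load balancer uses plain round robin. The WebServer class, the session ID, and the item are hypothetical names chosen for the example.

```python
import itertools


class WebServer:
    """Hypothetical server that stores session data only in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session id -> list of cart items

    def add_to_cart(self, session_id, item):
        self.sessions.setdefault(session_id, []).append(item)

    def view_cart(self, session_id):
        return self.sessions.get(session_id, [])


def demo():
    server_a, server_b = WebServer("A"), WebServer("B")
    round_robin = itertools.cycle([server_a, server_b])
    # First request lands on server A, which stores the cart locally.
    next(round_robin).add_to_cart("session-123", "book")
    # Second request lands on server B, which has never seen this session.
    return next(round_robin).view_cart("session-123")


if __name__ == "__main__":
    print(demo())  # [] -- the cart looks empty from server B
```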

345
00:14:26,140 --> 00:14:29,710
And so by introducing
this attempted benefit

346
00:14:29,710 --> 00:14:33,320
of splitting the server into two parts,
horizontally scaling into a server A

347
00:14:33,320 --> 00:14:35,680
and server B, we now
need to worry about when

348
00:14:35,680 --> 00:14:38,560
the user comes to server A the
first time, what should happen

349
00:14:38,560 --> 00:14:40,060
when they come back the second time.

350
00:14:40,060 --> 00:14:44,080
Maybe we do want the user to
go back to server A again.

351
00:14:44,080 --> 00:14:47,840
So this brings in the
idea of session-aware load

352
00:14:47,840 --> 00:14:49,900
balancing-- this idea
that when we load balance,

353
00:14:49,900 --> 00:14:52,360
it's often going to be
a good idea to make sure

354
00:14:52,360 --> 00:14:55,060
that our load-balancing algorithm
is somehow session-aware,

355
00:14:55,060 --> 00:14:58,150
that it knows that when a
user comes back to the site,

356
00:14:58,150 --> 00:15:00,722
that they should be directed
potentially to the same server.

357
00:15:00,722 --> 00:15:02,680
And that's this first
idea of sticky sessions--

358
00:15:02,680 --> 00:15:05,710
that if a user comes to the web
application the first time

359
00:15:05,710 --> 00:15:10,210
and is directed to server A, then when
the user comes back for a second time,

360
00:15:10,210 --> 00:15:13,060
even if random choice
chose server B or even

361
00:15:13,060 --> 00:15:15,150
if based on looking at
number of connections

362
00:15:15,150 --> 00:15:18,580
server B is less loaded than server
A, and we would normally send the user

363
00:15:18,580 --> 00:15:22,330
to server B, we still want to
send that user back to server A

364
00:15:22,330 --> 00:15:24,720
because that's the server
that they were on previously.

365
00:15:24,720 --> 00:15:26,980
That's where all their
session information is.

366
00:15:26,980 --> 00:15:30,577
And so if we want to make sure that the
contents of the user's shopping cart

367
00:15:30,577 --> 00:15:33,160
is preserved, for instance, then
we'd want to continually send

368
00:15:33,160 --> 00:15:35,710
the user back to server A each time.

369
00:15:35,710 --> 00:15:37,690
So that's the idea of sticky sessions.
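
[CODE SKETCH] One possible way to sketch sticky sessions, assuming every request carries some session identifier (for example, from a cookie). The StickyLoadBalancer class and its route method are hypothetical names. New sessions are assigned by round robin here, but any of the earlier strategies could be used for that first assignment.

```python
import itertools


class StickyLoadBalancer:
    """Assign each new session to a server, then keep sending it to that server."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(list(servers))
        self._assigned = {}  # session id -> server chosen on the first request

    def route(self, session_id):
        if session_id not in self._assigned:
            # First request from this session: choose a server by round robin.
            self._assigned[session_id] = next(self._next_server)
        # Later requests: ignore the current load and reuse the same server.
        return self._assigned[session_id]


if __name__ == "__main__":
    balancer = StickyLoadBalancer(["A", "B"])
    print(balancer.route("alice"))  # A
    print(balancer.route("bob"))    # B
    print(balancer.route("alice"))  # A again, so the cart on server A is still there
```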

370
00:15:37,690 --> 00:15:41,380
How else might we deal with the problem
of session-aware load balancing?

371
00:15:41,380 --> 00:15:43,630
Maybe some of these additional
bullets can give you

372
00:15:43,630 --> 00:15:45,296
ideas as to how we might deal with that.

373
00:15:45,296 --> 00:15:50,610


374
00:15:50,610 --> 00:15:53,760
So another possibility here is to store
sessions, actually, in the database.

375
00:15:53,760 --> 00:15:56,720
So it's possible that if right
now we're just storing the session

376
00:15:56,720 --> 00:15:59,450
information on the server,
then when we split things up

377
00:15:59,450 --> 00:16:01,720
into two different servers,
server A and server B,

378
00:16:01,720 --> 00:16:05,930
then any session information on
server A isn't accessible on server B.

379
00:16:05,930 --> 00:16:08,180
And so one possibility is to
store session information

380
00:16:08,180 --> 00:16:10,670
inside of a database, a
database that potentially

381
00:16:10,670 --> 00:16:15,020
all of the servers, both server A
and server B, both have access to.

382
00:16:15,020 --> 00:16:19,550
And if you do an approach like
that where we store information

383
00:16:19,550 --> 00:16:23,660
about our sessions inside of a database,
rather than just storing them inside

384
00:16:23,660 --> 00:16:26,030
of server A or server
B, then the benefit

385
00:16:26,030 --> 00:16:28,580
there is that no matter which
server the user is sent to,

386
00:16:28,580 --> 00:16:30,140
as long as we have a
way of taking that user

387
00:16:30,140 --> 00:16:32,570
and identifying which session
information in the database

388
00:16:32,570 --> 00:16:35,660
actually belongs to them, then we
can extract that session information

389
00:16:35,660 --> 00:16:39,215
out of the database regardless of
which server the user went to.
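
A rough sketch of sessions kept in a shared database, assuming an in-memory SQLite database stands in for a separate database server. The AppServer class, the sessions table, and the session ID "abc123" are hypothetical names chosen for illustration.

```python
import json
import sqlite3

# One shared database that both servers can reach (an assumption for this sketch;
# in practice it would be a separate database server on the network).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (session_id TEXT PRIMARY KEY, data TEXT)")


class AppServer:
    """Toy web server that keeps no session state of its own."""

    def __init__(self, name, db):
        self.name = name
        self.db = db

    def save_session(self, session_id, data):
        # Store the session as JSON, replacing any earlier copy.
        self.db.execute(
            "INSERT OR REPLACE INTO sessions (session_id, data) VALUES (?, ?)",
            (session_id, json.dumps(data)),
        )
        self.db.commit()

    def load_session(self, session_id):
        # Look the session up by ID; an unknown ID gives an empty session.
        row = self.db.execute(
            "SELECT data FROM sessions WHERE session_id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else {}


server_a = AppServer("A", db)
server_b = AppServer("B", db)
server_a.save_session("abc123", {"cart": ["book"]})
cart_seen_by_b = server_b.load_session("abc123")["cart"]
```

Server B can read the cart that server A wrote, at the cost of one extra database round trip per request.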

390
00:16:39,215 --> 00:16:41,090
So what would be a
drawback of that approach?

391
00:16:41,090 --> 00:16:44,947
Why might we not want to store
session information in the database?

392
00:16:44,947 --> 00:16:49,817


393
00:16:49,817 --> 00:16:51,877
AUDIENCE: Then you have to
scale your database too.

394
00:16:51,877 --> 00:16:52,710
BRIAN YU: Certainly.

395
00:16:52,710 --> 00:16:54,840
Then we start to get into
issues of database scalability,

396
00:16:54,840 --> 00:16:56,970
and we'll talk about
database availability too.

397
00:16:56,970 --> 00:16:58,080
And there's also other--

398
00:16:58,080 --> 00:17:02,114
any time we're introducing additional
hardware, additional servers that

399
00:17:02,114 --> 00:17:04,530
are in play when we're dealing
with issues of scalability,

400
00:17:04,530 --> 00:17:07,857
then we start to incur time costs, that
if originally the session was stored

401
00:17:07,857 --> 00:17:09,690
on the server and now
it's stored elsewhere,

402
00:17:09,690 --> 00:17:11,869
now there's still this communication
time that needs to happen,

403
00:17:11,869 --> 00:17:14,010
this additional latency
that gets added any time

404
00:17:14,010 --> 00:17:15,849
we're trying to access information.

405
00:17:15,849 --> 00:17:18,839
And, finally, you might imagine
that we could store the session not

406
00:17:18,839 --> 00:17:22,260
in our web server at all and rather
use client-side sessions, storing

407
00:17:22,260 --> 00:17:25,230
information, any information
related to the session, actually

408
00:17:25,230 --> 00:17:26,339
inside the client.

409
00:17:26,339 --> 00:17:30,330
And oftentimes this is done through
cookies where web servers can just

410
00:17:30,330 --> 00:17:34,770
take cookies and send them to the
user where inside of the cookie

411
00:17:34,770 --> 00:17:37,260
stores all of the information
related to the session

412
00:17:37,260 --> 00:17:40,590
so that you on your computer
are actually storing inside

413
00:17:40,590 --> 00:17:43,767
of your computer all of the information
about what's in your shopping cart.

414
00:17:43,767 --> 00:17:46,350
And that cookie is sent along
with every web request you make.

415
00:17:46,350 --> 00:17:48,141
So if you make another
web request, it doesn't

416
00:17:48,141 --> 00:17:50,010
matter which server you're sent to.

417
00:17:50,010 --> 00:17:51,990
Your web request inside
of the cookie contains

418
00:17:51,990 --> 00:17:54,060
all of the information
that is associated

419
00:17:54,060 --> 00:17:55,800
with that particular session.

420
00:17:55,800 --> 00:18:00,110
And what might be a drawback there?

421
00:18:00,110 --> 00:18:02,550
AUDIENCE: Can that be some
sort of attack on the server

422
00:18:02,550 --> 00:18:07,029
where multiple people start using the
same cookie to overload the server?

423
00:18:07,029 --> 00:18:08,070
BRIAN YU: Great question.

424
00:18:08,070 --> 00:18:10,920
So there's potentially adversarial
ways that this could be used,

425
00:18:10,920 --> 00:18:13,120
that if someone else is
sending the same cookie,

426
00:18:13,120 --> 00:18:15,717
then the server might still
just accept it and assume

427
00:18:15,717 --> 00:18:16,800
that it's the same person.

428
00:18:16,800 --> 00:18:18,540
So it might come in from
different directions.

429
00:18:18,540 --> 00:18:20,500
And, certainly, trying
to overload a server

430
00:18:20,500 --> 00:18:23,010
is something we'll talk about when
we get to the next topic, which

431
00:18:23,010 --> 00:18:26,260
is all about security and how to think
about security in our web applications.

432
00:18:26,260 --> 00:18:28,269
As we begin to scale
them larger and larger,

433
00:18:28,269 --> 00:18:30,810
these security issues start to
become more and more pressing.

434
00:18:30,810 --> 00:18:33,190
So those are definitely
issues to be aware of as well.

435
00:18:33,190 --> 00:18:36,030
So lots of different
ways, ultimately, to deal

436
00:18:36,030 --> 00:18:39,660
with these problems of making sure that
our load balancer is session-aware,

437
00:18:39,660 --> 00:18:42,600
making sure that when the user comes
back that they're consistently

438
00:18:42,600 --> 00:18:44,850
directed either to one
place or another or at least

439
00:18:44,850 --> 00:18:47,760
have some mechanism in
place for making sure

440
00:18:47,760 --> 00:18:50,807
that any session information-- the
contents of the shopping cart or any

441
00:18:50,807 --> 00:18:52,890
notes that they've written
in an application-- get

442
00:18:52,890 --> 00:18:56,670
saved when they go to another
page, when they make another HTTP

443
00:18:56,670 --> 00:18:59,490
request to the web server.

444
00:18:59,490 --> 00:19:02,039
Questions on anything so
far about load balancing

445
00:19:02,039 --> 00:19:03,330
or how we might do any of this?

446
00:19:03,330 --> 00:19:06,220


447
00:19:06,220 --> 00:19:06,720
OK.

448
00:19:06,720 --> 00:19:10,800
So what drawbacks might come about with
regards to just horizontal scaling?

449
00:19:10,800 --> 00:19:13,590
That we say, all right, we
expect that our web application

450
00:19:13,590 --> 00:19:15,750
will need five web servers,
for instance, in order

451
00:19:15,750 --> 00:19:17,650
to deal with traffic on a typical day.

452
00:19:17,650 --> 00:19:21,240
And so now we load balance using
some of these session-aware tools,

453
00:19:21,240 --> 00:19:24,840
of deciding between any of
these five potential servers

454
00:19:24,840 --> 00:19:26,091
that we need to send users to.

455
00:19:26,091 --> 00:19:27,090
And how might that work?

456
00:19:27,090 --> 00:19:28,980
What problems could
come up with that model?

457
00:19:28,980 --> 00:19:42,050


458
00:19:42,050 --> 00:19:46,180
So one thing that I'll talk about
briefly that sort of gets at this idea

459
00:19:46,180 --> 00:19:49,177
is the idea that when we define a
finite number of servers-- and, say,

460
00:19:49,177 --> 00:19:52,510
there are going to be five servers here,
and when a request comes in, it's going

461
00:19:52,510 --> 00:19:54,340
to go to one of those five servers--

462
00:19:54,340 --> 00:19:57,307
well, you never really know
what might happen the next day.

463
00:19:57,307 --> 00:19:59,140
The five servers might
be the typical amount

464
00:19:59,140 --> 00:20:01,420
that you would need in order to
deal with all of the users that

465
00:20:01,420 --> 00:20:02,830
might come in on a given day.

466
00:20:02,830 --> 00:20:05,950
But you might imagine that
some web applications probably

467
00:20:05,950 --> 00:20:08,927
get more traffic at some times of
the day than other times of the day

468
00:20:08,927 --> 00:20:12,010
or even some periods of the year as
compared to other periods of the year.

469
00:20:12,010 --> 00:20:15,130
You, for instance, might imagine
that a shopping website like Amazon

470
00:20:15,130 --> 00:20:17,200
or other online shopping
websites perhaps

471
00:20:17,200 --> 00:20:19,591
get more traffic when it
comes to the holiday season,

472
00:20:19,591 --> 00:20:22,090
for instance, than when it comes
to other times of the year.

473
00:20:22,090 --> 00:20:26,000
Or you might imagine that a
newspaper website, for instance,

474
00:20:26,000 --> 00:20:29,950
after a big presidential
election or breaking news event,

475
00:20:29,950 --> 00:20:33,040
maybe a lot more people are accessing
that newspaper website as opposed

476
00:20:33,040 --> 00:20:36,206
to during other times of the year when
fewer people are looking at the news.

477
00:20:36,206 --> 00:20:40,546
And so the amount of traffic
that comes into a web application

478
00:20:40,546 --> 00:20:43,670
could vary depending on the time of
day, depending on the time of the year,

479
00:20:43,670 --> 00:20:46,640
depending on random events
that happen from time to time.

480
00:20:46,640 --> 00:20:48,950
And so how might a web
application deal with that?

481
00:20:48,950 --> 00:20:51,070
Of course, just a
finite number of servers

482
00:20:51,070 --> 00:20:54,295
might not be the best solution because
potentially if you underestimate

483
00:20:54,295 --> 00:20:56,170
the amount of maximum
traffic you might need,

484
00:20:56,170 --> 00:20:59,380
then you might get more users than
your servers are able to handle.

485
00:20:59,380 --> 00:21:02,365
And on the flip side, if you just
err on the side of too many servers

486
00:21:02,365 --> 00:21:04,990
and just have a lot of servers
expecting that in the worst case

487
00:21:04,990 --> 00:21:07,840
you might use all of them, then
there is some waste of resources

488
00:21:07,840 --> 00:21:10,480
here, that you're paying, likely, for
all of these different servers that

489
00:21:10,480 --> 00:21:13,270
are running when in reality you
probably don't need that many.

490
00:21:13,270 --> 00:21:17,740
And so autoscaling is a tool that
many cloud computing services now

491
00:21:17,740 --> 00:21:21,460
offer in order to make it such that the
number of servers that you're actually

492
00:21:21,460 --> 00:21:26,200
using can scale depending upon traffic,
that if more and more traffic comes in,

493
00:21:26,200 --> 00:21:29,320
we can scale up the horizontal
scaling of your web application

494
00:21:29,320 --> 00:21:33,430
in order to allow for
more different, more web

495
00:21:33,430 --> 00:21:35,110
servers to be added in those times.

496
00:21:35,110 --> 00:21:37,300
So we might start with
only two web servers,

497
00:21:37,300 --> 00:21:41,042
but if another web server were to come
along, we can add that web server.

498
00:21:41,042 --> 00:21:43,750
And the load balancer knows that
now there are three web servers.

499
00:21:43,750 --> 00:21:45,640
And if traffic increases
even more, we can

500
00:21:45,640 --> 00:21:48,010
continue to scale our web application.

501
00:21:48,010 --> 00:21:51,250
And most cloud computing services,
like Amazon Web Services,

502
00:21:51,250 --> 00:21:54,880
that offer these load balancing
services and autoscaling services,

503
00:21:54,880 --> 00:21:58,010
can allow you to specify here's the
minimum number of servers that I want

504
00:21:58,010 --> 00:21:59,380
and here's the maximum
number of servers

505
00:21:59,380 --> 00:22:02,560
that I want and allow the load balancer
to then just make those decisions

506
00:22:02,560 --> 00:22:04,690
about do we need to add
another server or not.

507
00:22:04,690 --> 00:22:09,120
And you can add criteria
for once we reach

508
00:22:09,120 --> 00:22:10,870
a certain threshold
of the number of users

509
00:22:10,870 --> 00:22:13,411
that are trying to access the
site, then might be a good time

510
00:22:13,411 --> 00:22:14,390
to increase the scale.

511
00:22:14,390 --> 00:22:19,270
And if after a period of time of high
usage and utility of your website

512
00:22:19,270 --> 00:22:22,180
traffic begins to die down and you
don't need four servers anymore,

513
00:22:22,180 --> 00:22:23,200
it can scale back down.

514
00:22:23,200 --> 00:22:25,450
It can add new servers
when they're needed in order

515
00:22:25,450 --> 00:22:28,780
to adjust based on the demand,
based on the number of users that

516
00:22:28,780 --> 00:22:30,760
are trying to use your web application.

517
00:22:30,760 --> 00:22:33,550
Your web application can
make those decisions about

518
00:22:33,550 --> 00:22:36,220
whether or not we need to
increase the number of servers

519
00:22:36,220 --> 00:22:38,530
or decrease the number of servers.
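
One way to picture the autoscaling decision, as a sketch under stated assumptions: the function desired_servers and the users_per_server capacity figure are hypothetical, and real cloud providers scale on their own metrics and policies rather than on this formula.

```python
import math


def desired_servers(active_users, users_per_server, min_servers, max_servers):
    """Return how many servers to run, clamped between the minimum and maximum."""
    # Round up so that a partially full server still counts as a whole server.
    needed = math.ceil(active_users / users_per_server)
    return max(min_servers, min(max_servers, needed))


quiet_day = desired_servers(50, 100, min_servers=2, max_servers=10)
busy_day = desired_servers(350, 100, min_servers=2, max_servers=10)
```

The minimum keeps some capacity running when traffic is low, and the maximum caps the bill when traffic spikes.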

520
00:22:38,530 --> 00:22:42,450
Questions about any of that so far?

521
00:22:42,450 --> 00:22:43,730
Yes.

522
00:22:43,730 --> 00:22:47,570
AUDIENCE: If you use AWS, do they
take care of the load balancer?

523
00:22:47,570 --> 00:22:48,440
Do they provide it?

524
00:22:48,440 --> 00:22:51,579
Or is that something
that you [INAUDIBLE]?

525
00:22:51,579 --> 00:22:52,620
BRIAN YU: Great question.

526
00:22:52,620 --> 00:22:55,070
So the question is about
how AWS actually does this.

527
00:22:55,070 --> 00:22:58,070
So AWS offers a number
of different services.

528
00:22:58,070 --> 00:23:01,070
Amazon Web Services is just one of
the more popular cloud computing

529
00:23:01,070 --> 00:23:04,870
services used in order to run
servers like these on the internet.

530
00:23:04,870 --> 00:23:07,800
And we'll talk a little bit about
that in just a moment, actually.

531
00:23:07,800 --> 00:23:11,690
But one of those services is
a service that effectively

532
00:23:11,690 --> 00:23:16,137
will allow you to define this
autoscaling group for yourself

533
00:23:16,137 --> 00:23:19,220
in order to say, here are the minimum
and maximum numbers of servers.

534
00:23:19,220 --> 00:23:22,520
And Amazon takes care of the process
of having a load balancer decide

535
00:23:22,520 --> 00:23:26,070
where to send different users and when
to add new servers and when not to.

536
00:23:26,070 --> 00:23:28,880
And other cloud computing
providers like Microsoft Azure,

537
00:23:28,880 --> 00:23:31,520
for instance, they all have very
similar tools and technologies

538
00:23:31,520 --> 00:23:33,436
that allow you to implement
this sort of thing

539
00:23:33,436 --> 00:23:35,820
without you needing to
really worry about that.

540
00:23:35,820 --> 00:23:38,900
And that's all part of this new big
movement towards cloud computing,

541
00:23:38,900 --> 00:23:41,142
that in the past, when
writing a web application

542
00:23:41,142 --> 00:23:43,850
and deploying it and running a
web application for your business,

543
00:23:43,850 --> 00:23:46,599
for instance, you might have needed
to own the servers yourselves,

544
00:23:46,599 --> 00:23:50,150
physically have the servers
inside of your company.

545
00:23:50,150 --> 00:23:52,610
Nowadays, with cloud computing,
this is effectively just

546
00:23:52,610 --> 00:23:55,904
a means to allow you to rent
computing power stored in the cloud,

547
00:23:55,904 --> 00:23:58,070
stored in someone else's
servers, whether it belongs

548
00:23:58,070 --> 00:24:00,890
to Microsoft or Amazon or
someone else, and therefore

549
00:24:00,890 --> 00:24:02,420
allow you to use those resources.

550
00:24:02,420 --> 00:24:06,730
And so what might be a benefit
of this idea of cloud computing,

551
00:24:06,730 --> 00:24:10,550
of using resources from elsewhere
instead of needing to use servers that

552
00:24:10,550 --> 00:24:14,765
are local to wherever you're working?

553
00:24:14,765 --> 00:24:16,965
AUDIENCE: If you're a
small shop, then you

554
00:24:16,965 --> 00:24:20,764
don't need to worry about maintaining
servers, having IT people.

555
00:24:20,764 --> 00:24:21,430
BRIAN YU: Great.

556
00:24:21,430 --> 00:24:25,420
So from a practical perspective,
normally, you'd need IT people.

557
00:24:25,420 --> 00:24:28,570
You'd have to maintain your own
servers, whereas with the cloud system,

558
00:24:28,570 --> 00:24:31,150
it's typically just a rental
based on the number of hours

559
00:24:31,150 --> 00:24:32,380
of usage of the server.

560
00:24:32,380 --> 00:24:34,720
And Amazon or Microsoft
or whoever takes care

561
00:24:34,720 --> 00:24:36,610
of making sure that the
servers are running,

562
00:24:36,610 --> 00:24:39,776
of maintaining the servers, of dealing
with any problems that might come up.

563
00:24:39,776 --> 00:24:41,770
And so, certainly, there
are practical benefits

564
00:24:41,770 --> 00:24:43,720
that make it logistically
more feasible now

565
00:24:43,720 --> 00:24:46,870
to use cloud computing as the
means of running a web application

566
00:24:46,870 --> 00:24:50,360
rather than having to
physically own your own server.

567
00:24:50,360 --> 00:24:53,080
So now we have this system
in place where we're

568
00:24:53,080 --> 00:24:54,610
trying to scale our web application.

569
00:24:54,610 --> 00:24:57,443
We've talked about vertical scaling
where we just add more computing

570
00:24:57,443 --> 00:24:58,840
power to one particular server.

571
00:24:58,840 --> 00:25:02,080
And then we spent some time talking
about horizontal scaling and the issues

572
00:25:02,080 --> 00:25:06,656
that come into play when suddenly,
instead of just having a single server,

573
00:25:06,656 --> 00:25:09,280
we've had to split things up
across multiple different servers.

574
00:25:09,280 --> 00:25:12,400
And that added challenges of
what do we now do about sessions.

575
00:25:12,400 --> 00:25:15,730
It added challenges of now we need this
additional piece of hardware, this load

576
00:25:15,730 --> 00:25:19,300
balancer here, which is making
decisions on a frequent basis of where

577
00:25:19,300 --> 00:25:22,180
to send user one, two,
three, or four, and then

578
00:25:22,180 --> 00:25:25,510
dealing with how to scale those
servers, as we have to potentially

579
00:25:25,510 --> 00:25:28,990
increase or decrease the number of
servers that we have depending on load.

580
00:25:28,990 --> 00:25:34,630
What happens now if we have four servers
and one of the servers goes offline?

581
00:25:34,630 --> 00:25:36,310
It just stops working.

582
00:25:36,310 --> 00:25:37,557
What could go wrong now?

583
00:25:37,557 --> 00:25:38,890
What might the load balancer do?

584
00:25:38,890 --> 00:25:41,750


585
00:25:41,750 --> 00:25:42,250
Yeah.

586
00:25:42,250 --> 00:25:44,359
AUDIENCE: Getting your
server's replacement.

587
00:25:44,359 --> 00:25:46,150
BRIAN YU: So, certainly,
the end goal would

588
00:25:46,150 --> 00:25:48,882
be to get this fixed, to try
to repair the server, reboot,

589
00:25:48,882 --> 00:25:51,840
restart it, get it back online, or
replace the server if really there's

590
00:25:51,840 --> 00:25:53,750
something physically
wrong with the server.

591
00:25:53,750 --> 00:25:57,784
But in the meantime,
what could go wrong?

592
00:25:57,784 --> 00:26:01,672
AUDIENCE: If a user had an
active session with that server,

593
00:26:01,672 --> 00:26:04,184
then they might lose data
or something like that.

594
00:26:04,184 --> 00:26:04,850
BRIAN YU: Great.

595
00:26:04,850 --> 00:26:06,620
So one potential
problem is that if there

596
00:26:06,620 --> 00:26:09,290
were session data that was
stored only on this one server,

597
00:26:09,290 --> 00:26:12,590
now if a user comes back,
that session data is

598
00:26:12,590 --> 00:26:14,100
no longer accessible potentially.

599
00:26:14,100 --> 00:26:15,950
And so we talked about
possible solutions,

600
00:26:15,950 --> 00:26:18,200
and we might deal with that
either by storing the session

601
00:26:18,200 --> 00:26:21,116
inside of the client, so it doesn't
matter what server they end up at,

602
00:26:21,116 --> 00:26:23,570
or storing the session
data inside of a database

603
00:26:23,570 --> 00:26:25,460
somewhere such that it
doesn't matter, again,

604
00:26:25,460 --> 00:26:26,835
which server the user is sent to.

605
00:26:26,835 --> 00:26:29,460
They can still retain that information.

606
00:26:29,460 --> 00:26:31,730
But if we have this idea
of sticky sessions in

607
00:26:31,730 --> 00:26:36,050
place where a user goes to the load
balancer, and if they were in session--

608
00:26:36,050 --> 00:26:38,840
if they were in server A before,
they get sent back to server A.

609
00:26:38,840 --> 00:26:40,965
If they were in server B
before, they get sent back

610
00:26:40,965 --> 00:26:44,320
to server B. What could happen
is that a user comes along,

611
00:26:44,320 --> 00:26:46,760
hits the load balancer,
and the load balancer says

612
00:26:46,760 --> 00:26:49,490
which server were you at last time,
and the user was at server B.

613
00:26:49,490 --> 00:26:53,300
And they try to get sent back to this
server, but the server is now offline.

614
00:26:53,300 --> 00:26:56,610
So somehow we need a way
for our load balancer

615
00:26:56,610 --> 00:26:59,510
to know whether or not these
servers are operational or not,

616
00:26:59,510 --> 00:27:03,110
whether or not it makes sense to
send the user to one of the servers.

617
00:27:03,110 --> 00:27:05,250
So how might we do that?

618
00:27:05,250 --> 00:27:08,120
How might we solve this
problem of we need our load

619
00:27:08,120 --> 00:27:10,180
balancer to know which
of the servers are

620
00:27:10,180 --> 00:27:11,900
online so it knows where to send users?

621
00:27:11,900 --> 00:27:13,650
And we definitely don't
want to be sending

622
00:27:13,650 --> 00:27:18,645
the user to a server that's no longer
running or no longer operational.

623
00:27:18,645 --> 00:27:19,144
Yeah.

624
00:27:19,144 --> 00:27:21,930
AUDIENCE: Just ping the server
to see if you get a response.

625
00:27:21,930 --> 00:27:23,310
BRIAN YU: Ping the server,
see if you get a response.

626
00:27:23,310 --> 00:27:24,120
Certainly.

627
00:27:24,120 --> 00:27:26,850
And so one variant on
this idea that's often

628
00:27:26,850 --> 00:27:29,130
used when it comes to
dealing with these servers

629
00:27:29,130 --> 00:27:33,090
is this idea of having each server
give off a heartbeat, just a signal

630
00:27:33,090 --> 00:27:36,220
that they produce every so
often, where the signals are

631
00:27:36,220 --> 00:27:37,470
received by the load balancer.

632
00:27:37,470 --> 00:27:39,330
And the load balancer knows if
it's hearing those heartbeats,

633
00:27:39,330 --> 00:27:40,950
then the servers are operational.

634
00:27:40,950 --> 00:27:43,770
And if too long goes by without
hearing one of those heartbeats

635
00:27:43,770 --> 00:27:46,320
from this server, for instance,
then the load balancer

636
00:27:46,320 --> 00:27:49,770
can reasonably guess that maybe that
server is no longer operational.

637
00:27:49,770 --> 00:27:53,310
Maybe we should
no longer be sending users

638
00:27:53,310 --> 00:27:55,290
to those servers in particular.
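
The heartbeat idea can be sketched roughly as follows, assuming the load balancer simply records the time of each server's last heartbeat. HeartbeatMonitor, its method names, and the five-second timeout are hypothetical choices for illustration.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat from each server and reports which look alive."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # server name -> time of most recent heartbeat

    def record_heartbeat(self, server, now=None):
        # 'now' can be passed explicitly, which makes the class easy to test.
        self.last_seen[server] = time.monotonic() if now is None else now

    def healthy_servers(self, now=None):
        # A server counts as healthy if its last heartbeat is recent enough.
        now = time.monotonic() if now is None else now
        return sorted(
            server
            for server, seen in self.last_seen.items()
            if now - seen <= self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=5)
monitor.record_heartbeat("A", now=0)
monitor.record_heartbeat("B", now=0)
monitor.record_heartbeat("A", now=4)         # B has gone quiet
still_up = monitor.healthy_servers(now=8)    # only servers heard from recently
```

A shorter timeout notices failures sooner but risks marking a slow server as offline.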

639
00:27:55,290 --> 00:27:58,500
And that brings into account its
own design decisions of how frequently

640
00:27:58,500 --> 00:28:00,009
do you want those heartbeats to be.

641
00:28:00,009 --> 00:28:01,800
Certainly, if they're
more frequent, you're

642
00:28:01,800 --> 00:28:05,560
getting a better sense of whether
the servers are running.

643
00:28:05,560 --> 00:28:09,510
And you know more instantaneously when
a server potentially goes offline.

644
00:28:09,510 --> 00:28:12,420
And if those heartbeats
are less frequent, then

645
00:28:12,420 --> 00:28:14,670
maybe you're saving on
energy because you no longer

646
00:28:14,670 --> 00:28:17,010
need to continuously
compute whether or not

647
00:28:17,010 --> 00:28:20,520
you're receiving all of these heartbeats
coming from all the different servers.

648
00:28:20,520 --> 00:28:23,980
And so, again, with all of the
decisions that we make in scalability,

649
00:28:23,980 --> 00:28:25,950
there's not necessarily
one correct decision

650
00:28:25,950 --> 00:28:28,300
that this is the right
way to do a load balancer.

651
00:28:28,300 --> 00:28:30,660
But there are trade-offs
with each of the decisions

652
00:28:30,660 --> 00:28:32,610
that we make with regards
to how many servers

653
00:28:32,610 --> 00:28:36,420
we have, with regards to the algorithm
that we choose for our load balancer,

654
00:28:36,420 --> 00:28:38,820
with regards to how we choose
to decide whether or not

655
00:28:38,820 --> 00:28:40,860
these servers are offline or not.

656
00:28:40,860 --> 00:28:44,730
And when a server is offline, we need to
put some thought into how do we, then,

657
00:28:44,730 --> 00:28:46,500
from the perspective
of the load balancer,

658
00:28:46,500 --> 00:28:48,708
decide that we're no longer
going to be sending users

659
00:28:48,708 --> 00:28:50,910
to that server that's now offline.

660
00:28:50,910 --> 00:28:53,070
And so all those
concerns start to come up

661
00:28:53,070 --> 00:28:58,612
when we start to deal with this idea
of trying to scale our web application.

662
00:28:58,612 --> 00:29:00,070
Questions about any of that so far?

663
00:29:00,070 --> 00:29:02,930


664
00:29:02,930 --> 00:29:03,781
OK.

665
00:29:03,781 --> 00:29:05,530
We'll go ahead and
take a break right now.

666
00:29:05,530 --> 00:29:07,470
And we'll come back later and talk
about some other concerns that

667
00:29:07,470 --> 00:29:10,740
come about with regards to scalability,
including talking about databases

668
00:29:10,740 --> 00:29:14,002
and what happens here with this image
so far that we only have servers now.

669
00:29:14,002 --> 00:29:16,710
But what happens if we start to
integrate databases into the mix?

670
00:29:16,710 --> 00:29:19,110
And how do we deal with
scalability there as well?

671
00:29:19,110 --> 00:29:23,710


672
00:29:23,710 --> 00:29:26,070
So before the break, we
were talking about how

673
00:29:26,070 --> 00:29:29,190
we would go about scaling applications,
either via vertical scaling

674
00:29:29,190 --> 00:29:30,497
or horizontal scaling.

675
00:29:30,497 --> 00:29:32,580
And when we were talking
about horizontal scaling,

676
00:29:32,580 --> 00:29:35,760
we talked about this idea
of splitting up and rather

677
00:29:35,760 --> 00:29:38,400
than just having one server,
having multiple different servers

678
00:29:38,400 --> 00:29:40,800
with a load balancer
that can then decide

679
00:29:40,800 --> 00:29:44,820
whether to send the user to server A or
whether to send the user to server B.

680
00:29:44,820 --> 00:29:47,700
What we didn't quite talk about
was how the load balancer is

681
00:29:47,700 --> 00:29:51,240
able to implement this idea of
sticky sessions, the idea of when

682
00:29:51,240 --> 00:29:53,700
the user comes along, if they
were at server A last time,

683
00:29:53,700 --> 00:29:56,670
we want to send them back to server A.
And if they were at server B last time,

684
00:29:56,670 --> 00:29:58,200
we want to send them to server B.

685
00:29:58,200 --> 00:30:02,950
So what are some ways by which
we can actually make that happen?

686
00:30:02,950 --> 00:30:07,080
How can the load balancer consistently
send the same user to the same server

687
00:30:07,080 --> 00:30:12,040
every time in order to make sure that if
the user's shopping cart's on server A,

688
00:30:12,040 --> 00:30:14,400
that we don't inadvertently
send the user to server B

689
00:30:14,400 --> 00:30:18,037
and they lose all the content of their
shopping cart data, for instance?

690
00:30:18,037 --> 00:30:19,620
What might be some ways of doing that?

691
00:30:19,620 --> 00:30:24,003


692
00:30:24,003 --> 00:30:26,925
AUDIENCE: Could it have
its own session tracking?

693
00:30:26,925 --> 00:30:29,465
It could send the person
a cookie or something or--

694
00:30:29,465 --> 00:30:30,090
BRIAN YU: Good.

695
00:30:30,090 --> 00:30:31,740
It could send the person
a cookie, for instance.

696
00:30:31,740 --> 00:30:32,240
Great.

697
00:30:32,240 --> 00:30:35,340
So inside of the cookie, maybe-- the
cookie that the load balancer can

698
00:30:35,340 --> 00:30:39,814
set when the response goes back
to the user is one that determines

699
00:30:39,814 --> 00:30:41,730
or that has some information
inside of it that

700
00:30:41,730 --> 00:30:45,045
says users should go to server A, for
instance, or user should go to server

701
00:30:45,045 --> 00:30:49,810
B. And so these cookies are often a very
useful way, whether it's for the server

702
00:30:49,810 --> 00:30:52,895
or for the load balancer, of
giving information to the client

703
00:30:52,895 --> 00:30:55,020
that when the client tries
to make another request,

704
00:30:55,020 --> 00:30:56,460
that information is still there.
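
A small sketch of the cookie approach just mentioned, assuming cookies are modeled as plain dictionaries. The cookie name lb_server and the route function are hypothetical, and a real load balancer would also protect the cookie against tampering.

```python
import random

SERVERS = ["A", "B"]
COOKIE_NAME = "lb_server"  # hypothetical cookie name, for illustration only


def route(request_cookies):
    """Return (server, cookies_to_set) for one incoming request."""
    server = request_cookies.get(COOKIE_NAME)
    if server in SERVERS:
        # Returning client: the cookie says where they were last time.
        return server, {}
    # New client (or unknown server name): choose one and remember it in a cookie.
    server = random.choice(SERVERS)
    return server, {COOKIE_NAME: server}


first_server, cookies_to_set = route({})          # first visit, no cookie yet
second_server, nothing_new = route(cookies_to_set)  # browser sends the cookie back
```

Because the browser sends the cookie back with every request, the load balancer does not have to keep its own table of which client belongs to which server.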

705
00:30:56,460 --> 00:30:59,940
And we talked about before
the idea that one possible way

706
00:30:59,940 --> 00:31:03,570
of implementing this idea of making
sure that the session stays consistent

707
00:31:03,570 --> 00:31:06,840
regardless of what happens
with the horizontal scaling

708
00:31:06,840 --> 00:31:10,350
is to actually store session
information inside of the cookie.

709
00:31:10,350 --> 00:31:13,150
And this is something that
Flask actually does by default.

710
00:31:13,150 --> 00:31:15,000
So we've been using
Flask for a while now

711
00:31:15,000 --> 00:31:18,690
in order to write web applications
that have sessions that

712
00:31:18,690 --> 00:31:20,460
are storing information about the user.

713
00:31:20,460 --> 00:31:22,350
And by default, Flask
will use what's called

714
00:31:22,350 --> 00:31:27,467
a signed cookie, this idea that when
the user has their session information,

715
00:31:27,467 --> 00:31:30,300
we're just going to put that session
information inside of a cookie.

716
00:31:30,300 --> 00:31:33,600
But what might be the problem of just
taking all the session information,

717
00:31:33,600 --> 00:31:36,030
putting it inside of a
cookie, and then just using

718
00:31:36,030 --> 00:31:39,150
that as the way via which
users are interacting

719
00:31:39,150 --> 00:31:42,330
with sessions on your web application?

720
00:31:42,330 --> 00:31:46,200
Your session, for instance, might just
be a Python dictionary, you imagine,

721
00:31:46,200 --> 00:31:48,600
that contains the
user's user ID and maybe

722
00:31:48,600 --> 00:31:51,540
the information of what's currently
inside that user's shopping cart.

723
00:31:51,540 --> 00:31:53,498
And if that's just inside
of a cookie that gets

724
00:31:53,498 --> 00:31:56,927
sent back and forth between
the server and the client,

725
00:31:56,927 --> 00:31:58,010
what could go wrong there?

726
00:31:58,010 --> 00:32:05,770


727
00:32:05,770 --> 00:32:08,632
AUDIENCE: Are there limits on how
much you can fit in the cookie?

728
00:32:08,632 --> 00:32:10,590
BRIAN YU: So, certainly,
the size of the cookie

729
00:32:10,590 --> 00:32:13,712
is something to bear in mind,
that a cookie could potentially--

730
00:32:13,712 --> 00:32:15,420
as the cookies get
larger and larger, now

731
00:32:15,420 --> 00:32:18,086
you have to start to worry about
cookie, the size of the cookie,

732
00:32:18,086 --> 00:32:21,330
and the amount of latency it'll take to
send that consistently back and forth

733
00:32:21,330 --> 00:32:22,860
between the client and the server.

734
00:32:22,860 --> 00:32:24,870
Are there any security
concerns we can think

735
00:32:24,870 --> 00:32:28,970
of that could come up if all
we're doing is just sending--

736
00:32:28,970 --> 00:32:32,970
if the server sends back the session
information that contains a user ID

737
00:32:32,970 --> 00:32:35,580
and what's inside the
cart, and we just expect

738
00:32:35,580 --> 00:32:37,772
that when the user
sends back that cookie,

739
00:32:37,772 --> 00:32:39,855
that will be the information
that the server knows

740
00:32:39,855 --> 00:32:43,254
is what's contained
inside the user's session.

741
00:32:43,254 --> 00:32:45,462
AUDIENCE: I suppose somebody
could steal your cookie,

742
00:32:45,462 --> 00:32:48,885
and then they would have access
to whatever you have access to

743
00:32:48,885 --> 00:32:50,495
[INAUDIBLE].

744
00:32:50,495 --> 00:32:51,120
BRIAN YU: Sure.

745
00:32:51,120 --> 00:32:52,530
Certainly, someone
could steal the cookie.

746
00:32:52,530 --> 00:32:55,270
And if they were able to steal the
cookie and gain access to that cookie,

747
00:32:55,270 --> 00:32:57,395
then they would have access
to your entire account.

748
00:32:57,395 --> 00:33:00,660
They could log in as you, and they
could see the contents of whatever

749
00:33:00,660 --> 00:33:02,280
was in your cart at that time.

750
00:33:02,280 --> 00:33:05,160
What about even if you didn't have
access to someone else's cookie?

751
00:33:05,160 --> 00:33:09,420
Can you imagine a world where in this
very simple-- not very secure-- example

752
00:33:09,420 --> 00:33:12,927
where we're just sending the
cookie back and forth, where

753
00:33:12,927 --> 00:33:14,760
things could go wrong,
where you could still

754
00:33:14,760 --> 00:33:16,301
get access to someone else's account?

755
00:33:16,301 --> 00:33:20,570


756
00:33:20,570 --> 00:33:23,380
So if we're just relying on the
contents of the cookie for-- yeah.

757
00:33:23,380 --> 00:33:24,180
Go ahead.

758
00:33:24,180 --> 00:33:27,540
AUDIENCE: Take that cookie and send
it yourself so you can pretend to be

759
00:33:27,540 --> 00:33:28,574
[INAUDIBLE].

760
00:33:28,574 --> 00:33:29,240
BRIAN YU: Great.

761
00:33:29,240 --> 00:33:31,840
So you could try and pretend to
be someone else, effectively.

762
00:33:31,840 --> 00:33:35,252
If you were able to take the cookie and
change what the value of the user ID

763
00:33:35,252 --> 00:33:36,960
is, for instance, and
try and send it back,

764
00:33:36,960 --> 00:33:38,824
then that potentially
is an attack vector

765
00:33:38,824 --> 00:33:41,740
by which you could trick the server
into thinking that you are someone

766
00:33:41,740 --> 00:33:42,650
that you're not.

767
00:33:42,650 --> 00:33:45,940
And so one way that Flask tries to get
around this is by signing the cookies.

768
00:33:45,940 --> 00:33:48,130
And so if you want to
use Flask signed cookies,

769
00:33:48,130 --> 00:33:51,932
you'll have to include a private key
inside of the web application, which

770
00:33:51,932 --> 00:33:53,890
is just going to be a
long string of characters

771
00:33:53,890 --> 00:33:56,680
that only the web application
should know and shouldn't

772
00:33:56,680 --> 00:33:58,130
be accessible to users.

773
00:33:58,130 --> 00:34:01,451
And, effectively, every time
Flask sends you a cookie,

774
00:34:01,451 --> 00:34:04,450
it's going to sign that cookie, add
a signature, where that signature is

775
00:34:04,450 --> 00:34:09,550
going to be generated based on
a combination of the contents

776
00:34:09,550 --> 00:34:13,120
of the session itself and of
what the private key is in order

777
00:34:13,120 --> 00:34:16,449
to generate a signature that
shouldn't be or should be reasonably--

778
00:34:16,449 --> 00:34:18,699
should be difficult for
anyone to be able to predict

779
00:34:18,699 --> 00:34:22,130
or figure out such that you can know
with confidence when you get back

780
00:34:22,130 --> 00:34:24,229
that session, Flask can
treat it as a checksum,

781
00:34:24,229 --> 00:34:26,770
effectively, in order to determine,
in fact, that this cookie

782
00:34:26,770 --> 00:34:27,850
did come from this user.

783
00:34:27,850 --> 00:34:30,070
It is, in fact, a valid,
genuine cookie, and they

784
00:34:30,070 --> 00:34:31,750
can trust the information inside of it.

785
00:34:31,750 --> 00:34:34,250
But, certainly, with the issues
we talked about with regards

786
00:34:34,250 --> 00:34:37,287
to cookies and the potential for
them to be intercepted and used,

787
00:34:37,287 --> 00:34:39,370
we might not want to use
that as our method, which

788
00:34:39,370 --> 00:34:42,453
is why in the Flask applications we've
been building, if you've noticed up

789
00:34:42,453 --> 00:34:46,429
at the top, we've set some application
settings inside of the Flask app

790
00:34:46,429 --> 00:34:49,780
variable that actually say that when
we're using these sessions, rather

791
00:34:49,780 --> 00:34:52,969
than use cookies as their
means for storing sessions,

792
00:34:52,969 --> 00:34:55,090
we've been using sessions
that are actually

793
00:34:55,090 --> 00:34:57,370
stored on the file system
of the server itself

794
00:34:57,370 --> 00:34:59,350
as your way of tracking the sessions.

795
00:34:59,350 --> 00:35:01,600
And, in fact, if you
were to ever use sessions

796
00:35:01,600 --> 00:35:04,180
on Flask, using those
files system sessions,

797
00:35:04,180 --> 00:35:06,415
and shut off the Flask
server or even just--

798
00:35:06,415 --> 00:35:08,290
even if you didn't shut
off the Flask server,

799
00:35:08,290 --> 00:35:12,430
you could look at the contents
of the sessions directory

800
00:35:12,430 --> 00:35:15,460
in order to take a look at what
the sessions actually look like,

801
00:35:15,460 --> 00:35:18,670
which can be interesting to explore
if you want to get a sense for what's

802
00:35:18,670 --> 00:35:21,310
going on inside of the sessions.

803
00:35:21,310 --> 00:35:25,750
And so we spent some time today talking
about trying to scale up these servers.

804
00:35:25,750 --> 00:35:28,060
But one thing we've come
back to a number of times

805
00:35:28,060 --> 00:35:30,950
is databases and what happens
when we're trying to store data,

806
00:35:30,950 --> 00:35:32,290
whether it's session
data that we might be

807
00:35:32,290 --> 00:35:33,850
trying to store inside of a database.

808
00:35:33,850 --> 00:35:36,190
Or maybe it's just that our
application uses a database,

809
00:35:36,190 --> 00:35:38,230
whether it's in project
one or project three,

810
00:35:38,230 --> 00:35:41,560
where we've wanted to store books
or food orders inside of a database.

811
00:35:41,560 --> 00:35:44,050
What happens when multiple
different servers are

812
00:35:44,050 --> 00:35:46,790
trying to access that same database?

813
00:35:46,790 --> 00:35:50,720
And now we start to get into this issue
of trying to scale up our databases.

814
00:35:50,720 --> 00:35:53,180
So we might imagine that--
we'll take the same picture.

815
00:35:53,180 --> 00:35:54,280
We've got a load balancer.

816
00:35:54,280 --> 00:35:55,420
We've got two servers.

817
00:35:55,420 --> 00:35:58,420
And now we also want those servers
interacting and communicating

818
00:35:58,420 --> 00:36:00,490
with a database somewhere,
where those servers

819
00:36:00,490 --> 00:36:02,680
are communicating with the database.

820
00:36:02,680 --> 00:36:05,660
What can go wrong now in
this picture with this model?

821
00:36:05,660 --> 00:36:08,648


822
00:36:08,648 --> 00:36:11,015
AUDIENCE: Well, too much
load on your database maybe.

823
00:36:11,015 --> 00:36:11,640
BRIAN YU: Yeah.

824
00:36:11,640 --> 00:36:12,889
Too much load on the database.

825
00:36:12,889 --> 00:36:16,800
You might imagine that if the reason
we went from one server to two servers

826
00:36:16,800 --> 00:36:19,590
was because a single server
wasn't enough in order

827
00:36:19,590 --> 00:36:22,650
to handle all the load, all the traffic
coming into that one server, then

828
00:36:22,650 --> 00:36:25,170
if we have all the load from
both servers that are all

829
00:36:25,170 --> 00:36:27,378
trying to talk to the same
database at the same time,

830
00:36:27,378 --> 00:36:29,490
we might be potentially
overloading that database.

831
00:36:29,490 --> 00:36:30,969
And that might become unmanageable.

832
00:36:30,969 --> 00:36:32,010
What else could go wrong?

833
00:36:32,010 --> 00:36:35,475


834
00:36:35,475 --> 00:36:37,460
AUDIENCE: Database server could go down.

835
00:36:37,460 --> 00:36:38,660
BRIAN YU: The database
server could go down.

836
00:36:38,660 --> 00:36:38,900
Great.

837
00:36:38,900 --> 00:36:40,260
That's another thing that could happen.

838
00:36:40,260 --> 00:36:43,370
So we've talked about this idea that
when we were scaling our servers

839
00:36:43,370 --> 00:36:46,886
and had multiple different servers,
if one server goes down, no big deal.

840
00:36:46,886 --> 00:36:49,760
So long as the load balancer knows
that server two is the server that

841
00:36:49,760 --> 00:36:51,590
went down, it can
redirect all the traffic

842
00:36:51,590 --> 00:36:53,540
to servers one, three, four, and five.
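The failover behavior described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular load balancer is implemented: the `LoadBalancer` class, its `mark_down` method, and the server names are all hypothetical, and a simple round-robin rotation stands in for whatever balancing strategy is actually in use.

```python
from itertools import cycle


class LoadBalancer:
    """Hypothetical round-robin balancer that skips servers marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._rotation = cycle(self.servers)

    def mark_down(self, server):
        # Called once the balancer learns that a server has failed.
        self.healthy.discard(server)

    def next_server(self):
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        # Walk the rotation until we reach a server that is still up.
        for server in self._rotation:
            if server in self.healthy:
                return server


balancer = LoadBalancer(["server1", "server2", "server3", "server4", "server5"])
balancer.mark_down("server2")
picks = [balancer.next_server() for _ in range(4)]
print(picks)  # traffic only reaches servers one, three, four, and five
```

Note that the balancer object itself is still a single point of failure in this sketch: if it stops running, no request reaches any server.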

843
00:36:53,540 --> 00:36:57,050
But here we see that this is what we
might call a single point of failure,

844
00:36:57,050 --> 00:37:00,770
a place inside of our diagram
of all the hardware that's

845
00:37:00,770 --> 00:37:04,141
going on where if this one thing fails,
then the entire web application breaks.

846
00:37:04,141 --> 00:37:04,640
Right?

847
00:37:04,640 --> 00:37:07,250
If the database fails, nothing
else in the application

848
00:37:07,250 --> 00:37:10,250
is going to be able to work, assuming
the web application is relying

849
00:37:10,250 --> 00:37:12,170
on the data and the database to work.

850
00:37:12,170 --> 00:37:15,480
Whereas this server, for instance,
wouldn't be a single point of failure

851
00:37:15,480 --> 00:37:18,440
because if this server goes
down, then the load balancer

852
00:37:18,440 --> 00:37:20,150
can just direct all
users and all traffic

853
00:37:20,150 --> 00:37:23,552
to this server over here, which
can still access the database.

854
00:37:23,552 --> 00:37:25,760
And for that matter, the
load balancer itself is also

855
00:37:25,760 --> 00:37:27,093
another single point of failure.

856
00:37:27,093 --> 00:37:29,511
If the load balancer goes
down, then suddenly we

857
00:37:29,511 --> 00:37:32,010
have no way of directing users
to various different servers.

858
00:37:32,010 --> 00:37:33,410
And so we might think
that there might be ways

859
00:37:33,410 --> 00:37:36,330
that we want to have multiple
load balancers, for instance,

860
00:37:36,330 --> 00:37:38,690
in order to try to address
that problem of avoiding

861
00:37:38,690 --> 00:37:40,320
having single points of failure.

862
00:37:40,320 --> 00:37:42,890
But what we're going to
focus on now is on ways

863
00:37:42,890 --> 00:37:45,430
to make this database
scaling more manageable.

864
00:37:45,430 --> 00:37:47,930
That as more and more data
starts to come into our database,

865
00:37:47,930 --> 00:37:50,930
we might start to see slower
queries because if we have millions

866
00:37:50,930 --> 00:37:52,720
upon millions of rows
in our database, it

867
00:37:52,720 --> 00:37:55,595
might take longer and longer in
order to query that database in order

868
00:37:55,595 --> 00:37:56,970
to get the data that we want.

869
00:37:56,970 --> 00:38:01,350
So how do we, as we scale up our
applications, begin to deal with that?

870
00:38:01,350 --> 00:38:03,350
And so the first topic
we're going to talk about

871
00:38:03,350 --> 00:38:08,000
is database partitioning, the idea
that if we have database tables that

872
00:38:08,000 --> 00:38:11,540
are large, either large in a number of
rows or large in a number of columns,

873
00:38:11,540 --> 00:38:14,810
then trying to query information
from those big tables

874
00:38:14,810 --> 00:38:16,940
can start to get complicated.

875
00:38:16,940 --> 00:38:18,770
And it can start to
become time-consuming.

876
00:38:18,770 --> 00:38:22,130
That if we have large tables, it's going
to take more and more time in order

877
00:38:22,130 --> 00:38:23,000
to query them.

878
00:38:23,000 --> 00:38:26,630
And so database partitioning
is going to represent the idea

879
00:38:26,630 --> 00:38:28,850
that if we have data
inside of our database,

880
00:38:28,850 --> 00:38:32,960
we can often split up that data
into multiple different parts--

881
00:38:32,960 --> 00:38:35,540
into multiple different
tables, for instance-- in order

882
00:38:35,540 --> 00:38:38,690
to better allow for ourselves to
deal with more manageable units,

883
00:38:38,690 --> 00:38:41,990
to have queries on those tables run
more efficiently and more quickly,

884
00:38:41,990 --> 00:38:46,170
and in that sense help us as we begin
to scale up our web application.

885
00:38:46,170 --> 00:38:49,890
And so one form of database partitioning
we've actually already seen.

886
00:38:49,890 --> 00:38:52,200
It's called vertical
database partitioning.

887
00:38:52,200 --> 00:38:54,200
And the idea of vertical
database partitioning--

888
00:38:54,200 --> 00:38:56,480
if you remember this from way
back in one of the earlier weeks

889
00:38:56,480 --> 00:38:57,320
of the lecture--

890
00:38:57,320 --> 00:38:59,960
is the idea that in vertical
database partitioning

891
00:38:59,960 --> 00:39:03,980
we're going to separate our table
into multiple different tables

892
00:39:03,980 --> 00:39:06,530
by decreasing the number
of columns in those tables

893
00:39:06,530 --> 00:39:09,410
by separating things out
such that some columns are

894
00:39:09,410 --> 00:39:12,660
going to be put into a different
table than other columns.

895
00:39:12,660 --> 00:39:14,895
So if we recall, this
original table of flights,

896
00:39:14,895 --> 00:39:18,020
which we were keeping track of when we
were first trying to think about SQL

897
00:39:18,020 --> 00:39:20,330
and how we might organize
data inside of a database,

898
00:39:20,330 --> 00:39:25,327
we have each flight having an ID number,
an origin, an origin airport code,

899
00:39:25,327 --> 00:39:28,160
a destination, the destination code,
and the duration of the flight,

900
00:39:28,160 --> 00:39:29,540
for instance.

901
00:39:29,540 --> 00:39:31,610
And what we did when
we were first talking

902
00:39:31,610 --> 00:39:33,740
about SQL and the idea
of designing tables

903
00:39:33,740 --> 00:39:38,030
was to use foreign keys as our way
of what we'll now call vertically

904
00:39:38,030 --> 00:39:39,540
partitioning this database.

905
00:39:39,540 --> 00:39:42,920
And instead of storing all the data
like this inside of a big flights

906
00:39:42,920 --> 00:39:44,830
table that might be
expensive to query, we

907
00:39:44,830 --> 00:39:46,580
can split it up into
two different tables.

908
00:39:46,580 --> 00:39:49,790
Split it up into a locations
table where each location just

909
00:39:49,790 --> 00:39:52,490
has its independent code and the name.

910
00:39:52,490 --> 00:39:55,580
And we can split it up into a
flights table where each flight,

911
00:39:55,580 --> 00:39:59,420
rather than have all of those columns
as before, now only has four columns.

912
00:39:59,420 --> 00:40:00,770
It's got an ID column.

913
00:40:00,770 --> 00:40:04,760
It's got a number that represents the
origin, or inside of the locations

914
00:40:04,760 --> 00:40:08,750
table, origin one corresponds to
location number one in the locations

915
00:40:08,750 --> 00:40:09,294
table.

916
00:40:09,294 --> 00:40:11,210
And, likewise, we have
a destination_id, where

917
00:40:11,210 --> 00:40:16,140
destination_id four corresponds to
this particular location in London,

918
00:40:16,140 --> 00:40:17,360
and, finally, a duration.

919
00:40:17,360 --> 00:40:19,550
And so we factored out
some of the columns

920
00:40:19,550 --> 00:40:22,160
in order to create tables that
have fewer columns and are,

921
00:40:22,160 --> 00:40:26,424
therefore, more manageable and might
be easier to query in some sense.
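As a rough sketch of that vertical split, the two tables and the join that brings them back together might look like the following. It uses Python's built-in sqlite3 module with an in-memory database purely for illustration; the exact column types, the airport codes, and the 415-minute duration are assumed sample values rather than data from the lecture.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Each location is stored once, with its code and name.
    CREATE TABLE locations (
        id INTEGER PRIMARY KEY,
        code TEXT NOT NULL,
        name TEXT NOT NULL
    );
    -- Flights keep only four columns and point at locations by ID.
    CREATE TABLE flights (
        id INTEGER PRIMARY KEY,
        origin_id INTEGER NOT NULL REFERENCES locations(id),
        destination_id INTEGER NOT NULL REFERENCES locations(id),
        duration INTEGER NOT NULL
    );
    INSERT INTO locations (id, code, name) VALUES
        (1, 'JFK', 'New York'),
        (4, 'LHR', 'London');
    INSERT INTO flights (origin_id, destination_id, duration) VALUES (1, 4, 415);
""")

# Joining on the foreign keys recovers the wide, human-readable row.
row = db.execute("""
    SELECT o.name, d.name, f.duration
    FROM flights AS f
    JOIN locations AS o ON o.id = f.origin_id
    JOIN locations AS d ON d.id = f.destination_id
""").fetchone()
print(row)
```

The narrower flights table is cheaper to scan, at the cost of a join whenever the location names are needed.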

922
00:40:26,424 --> 00:40:28,340
And so this is vertical
partitioning, and it's

923
00:40:28,340 --> 00:40:29,944
something we've already seen before.

924
00:40:29,944 --> 00:40:33,110
But there's another form of partitioning
as well, which, you might guess,

925
00:40:33,110 --> 00:40:35,030
is called horizontal partitioning.

926
00:40:35,030 --> 00:40:37,340
And that might look something like this.

927
00:40:37,340 --> 00:40:41,354
If our table of flights is just getting
too long, getting too big to query,

928
00:40:41,354 --> 00:40:44,270
where we're consistently having to
run queries on the set of flights--

929
00:40:44,270 --> 00:40:48,780
looking for all flights that are
going from New York to San Francisco,

930
00:40:48,780 --> 00:40:49,830
for example--

931
00:40:49,830 --> 00:40:53,030
and those queries are
starting to take a long time,

932
00:40:53,030 --> 00:40:56,130
we might horizontally
partition our table.

933
00:40:56,130 --> 00:40:59,240
In horizontal partitioning, rather
than change the number of rows

934
00:40:59,240 --> 00:41:01,200
to have fewer rows in
each of our tables,

935
00:41:01,200 --> 00:41:03,820
we're just going to split
up the rows of our tables.

936
00:41:03,820 --> 00:41:06,357
Or rather than change the
number of columns in the tables,

937
00:41:06,357 --> 00:41:08,690
we're going to split up the
rows of our table such that,

938
00:41:08,690 --> 00:41:13,362
rather than have a table that
has 2,000 rows, for instance,

939
00:41:13,362 --> 00:41:16,070
we might have two different tables
where we put 1,000 rows in one

940
00:41:16,070 --> 00:41:18,450
table and 1,000 rows in another table.

941
00:41:18,450 --> 00:41:21,230
And so we might take this
idea of the flights table

942
00:41:21,230 --> 00:41:24,170
and really split it up into two
different tables, a domestic flights

943
00:41:24,170 --> 00:41:27,560
table and an international flights
table, where each one of these tables

944
00:41:27,560 --> 00:41:28,970
contains the same columns.

945
00:41:28,970 --> 00:41:32,840
It's still going to have an ID, an
origin, a destination, and a duration.

946
00:41:32,840 --> 00:41:35,890
It's just that we're going to
split up the flights into those two

947
00:41:35,890 --> 00:41:37,300
independent tables.

948
00:41:37,300 --> 00:41:39,580
What benefit do we get by doing this?

949
00:41:39,580 --> 00:41:42,010
What advantage do we get
by taking the flights table

950
00:41:42,010 --> 00:41:44,740
and partitioning it into two
different tables, a domestic

951
00:41:44,740 --> 00:41:48,640
and an international table, that we
didn't have with just the flights

952
00:41:48,640 --> 00:41:50,835
table?

953
00:41:50,835 --> 00:41:51,335
Yeah.

954
00:41:51,335 --> 00:41:54,730
AUDIENCE: You're going through less
rows, so if you split it, the table,

955
00:41:54,730 --> 00:41:59,684
in half, you're spending half the
time [INAUDIBLE] to the database.

956
00:41:59,684 --> 00:42:00,350
BRIAN YU: Great.

957
00:42:00,350 --> 00:42:02,570
So the big benefit is that
our queries are faster,

958
00:42:02,570 --> 00:42:05,210
that if I'm trying to
query a domestic flight,

959
00:42:05,210 --> 00:42:08,060
I now only need to search through
this domestic flights table.

960
00:42:08,060 --> 00:42:09,290
And I don't need to
worry about searching

961
00:42:09,290 --> 00:42:11,790
through however many international
flights there might be,

962
00:42:11,790 --> 00:42:13,460
and so my queries can become faster.
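One way to picture the domestic and international split is the sketch below, again using sqlite3 with an in-memory database. The table names, the `DOMESTIC_CITIES` set, and the helper functions `table_for`, `add_flight`, and `find_flights` are all assumptions made for illustration. The point is that a query for a domestic route only ever touches the domestic table.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Both partitions share exactly the same columns.
for table in ("domestic_flights", "international_flights"):
    db.execute(f"""
        CREATE TABLE {table} (
            id INTEGER PRIMARY KEY,
            origin TEXT NOT NULL,
            destination TEXT NOT NULL,
            duration INTEGER NOT NULL
        )
    """)

# Hypothetical rule for deciding which partition a flight belongs to.
DOMESTIC_CITIES = {"New York", "San Francisco"}


def table_for(origin, destination):
    """Pick the partition; this needs to be cheap, since every query runs it."""
    if origin in DOMESTIC_CITIES and destination in DOMESTIC_CITIES:
        return "domestic_flights"
    return "international_flights"


def add_flight(origin, destination, duration):
    # The table name comes from a fixed set above, never from user input.
    table = table_for(origin, destination)
    db.execute(
        f"INSERT INTO {table} (origin, destination, duration) VALUES (?, ?, ?)",
        (origin, destination, duration),
    )


def find_flights(origin, destination):
    table = table_for(origin, destination)
    return db.execute(
        f"SELECT origin, destination, duration FROM {table} "
        "WHERE origin = ? AND destination = ?",
        (origin, destination),
    ).fetchall()


add_flight("New York", "San Francisco", 380)
add_flight("New York", "London", 415)
print(find_flights("New York", "San Francisco"))  # searches the domestic table only
```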

963
00:42:13,460 --> 00:42:16,372
And horizontal partitioning
oftentimes isn't just with two tables.

964
00:42:16,372 --> 00:42:18,830
You might split things up into
many, many different tables.

965
00:42:18,830 --> 00:42:23,690
You might imagine that if
you have a database that's

966
00:42:23,690 --> 00:42:26,854
keeping track of different people's
addresses and locations inside

967
00:42:26,854 --> 00:42:28,770
of the country, that you
might split things up

968
00:42:28,770 --> 00:42:31,437
into having 50 different tables--
one for each of the US states,

969
00:42:31,437 --> 00:42:34,269
where if you're trying to find
someone where you know that they live

970
00:42:34,269 --> 00:42:36,500
in Oregon, for example, you
can just query that table

971
00:42:36,500 --> 00:42:38,791
and ignore the tables that
have to do with anyone else,

972
00:42:38,791 --> 00:42:40,850
thereby speeding up that query.

973
00:42:40,850 --> 00:42:44,660
What drawbacks, though, come with this
approach of horizontally partitioning

974
00:42:44,660 --> 00:42:47,840
our data into multiple different
tables rather than keeping it

975
00:42:47,840 --> 00:42:51,330
all inside of the same table?

976
00:42:51,330 --> 00:42:51,830
Yeah.

977
00:42:51,830 --> 00:42:53,450
AUDIENCE: It seems like
your code would have

978
00:42:53,450 --> 00:42:55,449
to get more complicated
because you have to know

979
00:42:55,449 --> 00:42:56,835
which table to look in for what.

980
00:42:56,835 --> 00:42:57,460
BRIAN YU: Sure.

981
00:42:57,460 --> 00:42:59,168
So there's some code
complexity that gets

982
00:42:59,168 --> 00:43:01,510
added here, that we need to
now know before we query.

983
00:43:01,510 --> 00:43:03,400
We can't just say query
the flights table.

984
00:43:03,400 --> 00:43:05,050
We need to have some
mechanism for knowing, yeah,

985
00:43:05,050 --> 00:43:06,940
should we query the
domestic flights table,

986
00:43:06,940 --> 00:43:08,890
or should we query the
international flights table.

987
00:43:08,890 --> 00:43:11,681
And maybe that in itself is going
to be an expensive process, which

988
00:43:11,681 --> 00:43:13,660
is why oftentimes it's
good, if you're going

989
00:43:13,660 --> 00:43:16,240
to do any sort of horizontal
database partitioning,

990
00:43:16,240 --> 00:43:19,752
to give some thought as to how you're partitioning that data, making sure
partitioning that data, making sure

991
00:43:19,752 --> 00:43:22,960
that it's a way that you're going to be
able to quickly and easily figure out

992
00:43:22,960 --> 00:43:25,180
this is the table that I
need to query as opposed

993
00:43:25,180 --> 00:43:28,138
to having to spend a long time trying
to figure out which of the tables

994
00:43:28,138 --> 00:43:30,340
to query before you actually do.

995
00:43:30,340 --> 00:43:32,630
Other potential drawbacks
of this approach?

996
00:43:32,630 --> 00:43:35,994
AUDIENCE: If you make a schema change
to one, you have to do it to the others.

997
00:43:35,994 --> 00:43:36,660
BRIAN YU: Great.

998
00:43:36,660 --> 00:43:39,035
So schema changes now become
a little more of a headache,

999
00:43:39,035 --> 00:43:41,582
that if I'm changing the
schema for this flights table,

1000
00:43:41,582 --> 00:43:43,290
now I suddenly need
to worry about-- now,

1001
00:43:43,290 --> 00:43:46,331
instead of just changing one table,
I need to update both tables in order

1002
00:43:46,331 --> 00:43:48,772
to reflect those changes.

1003
00:43:48,772 --> 00:43:51,480
Other things that could go wrong
with this approach or trade-offs

1004
00:43:51,480 --> 00:43:53,850
that we have to
sacrifice in order to get

1005
00:43:53,850 --> 00:43:57,260
the benefit of this
additional query speed?

1006
00:43:57,260 --> 00:43:58,490
We'll put it this way-- yeah.

1007
00:43:58,490 --> 00:43:59,365
Go ahead.

1008
00:43:59,365 --> 00:44:00,490
AUDIENCE: Maybe validation.

1009
00:44:00,490 --> 00:44:03,564
You might have duplicates
and multiples tables.

1010
00:44:03,564 --> 00:44:04,230
BRIAN YU: Great.

1011
00:44:04,230 --> 00:44:06,870
So as soon as we start to deal
with multiple tables, then

1012
00:44:06,870 --> 00:44:09,150
there's potential for
invalid data that you

1013
00:44:09,150 --> 00:44:12,240
might need to worry about making
sure that the tables are matching up.

1014
00:44:12,240 --> 00:44:14,730
You don't want there to be a domestic
flight in the international flights

1015
00:44:14,730 --> 00:44:15,630
table, for instance.

1016
00:44:15,630 --> 00:44:18,255
And these are the things that
you have to start to worry about.

1017
00:44:18,255 --> 00:44:21,840
When might a query actually be
slower in this approach as opposed

1018
00:44:21,840 --> 00:44:25,470
to this approach with
just the flights table?

1019
00:44:25,470 --> 00:44:28,356
AUDIENCE: You need to bring all that
data back together again somehow

1020
00:44:28,356 --> 00:44:29,835
if you want to process [INAUDIBLE].

1021
00:44:29,835 --> 00:44:30,460
BRIAN YU: Yeah.

1022
00:44:30,460 --> 00:44:30,960
Sure.

1023
00:44:30,960 --> 00:44:34,076
Any time you would need to bring
data from all these tables together,

1024
00:44:34,076 --> 00:44:35,950
now your query is actually
going to be slower

1025
00:44:35,950 --> 00:44:38,380
because we have to first query this
table and then in a separate query,

1026
00:44:38,380 --> 00:44:39,680
query the other table.

1027
00:44:39,680 --> 00:44:43,510
So you might imagine if I wanted a
listing of all of the flights leaving

1028
00:44:43,510 --> 00:44:45,970
New York City airport,
a New York City airport,

1029
00:44:45,970 --> 00:44:48,430
then suddenly I need
to worry about not just

1030
00:44:48,430 --> 00:44:50,800
the domestic flights that are leaving
but also the international flights that

1031
00:44:50,800 --> 00:44:51,300
are leaving.

1032
00:44:51,300 --> 00:44:53,800
I might need to query both
of those tables independently

1033
00:44:53,800 --> 00:44:57,400
in order to get the information
that I can then display.

1034
00:44:57,400 --> 00:45:00,670
And that might take longer by querying
two tables than I could with just one.
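To make that cost concrete, here is a small sketch of a query that has to span both partitions. It assumes the same hypothetical `domestic_flights` and `international_flights` tables, with made-up sample rows, and uses a single `UNION ALL` statement as one possible way to combine them. Even as one statement, the database still has to search both tables.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE domestic_flights (
        id INTEGER PRIMARY KEY, origin TEXT, destination TEXT, duration INTEGER
    );
    CREATE TABLE international_flights (
        id INTEGER PRIMARY KEY, origin TEXT, destination TEXT, duration INTEGER
    );
    INSERT INTO domestic_flights (origin, destination, duration)
        VALUES ('New York', 'San Francisco', 380);
    INSERT INTO international_flights (origin, destination, duration)
        VALUES ('New York', 'London', 415), ('Paris', 'London', 75);
""")

# Every departure from one city needs rows from both partitions.
departures = db.execute("""
    SELECT destination, duration FROM domestic_flights WHERE origin = :origin
    UNION ALL
    SELECT destination, duration FROM international_flights WHERE origin = :origin
    ORDER BY duration
""", {"origin": "New York"}).fetchall()
print(departures)
```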

1035
00:45:00,670 --> 00:45:03,910
So oftentimes, when
horizontally partitioning data,

1036
00:45:03,910 --> 00:45:06,832
it's a good idea to think about
how you're partitioning things

1037
00:45:06,832 --> 00:45:09,040
in such a way that you don't
want to partition things

1038
00:45:09,040 --> 00:45:12,100
in ways where you'll
often need information

1039
00:45:12,100 --> 00:45:13,604
that's in different partitions.

1040
00:45:13,604 --> 00:45:15,520
You'll often want to
partition things in a way

1041
00:45:15,520 --> 00:45:17,560
such that, with relative
frequency, you'll

1042
00:45:17,560 --> 00:45:19,630
only be querying for
things from a single one

1043
00:45:19,630 --> 00:45:21,394
of those individual
horizontal partitions.

1044
00:45:21,394 --> 00:45:23,560
So there's some design
thinking and design decisions

1045
00:45:23,560 --> 00:45:28,660
that have to go into play as you think
about which one of the partitions

1046
00:45:28,660 --> 00:45:32,360
to look for and how you're going
to actually partition that data.

1047
00:45:32,360 --> 00:45:35,230
Another term you might hear here
with regards to scaling databases

1048
00:45:35,230 --> 00:45:38,440
is database sharding, the idea that now,
idea of that right now,

1049
00:45:38,440 --> 00:45:40,900
rather than take a single
table and split it up

1050
00:45:40,900 --> 00:45:42,921
into two tables in the
same database server,

1051
00:45:42,921 --> 00:45:44,920
we might actually have
multiple database servers

1052
00:45:44,920 --> 00:45:47,460
where I store domestic
flights on one database server

1053
00:45:47,460 --> 00:45:50,080
and international flights
on another database server.

1054
00:45:50,080 --> 00:45:51,430
What might be a benefit of that?

1055
00:45:51,430 --> 00:45:54,310


1056
00:45:54,310 --> 00:45:58,012
Where I have two independent servers,
one of which contains some of the data,

1057
00:45:58,012 --> 00:46:01,220
the domestic flights, and one of which
contains the international flights, as

1058
00:46:01,220 --> 00:46:03,761
opposed to having them in just
two tables on the same server.

1059
00:46:03,761 --> 00:46:06,488


1060
00:46:06,488 --> 00:46:08,860
AUDIENCE: It's not a single
point of failure anymore.

1061
00:46:08,860 --> 00:46:10,720
BRIAN YU: Not a single
point of failure, that if I

1062
00:46:10,720 --> 00:46:13,261
happen to-- if the international
database happens to go down,

1063
00:46:13,261 --> 00:46:16,060
I still have access to
the domestic flights.

1064
00:46:16,060 --> 00:46:19,450
What about that example
that I gave before of I

1065
00:46:19,450 --> 00:46:24,467
want to get all of the flights that
are leaving San Francisco airport?

1066
00:46:24,467 --> 00:46:25,092
AUDIENCE: Yeah.

1067
00:46:25,092 --> 00:46:27,044
So maybe you could
process the data faster

1068
00:46:27,044 --> 00:46:30,334
because each server's going to
process its own table [INAUDIBLE].

1069
00:46:30,334 --> 00:46:31,000
BRIAN YU: Great.

1070
00:46:31,000 --> 00:46:33,030
Now I can have some
concurrency, that I can--

1071
00:46:33,030 --> 00:46:34,919
query for both the
database servers and say,

1072
00:46:34,919 --> 00:46:37,210
give me all the flights that
are leaving San Francisco.

1073
00:46:37,210 --> 00:46:39,820
And I can have the domestic server
running and the international server

1074
00:46:39,820 --> 00:46:42,580
running simultaneously and then
giving me back those results.
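A minimal sketch of that concurrent query follows, with two in-memory SQLite databases standing in for the two database servers. Real shards would be separate machines reached over the network; the `make_shard` and `departures` helpers and the sample rows are assumptions for illustration only.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_shard(rows):
    """Create an in-memory database standing in for one database server."""
    # check_same_thread=False lets a worker thread use this connection;
    # each connection here is only ever used by one thread at a time.
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute(
        "CREATE TABLE flights (origin TEXT, destination TEXT, duration INTEGER)"
    )
    db.executemany("INSERT INTO flights VALUES (?, ?, ?)", rows)
    return db


shards = {
    "domestic": make_shard([("San Francisco", "New York", 380)]),
    "international": make_shard(
        [("San Francisco", "London", 640), ("Paris", "London", 75)]
    ),
}


def departures(db, origin):
    """Ask a single shard for the flights leaving the given origin."""
    return db.execute(
        "SELECT destination, duration FROM flights WHERE origin = ?", (origin,)
    ).fetchall()


# Send the same question to every shard at once, then merge the answers.
with ThreadPoolExecutor(max_workers=len(shards)) as pool:
    futures = [pool.submit(departures, db, "San Francisco") for db in shards.values()]
    results = [row for future in futures for row in future.result()]

print(sorted(results))
```

The application code, not the database, is responsible for merging the results here, which is part of the extra complexity that sharding introduces.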

1075
00:46:42,580 --> 00:46:45,400
And so maybe that will help
to offset what might initially

1076
00:46:45,400 --> 00:46:48,430
have been a longer query in order
to query these two separate tables.

1077
00:46:48,430 --> 00:46:52,150
But, of course, it's still going to mean
now that I have to deal with the fact

1078
00:46:52,150 --> 00:46:54,500
that my data is located
in different places.

1079
00:46:54,500 --> 00:46:57,400
And if I ever want to do a
SQL join, for instance, if I'm

1080
00:46:57,400 --> 00:47:00,100
trying to join multiple
tables together, now the fact

1081
00:47:00,100 --> 00:47:02,200
that the tables are located
on different servers,

1082
00:47:02,200 --> 00:47:04,190
that's going to come
at a time cost as well.

1083
00:47:04,190 --> 00:47:07,880
And so as we think about database
design and on which servers

1084
00:47:07,880 --> 00:47:11,052
your table should go, all of these
are things that should come into play.

1085
00:47:11,052 --> 00:47:12,760
And they're considerations
that are going

1086
00:47:12,760 --> 00:47:15,343
to change depending on the
specific needs of your application,

1087
00:47:15,343 --> 00:47:17,680
depending on how frequently
you're going to be accessing

1088
00:47:17,680 --> 00:47:19,570
one type of data as opposed to another.

1089
00:47:19,570 --> 00:47:21,486
There are trade-offs to
think about and things

1090
00:47:21,486 --> 00:47:24,760
that you'll have to weigh as you
go about making those decisions.

1091
00:47:24,760 --> 00:47:28,150
Questions about anything with
regards to database partitioning,

1092
00:47:28,150 --> 00:47:31,550
splitting data up?

1093
00:47:31,550 --> 00:47:32,050
All right.

1094
00:47:32,050 --> 00:47:33,800
So database partitioning,
splitting data,

1095
00:47:33,800 --> 00:47:36,940
may help to make data more manageable,
and it may help to speed up queries.

1096
00:47:36,940 --> 00:47:40,750
But it doesn't fully solve that single
point of failure problem, the problem

1097
00:47:40,750 --> 00:47:43,180
of we have two servers
that are both trying

1098
00:47:43,180 --> 00:47:45,400
to talk to the same database
that has all the data.

1099
00:47:45,400 --> 00:47:47,680
Maybe I've partitioned the data
to make our queries faster,

1100
00:47:47,680 --> 00:47:50,388
and so maybe our database can
start to handle more and more users

1101
00:47:50,388 --> 00:47:51,307
than it could before.

1102
00:47:51,307 --> 00:47:53,140
But we're still dealing
with the possibility

1103
00:47:53,140 --> 00:47:56,200
now that we have a single point of
failure where that database can fail,

1104
00:47:56,200 --> 00:47:57,850
and suddenly nothing's going to work.

1105
00:47:57,850 --> 00:48:00,130
And we're still
dealing with the possibility

1106
00:48:00,130 --> 00:48:03,340
that we might overload the database if
we have 10 different servers that are

1107
00:48:03,340 --> 00:48:05,420
all trying to access the same database.

1108
00:48:05,420 --> 00:48:08,354
So what might we do now?

1109
00:48:08,354 --> 00:48:11,345
AUDIENCE: Wouldn't there be a
database backup system somewhere?

1110
00:48:11,345 --> 00:48:11,970
BRIAN YU: Sure.

1111
00:48:11,970 --> 00:48:14,190
A database backup system
would be a great idea.

1112
00:48:14,190 --> 00:48:16,856
And we'll often call this database
replication, the idea that we

1113
00:48:16,856 --> 00:48:18,450
don't just want one copy of our data.

1114
00:48:18,450 --> 00:48:20,700
Maybe we want multiple copies
of our data, two or even

1115
00:48:20,700 --> 00:48:22,950
three different copies
of the same database

1116
00:48:22,950 --> 00:48:26,070
that we can, therefore, help
to distribute load across.

1117
00:48:26,070 --> 00:48:29,340
In the same way that we could distribute
load between different servers,

1118
00:48:29,340 --> 00:48:31,942
we can distribute load
between the databases as well.

1119
00:48:31,942 --> 00:48:34,900
What problems start to come up now
that we're duplicating our database?

1120
00:48:34,900 --> 00:48:37,900
Now we have three different
copies of the database.

1121
00:48:37,900 --> 00:48:39,994
How do we deal with it?

1122
00:48:39,994 --> 00:48:44,470
AUDIENCE: You're going to
get some servers' data--

1123
00:48:44,470 --> 00:48:45,970
or database, yeah, they'll match up.

1124
00:48:45,970 --> 00:48:49,954
But if you have three
databases, one database

1125
00:48:49,954 --> 00:48:51,944
might have more recent
data than the other one.

1126
00:48:51,944 --> 00:48:52,610
BRIAN YU: Great.

1127
00:48:52,610 --> 00:48:54,318
So now we're dealing
with the possibility

1128
00:48:54,318 --> 00:48:56,760
that server data might not
match up with each other.

1129
00:48:56,760 --> 00:49:00,140
If I have three different databases,
what happens if I update one database?

1130
00:49:00,140 --> 00:49:02,620
What happens to the other
two databases, for instance?

1131
00:49:02,620 --> 00:49:04,000
What happens to that data?

1132
00:49:04,000 --> 00:49:07,826
So how might we resolve
those sorts of problems?

1133
00:49:07,826 --> 00:49:10,700
That problem of we need to make sure
that our three databases are all

1134
00:49:10,700 --> 00:49:12,810
in sync with each other.

1135
00:49:12,810 --> 00:49:14,897
And we don't want to have one
database have some data

1136
00:49:14,897 --> 00:49:16,730
and have another database
not have that data

1137
00:49:16,730 --> 00:49:19,730
because then it will change the user's
experience depending on which of the

1138
00:49:19,730 --> 00:49:22,339
databases they happen to try to access.

1139
00:49:22,339 --> 00:49:24,630
AUDIENCE: I mean, it seems
like the database servers are

1140
00:49:24,630 --> 00:49:27,458
going to have to speak to each
other, lock records, update records,

1141
00:49:27,458 --> 00:49:31,949
so do the same thing that
was done to them [INAUDIBLE]

1142
00:49:31,949 --> 00:49:33,450
the other database servers.

1143
00:49:33,450 --> 00:49:33,700
BRIAN YU: Great.

1144
00:49:33,700 --> 00:49:36,790
So there's going to need to be some sort
of communication between the servers.

1145
00:49:36,790 --> 00:49:38,800
And so we'll look at a
couple of different models

1146
00:49:38,800 --> 00:49:41,091
for database replication that
are quite common in order

1147
00:49:41,091 --> 00:49:42,670
to try and deal with these problems.

1148
00:49:42,670 --> 00:49:44,840
And so we'll look at a
single-primary replication

1149
00:49:44,840 --> 00:49:46,580
and multi-primary replication.

1150
00:49:46,580 --> 00:49:49,360
And so in the single-primary
replication model

1151
00:49:49,360 --> 00:49:52,530
for database replication,
what we have is

1152
00:49:52,530 --> 00:49:54,280
we have a single, as
the name might imply,

1153
00:49:54,280 --> 00:49:56,634
a single database, which is
called our primary database,

1154
00:49:56,634 --> 00:49:58,300
which would be this database right here.

1155
00:49:58,300 --> 00:50:01,409
And on this primary database, you
could treat it like the single database

1156
00:50:01,409 --> 00:50:02,200
that we had before.

1157
00:50:02,200 --> 00:50:05,630
You can read data from it,
and you can write data to it.

1158
00:50:05,630 --> 00:50:08,200
And we also have these two
databases over here, which we're

1159
00:50:08,200 --> 00:50:10,030
going to call secondary databases.

1160
00:50:10,030 --> 00:50:13,000
And the idea of the secondary
databases is that you can only

1161
00:50:13,000 --> 00:50:15,100
read data from a secondary database.

1162
00:50:15,100 --> 00:50:17,200
You can never update to
the secondary database

1163
00:50:17,200 --> 00:50:19,030
or write to the secondary database.

1164
00:50:19,030 --> 00:50:22,240
You can only ever write,
meaning update or add a row

1165
00:50:22,240 --> 00:50:25,000
or delete a row, to
the primary database.

1166
00:50:25,000 --> 00:50:27,930
And you can select all the data
you want from the other databases,

1167
00:50:27,930 --> 00:50:31,760
but you can't update or
add or delete rows.

1168
00:50:31,760 --> 00:50:33,650
So what's missing from this picture now?

1169
00:50:33,650 --> 00:50:37,721
What needs to happen any time a
write to this database happens?

1170
00:50:37,721 --> 00:50:40,744
AUDIENCE: [INAUDIBLE]

1171
00:50:40,744 --> 00:50:41,410
BRIAN YU: Great.

1172
00:50:41,410 --> 00:50:44,530
We need this update mechanism, that
whenever we write to this database

1173
00:50:44,530 --> 00:50:48,280
here, our primary database, our primary
database needs to update this database

1174
00:50:48,280 --> 00:50:52,060
and update this database, tell the
secondary databases that new data has

1175
00:50:52,060 --> 00:50:55,030
been added or removed or
updated and changed in some way

1176
00:50:55,030 --> 00:50:58,370
in order to make sure that those
new databases reflect those changes.

1177
00:50:58,370 --> 00:51:03,010
And so under this model,
we're able to implement

1178
00:51:03,010 --> 00:51:06,580
this idea of replicating the databases
and making sure that the databases

1179
00:51:06,580 --> 00:51:09,970
are staying in sync because we're
only ever able to make changes

1180
00:51:09,970 --> 00:51:11,500
on this database over here.

1181
00:51:11,500 --> 00:51:14,860
And when we do, that database is going
to update the secondary databases.

1182
00:51:14,860 --> 00:51:19,647
It's going to make sure that those
databases are aware of those changes.
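
The single-primary model described above can be sketched in a few lines of Python. This is only an illustrative in-memory sketch, not how any particular database implements replication: the class names `Primary` and `Secondary` and the method `apply_change` are made up for the example, and the updates here are pushed immediately, whereas a real system may deliver them with some delay.

```python
class Secondary:
    """Read-only replica: clients may read; only the primary applies changes."""

    def __init__(self):
        self._rows = {}

    def read(self, key):
        return self._rows.get(key)

    def apply_change(self, key, value):
        # Called by the primary only; a value of None means "delete this row".
        if value is None:
            self._rows.pop(key, None)
        else:
            self._rows[key] = value


class Primary:
    """The one database that accepts writes and pushes them to the secondaries."""

    def __init__(self, secondaries):
        self._rows = {}
        self._secondaries = list(secondaries)

    def read(self, key):
        return self._rows.get(key)

    def write(self, key, value):
        self._rows[key] = value
        self._replicate(key, value)

    def delete(self, key):
        self._rows.pop(key, None)
        self._replicate(key, None)

    def _replicate(self, key, value):
        # Forward every change so that each secondary reflects it.
        for secondary in self._secondaries:
            secondary.apply_change(key, value)


secondaries = [Secondary(), Secondary()]
primary = Primary(secondaries)
primary.write("flight-1", {"origin": "New York", "duration": 415})
```

After the write, reading "flight-1" from either secondary returns the same row that the primary holds, while the secondaries themselves expose no write method to clients.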

1183
00:51:19,647 --> 00:51:21,230
What are some drawbacks of this model?

1184
00:51:21,230 --> 00:51:24,990


1185
00:51:24,990 --> 00:51:26,644
AUDIENCE: Timing.

1186
00:51:26,644 --> 00:51:27,810
BRIAN YU: Timing, certainly.

1187
00:51:27,810 --> 00:51:29,643
We might deal with
potential race conditions

1188
00:51:29,643 --> 00:51:33,020
here, that if we write
some data to this database

1189
00:51:33,020 --> 00:51:36,007
and we try and read it from
some other database before there

1190
00:51:36,007 --> 00:51:38,090
has been time in order to
make that update happen,

1191
00:51:38,090 --> 00:51:40,084
that can potentially cause problems.

1192
00:51:40,084 --> 00:51:43,048
AUDIENCE: It seems like in general
reading is going to be pretty good.

1193
00:51:43,048 --> 00:51:47,000
But writing is what's going to
really take a hit, especially

1194
00:51:47,000 --> 00:51:52,928
because that server that you're writing
to has to use its resources to write

1195
00:51:52,928 --> 00:51:54,104
in the other databases.

1196
00:51:54,104 --> 00:51:54,770
BRIAN YU: Great.

1197
00:51:54,770 --> 00:51:58,280
So this model seems
pretty good if we're doing

1198
00:51:58,280 --> 00:51:59,780
a lot of reading from our database.

1199
00:51:59,780 --> 00:52:02,180
So you might imagine that,
depending upon the web application,

1200
00:52:02,180 --> 00:52:03,596
this might be a really good model.

1201
00:52:03,596 --> 00:52:10,550
If you imagine a model like a blog or
a news website, where most of the time,

1202
00:52:10,550 --> 00:52:12,320
when someone's accessing
the news website,

1203
00:52:12,320 --> 00:52:15,500
they're just reading the stories that
are published on the news website.

1204
00:52:15,500 --> 00:52:19,170
And it's not like they're adding
new stories constantly to the--

1205
00:52:19,170 --> 00:52:21,550
there's fewer times that
people are adding stories

1206
00:52:21,550 --> 00:52:23,758
than they are reading stories,
so reads are happening

1207
00:52:23,758 --> 00:52:25,200
a lot more frequently than writes.

1208
00:52:25,200 --> 00:52:28,280
Then this might be a perfectly
acceptable model, where we just

1209
00:52:28,280 --> 00:52:30,852
have a lot of databases
that are reading but only

1210
00:52:30,852 --> 00:52:33,810
one database where we can actually
write to because that's less common.

1211
00:52:33,810 --> 00:52:37,440
But if writes were more common, this
model starts to look not as good.

1212
00:52:37,440 --> 00:52:40,830
We still have a single point of failure,
that if this database breaks down,

1213
00:52:40,830 --> 00:52:43,700
now suddenly we're not able
to do writes to the database.

1214
00:52:43,700 --> 00:52:47,460
And if a lot of people are trying to
write to the database at the same time,

1215
00:52:47,460 --> 00:52:51,290
we're not able to distribute that load
because the only place where we can do

1216
00:52:51,290 --> 00:52:56,490
writes is on this primary database over
here and not on the secondary database.

1217
00:52:56,490 --> 00:53:00,830
And so a solution to that, rather than
just using single-primary replication,

1218
00:53:00,830 --> 00:53:04,796
will be multi-primary replication,
which, as the name might suggest,

1219
00:53:04,796 --> 00:53:07,670
is where, instead of having a single
primary database and some number

1220
00:53:07,670 --> 00:53:11,900
of secondary databases, we have multiple
different primary databases where

1221
00:53:11,900 --> 00:53:14,610
for each one you can read and
write data to each of them.

1222
00:53:14,610 --> 00:53:16,970
So now it's not just reads
that can be distributed

1223
00:53:16,970 --> 00:53:18,530
across a number of different servers.

1224
00:53:18,530 --> 00:53:19,370
It's writes as well.

1225
00:53:19,370 --> 00:53:21,350
We can add rows, delete
rows, update rows

1226
00:53:21,350 --> 00:53:26,220
across any of the servers in this
multi-primary replication model.

1227
00:53:26,220 --> 00:53:27,350
So what's the catch?

1228
00:53:27,350 --> 00:53:31,310
Why might this be more of a challenge?

1229
00:53:31,310 --> 00:53:34,590
Or what are the trade-offs here?

1230
00:53:34,590 --> 00:53:38,398
AUDIENCE: It seems like if you drew
all your arrows that were updating

1231
00:53:38,398 --> 00:53:39,826
in the back-end there--

1232
00:53:39,826 --> 00:53:41,527
BRIAN YU: If we draw all the arrows--

1233
00:53:41,527 --> 00:53:45,820
AUDIENCE: So it's like, what is the
difference of having three versus one?

1234
00:53:45,820 --> 00:53:47,260
How would it start to look like?

1235
00:53:47,260 --> 00:53:47,510
BRIAN YU: Yeah.

1236
00:53:47,510 --> 00:53:47,770
Sure.

1237
00:53:47,770 --> 00:53:49,950
So, certainly, once we start
to draw all the arrows,

1238
00:53:49,950 --> 00:53:52,500
all the updates that have to happen,
updates between all the different

1239
00:53:52,500 --> 00:53:55,416
databases going in both directions--
server one needs to update server

1240
00:53:55,416 --> 00:53:57,230
three, and three needs to update one--

1241
00:53:57,230 --> 00:53:59,610
that this picture starts
to get complicated.

1242
00:53:59,610 --> 00:54:03,340
And it starts to introduce potential
problems that could come up.

1243
00:54:03,340 --> 00:54:06,515
So, certainly, one is that as we
have more and more databases, now

1244
00:54:06,515 --> 00:54:08,640
we need to have more and
more of these updates that

1245
00:54:08,640 --> 00:54:10,380
are happening with each other.

1246
00:54:10,380 --> 00:54:12,810
And what other problems
can come up now that we

1247
00:54:12,810 --> 00:54:15,450
have this, all of
these different updates

1248
00:54:15,450 --> 00:54:17,990
that are all trying
to update each other?

1249
00:54:17,990 --> 00:54:18,630
Yeah.

1250
00:54:18,630 --> 00:54:23,040
AUDIENCE: If someone is trying to write
to two databases at the same time,

1251
00:54:23,040 --> 00:54:27,064
then you might have duplicate
information that doesn't match up.

1252
00:54:27,064 --> 00:54:27,730
BRIAN YU: Great.

1253
00:54:27,730 --> 00:54:30,730
What happens if two users are
trying to edit, make updates

1254
00:54:30,730 --> 00:54:33,250
to two different databases
at the same time and they

1255
00:54:33,250 --> 00:54:36,080
both register those updates and
now try to update each other?

1256
00:54:36,080 --> 00:54:40,000
So there are a number of different ways
that these conflicts can come about.

1257
00:54:40,000 --> 00:54:42,880
One is a primary key
conflict, where imagine

1258
00:54:42,880 --> 00:54:46,570
if there are 27 users
inside of a users database,

1259
00:54:46,570 --> 00:54:51,220
and a user registers on this database,
and a user registers on this database.

1260
00:54:51,220 --> 00:54:53,950
Well, the user over here, they
get added to the users table.

1261
00:54:53,950 --> 00:54:55,330
There were 27 users before.

1262
00:54:55,330 --> 00:54:57,464
So the new user's going to be user number 28.

1263
00:54:57,464 --> 00:54:59,380
And then over here, if
the user is registering

1264
00:54:59,380 --> 00:55:02,470
at the same time, some
different user, this database

1265
00:55:02,470 --> 00:55:04,150
also sees that there are 27 users.

1266
00:55:04,150 --> 00:55:07,390
And so it's also going to add
this is now user number 28.

1267
00:55:07,390 --> 00:55:11,230
Now we have two different users
that both have ID number 28.

1268
00:55:11,230 --> 00:55:14,440
And so when all the updates happen and
they try to sync up with each other,

1269
00:55:14,440 --> 00:55:16,356
now we're going to run
into potential problems

1270
00:55:16,356 --> 00:55:18,854
because now we have two rows
that have the same ID field.

1271
00:55:18,854 --> 00:55:20,770
And that's not allowable
because our ID field,

1272
00:55:20,770 --> 00:55:23,210
presumably, is supposed to be unique.
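
The primary key conflict just described can be reproduced with a small Python sketch. It assumes, purely for illustration, that each primary picks the next ID from its own local row count; the names `PrimaryDatabase` and `find_key_conflicts` are hypothetical and not taken from any real database system.

```python
class PrimaryDatabase:
    """Hypothetical primary that assigns IDs from its own local row count."""

    def __init__(self, users):
        self.users = dict(users)  # id -> username

    def register(self, username):
        new_id = len(self.users) + 1
        self.users[new_id] = username
        return new_id


def find_key_conflicts(db_a, db_b):
    """Return the IDs that the two databases assigned to different rows."""
    return sorted(
        user_id
        for user_id in db_a.users.keys() & db_b.users.keys()
        if db_a.users[user_id] != db_b.users[user_id]
    )


existing = {i: f"user{i}" for i in range(1, 28)}  # 27 users, already in sync
db_a = PrimaryDatabase(existing)
db_b = PrimaryDatabase(existing)

id_a = db_a.register("alice")  # registers on one primary
id_b = db_b.register("bob")    # registers on the other before any sync happens
conflicts = find_key_conflicts(db_a, db_b)
```

Both registrations come back with ID 28, so when the two databases try to sync, the same primary key points at two different users.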

1273
00:55:23,210 --> 00:55:25,180
So that's one potential problem.

1274
00:55:25,180 --> 00:55:28,660
Other potential problems
include two different databases

1275
00:55:28,660 --> 00:55:31,660
trying to update the same row
at the same time, for instance.

1276
00:55:31,660 --> 00:55:35,440
If they're both trying to change the
duration of a flight, for instance,

1277
00:55:35,440 --> 00:55:37,910
and one wants to change
it to 120 minutes

1278
00:55:37,910 --> 00:55:41,830
and one is trying to change
it to 150 minutes, and now

1279
00:55:41,830 --> 00:55:44,920
which one of those databases
should we listen to?

1280
00:55:44,920 --> 00:55:47,414
And all sorts of other
problems could come up.

1281
00:55:47,414 --> 00:55:50,080
If someone tries to, for instance,
delete a row at the same time

1282
00:55:50,080 --> 00:55:52,381
that someone else is trying
to edit that same row,

1283
00:55:52,381 --> 00:55:54,880
should the edit take precedence
over the delete and keep it?

1284
00:55:54,880 --> 00:55:56,680
Or do we delete it and ignore the edit?

1285
00:55:56,680 --> 00:55:59,620
All of these are conflicts
that, ultimately, whatever

1286
00:55:59,620 --> 00:56:02,230
multi-primary replication
system you're trying to use

1287
00:56:02,230 --> 00:56:03,969
needs to have rules
for how to deal with,

1288
00:56:03,969 --> 00:56:06,760
some systematic way of saying, all
right, if these two edits happen

1289
00:56:06,760 --> 00:56:10,660
at the same time, then we
need some mechanism

1290
00:56:10,660 --> 00:56:12,370
of trying to resolve those edits.

1291
00:56:12,370 --> 00:56:15,490
Maybe if they're editing
different columns, then it's fine.

1292
00:56:15,490 --> 00:56:17,470
Just update both columns for both rows.

1293
00:56:17,470 --> 00:56:20,980
But if they're editing the
same column of the same row,

1294
00:56:20,980 --> 00:56:24,670
then maybe check the time at which it
happened and go with the more recent.
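
One possible version of the rule just described (edits to different columns are both kept, and for the same column the more recent edit wins) might look like the Python sketch below. The function name `merge_edits` and the shape of the edit dictionaries are assumptions made for the example, and the sketch also assumes that the timestamps from different servers can be compared meaningfully, which is not guaranteed in practice.

```python
def merge_edits(base_row, edit_a, edit_b):
    """Merge two concurrent edits to the same row.

    Each edit is a dict: {"time": <number>, "changes": {column: new_value}}.
    """
    merged = dict(base_row)
    # Apply the older edit first so the newer one overwrites it on shared columns.
    # If the times are equal, edit_b is applied last because sorted() is stable.
    for edit in sorted((edit_a, edit_b), key=lambda e: e["time"]):
        merged.update(edit["changes"])
    return merged


flight = {"origin": "Boston", "destination": "Paris", "duration": 100}

# Same column: the more recent edit (time 12) wins.
same_column = merge_edits(
    flight,
    {"time": 10, "changes": {"duration": 120}},
    {"time": 12, "changes": {"duration": 150}},
)

# Different columns: both edits are kept.
different_columns = merge_edits(
    flight,
    {"time": 10, "changes": {"duration": 120}},
    {"time": 12, "changes": {"destination": "London"}},
)
```

This covers only updates; a rule for an edit that races with a delete would need to be decided separately.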

1295
00:56:24,670 --> 00:56:26,920
And so there are any
number of different rules

1296
00:56:26,920 --> 00:56:30,260
that might get increasingly more complex
or sophisticated that come into play.

1297
00:56:30,260 --> 00:56:33,190
But the idea is that the
additional complexity

1298
00:56:33,190 --> 00:56:35,620
that we face with
multi-primary replication

1299
00:56:35,620 --> 00:56:38,770
is that we need some mechanism
for resolving those conflicts.

1300
00:56:38,770 --> 00:56:41,260
We need some way of saying,
if two databases that

1301
00:56:41,260 --> 00:56:44,457
are trying to perform updates and
those updates conflict with each other,

1302
00:56:44,457 --> 00:56:46,040
how should we deal with those updates?

1303
00:56:46,040 --> 00:56:48,340
And we need rules for how to do that.

1304
00:56:48,340 --> 00:56:52,960
Questions about either single-primary
or multi-primary replication or the idea

1305
00:56:52,960 --> 00:56:54,550
of database replication in general?

1306
00:56:54,550 --> 00:56:57,880


1307
00:56:57,880 --> 00:56:58,780
OK.

1308
00:56:58,780 --> 00:57:02,440
So one more topic that we'll talk about
with regards to trying to scale up data

1309
00:57:02,440 --> 00:57:03,220
is caching.

1310
00:57:03,220 --> 00:57:06,370
And this is something that will
become-- that's very useful as data

1311
00:57:06,370 --> 00:57:07,330
begins to scale.

1312
00:57:07,330 --> 00:57:12,529
And this is all about trying to avoid
needing to spend too much time doing

1313
00:57:12,529 --> 00:57:13,820
things that we've already done.

1314
00:57:13,820 --> 00:57:16,720
So the idea of caching is
taking data and information

1315
00:57:16,720 --> 00:57:19,960
and storing it in some
temporary place for usage later.

1316
00:57:19,960 --> 00:57:23,410
You might imagine that on
The New York Times website,

1317
00:57:23,410 --> 00:57:25,990
for instance, the home
page of The New York Times

1318
00:57:25,990 --> 00:57:29,817
probably isn't changing too much
from one second to the next second.

1319
00:57:29,817 --> 00:57:32,400
Sure, after some number of
minutes, a new article might go up,

1320
00:57:32,400 --> 00:57:34,180
and the front page might change.

1321
00:57:34,180 --> 00:57:36,190
But if you load the
page and then refresh

1322
00:57:36,190 --> 00:57:38,430
the page, the page that
you get again, it's

1323
00:57:38,430 --> 00:57:41,500
probably going to be the exact
same page in all likelihood.

1324
00:57:41,500 --> 00:57:44,080
And it probably wouldn't
make a whole lot of sense,

1325
00:57:44,080 --> 00:57:47,410
then, for every time someone tries to
request the front page of The New York

1326
00:57:47,410 --> 00:57:50,650
Times, for The New York
Times to go to its database

1327
00:57:50,650 --> 00:57:55,330
and look up what the most popular recent
articles are and look up the latest

1328
00:57:55,330 --> 00:57:58,627
images and what the trending news is
and then regenerate that whole page

1329
00:57:58,627 --> 00:58:00,460
and then present it
back to you because it's

1330
00:58:00,460 --> 00:58:02,860
going to have to do that for you
every time you make a request

1331
00:58:02,860 --> 00:58:04,693
and do it for every
other user who is trying

1332
00:58:04,693 --> 00:58:07,550
to access the front page of The
New York Times every single time.

1333
00:58:07,550 --> 00:58:10,720
And so what might be a good idea is
introducing some idea of caching--

1334
00:58:10,720 --> 00:58:15,850
the idea of saving what the
front page looked like such that

1335
00:58:15,850 --> 00:58:18,226
if a user comes back
in a couple seconds,

1336
00:58:18,226 --> 00:58:20,350
and the page hasn't changed,
then go ahead and just

1337
00:58:20,350 --> 00:58:22,460
present the same page, for instance.
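
As a rough sketch of that idea in Python: the page is rebuilt at most once per short time window, and every request inside the window gets the saved copy. The names `PageCache` and `render_front_page` are invented for the example, and the sketch simply assumes the page has not changed within the window rather than checking.

```python
import time


class PageCache:
    """Keep a rendered page for a short time instead of rebuilding it per request."""

    def __init__(self, render, ttl_seconds, clock=time.monotonic):
        self._render = render      # expensive function that builds the page
        self._ttl = ttl_seconds    # how long a saved copy may be reused
        self._clock = clock        # injectable so the behavior can be tested
        self._page = None
        self._saved_at = None

    def get(self):
        now = self._clock()
        if self._saved_at is None or now - self._saved_at >= self._ttl:
            self._page = self._render()
            self._saved_at = now
        return self._page


build_log = []


def render_front_page():
    # Stand-in for the database queries and templating a real site would do.
    build_log.append("built")
    return f"<h1>Front page, build {len(build_log)}</h1>"


front_page = PageCache(render_front_page, ttl_seconds=5)
first = front_page.get()
second = front_page.get()  # served from the cache, no second build
```

The two requests return the same page, and the expensive render function runs only once.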

1338
00:58:22,460 --> 00:58:26,226
So there are multiple different
ways by which caching can happen.

1339
00:58:26,226 --> 00:58:28,600
Caching can exist on the client
side and the server side,

1340
00:58:28,600 --> 00:58:29,891
and we'll look at both of them.

1341
00:58:29,891 --> 00:58:31,600
And we'll start with
client-side caching.

1342
00:58:31,600 --> 00:58:34,390
And this is something you might
already have some familiarity with.

1343
00:58:34,390 --> 00:58:37,300
That if you've been working with
JavaScript files in project two,

1344
00:58:37,300 --> 00:58:40,210
for instance, and you've made
edits to your JavaScript file,

1345
00:58:40,210 --> 00:58:43,880
and then you check your web
page, what sometimes happens?

1346
00:58:43,880 --> 00:58:44,380
Yeah.

1347
00:58:44,380 --> 00:58:46,170
AUDIENCE: You still
get the old JavaScript.

1348
00:58:46,170 --> 00:58:47,400
BRIAN YU: You still get
the old JavaScript file.

1349
00:58:47,400 --> 00:58:47,660
Right.

1350
00:58:47,660 --> 00:58:50,080
Even though you've made
changes to your JavaScript file

1351
00:58:50,080 --> 00:58:53,409
and you've saved those changes, when on
your web browser you refresh the page

1352
00:58:53,409 --> 00:58:56,200
or go back to that page that's
supposed to have the new JavaScript,

1353
00:58:56,200 --> 00:58:58,390
you still get the old
JavaScript because--

1354
00:58:58,390 --> 00:59:00,640
and the reason for that
is client-side caching.

1355
00:59:00,640 --> 00:59:03,310
Your web browser has saved
the old JavaScript file,

1356
00:59:03,310 --> 00:59:06,520
and it's just assuming that that
file probably hasn't changed.

1357
00:59:06,520 --> 00:59:10,450
And therefore, rather than go
through the additional time

1358
00:59:10,450 --> 00:59:13,510
expense of ask the server to
send me the JavaScript file,

1359
00:59:13,510 --> 00:59:16,090
get the JavaScript file back,
it's just looking locally

1360
00:59:16,090 --> 00:59:18,580
to its own cache, which
is faster to access

1361
00:59:18,580 --> 00:59:21,260
and just using that
JavaScript file instead.

1362
00:59:21,260 --> 00:59:24,799
And so while that might be an annoying
use case of caching, in practice

1363
00:59:24,799 --> 00:59:26,590
it's actually quite
helpful if we ever want

1364
00:59:26,590 --> 00:59:30,010
to have some resource that is going
to persist for some amount of time,

1365
00:59:30,010 --> 00:59:34,034
something that we want to
be kept inside of the cache,

1366
00:59:34,034 --> 00:59:36,200
because it's probably not
going to change too often.

1367
00:59:36,200 --> 00:59:40,300
And so inside of an HTTP
response, when the web server

1368
00:59:40,300 --> 00:59:43,900
responds back to the user and presents
it with the body of the response,

1369
00:59:43,900 --> 00:59:46,180
it contains the page to actually load.

1370
00:59:46,180 --> 00:59:51,672
The server also responds with HTTP
headers, information about the response

1371
00:59:51,672 --> 00:59:53,380
that the client web
browser, whether it's

1372
00:59:53,380 --> 00:59:56,200
Chrome or Safari or something
else, knows how to interpret

1373
00:59:56,200 --> 00:59:57,520
and knows how to understand.

1374
00:59:57,520 --> 01:00:01,750
And so one of those headers might
be this cache-control header.

1375
01:00:01,750 --> 01:00:04,630
And what the cache-control
HTTP header is allowed to do

1376
01:00:04,630 --> 01:00:10,310
is it's allowed to set in the most
basic case a maximum age for the page.

1377
01:00:10,310 --> 01:00:13,840
In other words, specify after
this number of seconds--

1378
01:00:13,840 --> 01:00:16,570
or in this case, one day, I believe--

1379
01:00:16,570 --> 01:00:20,230
that's when you should, if
you're requesting the page again,

1380
01:00:20,230 --> 01:00:21,940
actually see if something has changed.

1381
01:00:21,940 --> 01:00:24,070
But within this amount of
time, the page probably

1382
01:00:24,070 --> 01:00:28,570
hasn't changed, so don't worry about
trying to access it again if you're

1383
01:00:28,570 --> 01:00:30,250
trying to load the same page again.

1384
01:00:30,250 --> 01:00:32,350
Just use the cached version of it.

1385
01:00:32,350 --> 01:00:36,130
And so by putting a line like this
inside of your HTTP header-- and web

1386
01:00:36,130 --> 01:00:38,320
frameworks like Flask
and Django have ways

1387
01:00:38,320 --> 01:00:42,466
of allowing you to edit what goes into
the header, and you can set those.

1388
01:00:42,466 --> 01:00:44,590
And you can look at Flask
or Django's documentation

1389
01:00:44,590 --> 01:00:46,000
for looking at how to do that.

1390
01:00:46,000 --> 01:00:48,130
But you can say to the
web browser, go ahead

1391
01:00:48,130 --> 01:00:51,160
and save this page in the
cache for a day or so such that

1392
01:00:51,160 --> 01:00:54,310
if you come back in a couple hours,
no need to contact the server again,

1393
01:00:54,310 --> 01:00:57,460
which might add additional load to
the server when it's unnecessary.

1394
01:00:57,460 --> 01:01:00,740
Just go ahead and load that same page.
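
To make the maximum-age idea concrete, here is a simplified Python sketch of the decision a browser makes for a header value such as `max-age=86400` (86,400 seconds is one day). The function names are made up for the example, and real browsers also honor other Cache-Control directives that this sketch ignores.

```python
def max_age_seconds(cache_control):
    """Extract max-age from a Cache-Control header value, or None if absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def can_reuse_cached_copy(cache_control, age_seconds):
    """True if a copy stored age_seconds ago is still within max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age


header = "max-age=86400"  # 86400 seconds = one day

after_two_hours = can_reuse_cached_copy(header, 2 * 60 * 60)
after_two_days = can_reuse_cached_copy(header, 2 * 24 * 60 * 60)
```

A copy that is two hours old can be reused without contacting the server, while a copy that is two days old cannot.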

1395
01:01:00,740 --> 01:01:04,820
So what problems can happen here with
cache, with caching on the client side,

1396
01:01:04,820 --> 01:01:06,600
having the web browser cache the page?

1397
01:01:06,600 --> 01:01:12,200


1398
01:01:12,200 --> 01:01:12,700
Yeah.

1399
01:01:12,700 --> 01:01:15,340
AUDIENCE: If the page changes
sooner, then you wouldn't know.

1400
01:01:15,340 --> 01:01:15,520
BRIAN YU: Yeah.

1401
01:01:15,520 --> 01:01:16,020
Sure.

1402
01:01:16,020 --> 01:01:19,420
Certainly, if the page changes
sooner than this amount of time, then

1403
01:01:19,420 --> 01:01:21,429
when the user tries to
go back to that page,

1404
01:01:21,429 --> 01:01:24,220
there's a good chance that they're
still going to see the old page,

1405
01:01:24,220 --> 01:01:26,761
that they'll get the old page,
and it will load very quickly.

1406
01:01:26,761 --> 01:01:29,489
But whatever the newer version
was, they're not going to see it.

1407
01:01:29,489 --> 01:01:31,280
And, certainly, there
are ways around this.

1408
01:01:31,280 --> 01:01:34,270
You can hard refresh the page, which
usually is going to try and clear

1409
01:01:34,270 --> 01:01:35,410
the cache and just say--

1410
01:01:35,410 --> 01:01:38,440
really try and access the page by
actually talking to the server.

1411
01:01:38,440 --> 01:01:41,720
And so that's something
you can do as well.

1412
01:01:41,720 --> 01:01:45,130
But what about the case
where maybe this is saying

1413
01:01:45,130 --> 01:01:46,780
you can cache the page for a day?

1414
01:01:46,780 --> 01:01:49,600
What if the next day
the page hasn't changed

1415
01:01:49,600 --> 01:01:52,060
or three days later the
page still hasn't changed?

1416
01:01:52,060 --> 01:01:55,900
Under this model, we would still go
back to the server and say, a day's up,

1417
01:01:55,900 --> 01:01:58,510
so it means that I can't
use the cached page anymore.

1418
01:01:58,510 --> 01:02:00,580
I need to go to the server
and ask for it again.

1419
01:02:00,580 --> 01:02:02,330
But imagine if it's some big file.

1420
01:02:02,330 --> 01:02:06,740
It's a video or some other large
file that might take a long time.

1421
01:02:06,740 --> 01:02:08,684
It wouldn't make a
whole lot of sense if we

1422
01:02:08,684 --> 01:02:10,600
were trying to redownloaded
it again and again

1423
01:02:10,600 --> 01:02:12,580
and again just because the cache was up.

1424
01:02:12,580 --> 01:02:16,450
And so what's a way that
we might be able to enforce

1425
01:02:16,450 --> 01:02:22,720
this idea that the server can send
new data if there's been a change

1426
01:02:22,720 --> 01:02:24,631
but doesn't need to?

1427
01:02:24,631 --> 01:02:25,130
Yeah.

1428
01:02:25,130 --> 01:02:29,507
AUDIENCE: Have some ID that every time
the server makes a change that they

1429
01:02:29,507 --> 01:02:33,220
can, for example, increment that ID
and then just use the headers first

1430
01:02:33,220 --> 01:02:37,254
to see if you have-- if your headers
match up, then don't [INAUDIBLE].

1431
01:02:37,254 --> 01:02:37,920
BRIAN YU: Great.

1432
01:02:37,920 --> 01:02:40,290
So we can use some kind of
identifier, some identifier that's

1433
01:02:40,290 --> 01:02:43,080
associated with the resource,
whether it's a web page or a video

1434
01:02:43,080 --> 01:02:46,170
or something else, where any
time that resource is updated,

1435
01:02:46,170 --> 01:02:47,460
we update that header.

1436
01:02:47,460 --> 01:02:50,730
And in HTTP, we call that
header an ETag, or an entity

1437
01:02:50,730 --> 01:02:54,330
tag, which can just be a really long
hexadecimal sequence, a sequence

1438
01:02:54,330 --> 01:02:57,570
of numbers and characters,
where that is going

1439
01:02:57,570 --> 01:03:01,272
to be uniquely associated with a
particular version of a resource.

1440
01:03:01,272 --> 01:03:02,730
That if the resource gets updated--

1441
01:03:02,730 --> 01:03:04,620
I update the page, or
I update the video--

1442
01:03:04,620 --> 01:03:06,760
then this ETag is going to change.

1443
01:03:06,760 --> 01:03:10,670
And so now how can we use
the ETag to implement caching

1444
01:03:10,670 --> 01:03:12,420
or to implement the
idea that I don't need

1445
01:03:12,420 --> 01:03:15,510
to redownload the page every
time because of this ETag?

1446
01:03:15,510 --> 01:03:16,350
What can I do?

1447
01:03:16,350 --> 01:03:16,850
Yeah.

1448
01:03:16,850 --> 01:03:20,203
AUDIENCE: Every time you're doing a
GET request to a server, send the ETag

1449
01:03:20,203 --> 01:03:23,556
that you have, and then the
server, if it matches up,

1450
01:03:23,556 --> 01:03:25,472
it'll tell you like, no need to reload.

1451
01:03:25,472 --> 01:03:27,484
Otherwise, it will
send you the new file.

1452
01:03:27,484 --> 01:03:28,150
BRIAN YU: Great.

1453
01:03:28,150 --> 01:03:31,420
So when the user is trying
to request the page,

1454
01:03:31,420 --> 01:03:34,780
the user can send along the ETag with
the request, say, here is the ETag,

1455
01:03:34,780 --> 01:03:37,270
here's the version of
the resource that I have.

1456
01:03:37,270 --> 01:03:39,454
And the server can look
at that ETag and say,

1457
01:03:39,454 --> 01:03:41,870
does this match up with the
latest version of the resource

1458
01:03:41,870 --> 01:03:43,310
that I have on the server?

1459
01:03:43,310 --> 01:03:45,100
And if it does match
up, then rather than

1460
01:03:45,100 --> 01:03:50,120
send the whole contents of the page,
rather than send the whole video again,

1461
01:03:50,120 --> 01:03:53,620
the server can just respond
with usually a 305 status code--

1462
01:03:53,620 --> 01:03:56,290
304, which stands for not
modified-- just to say there's

1463
01:03:56,290 --> 01:03:58,780
been no change to the content
you're trying to request.

1464
01:03:58,780 --> 01:04:00,820
Go ahead and just use
your cached version.

1465
01:04:00,820 --> 01:04:03,730
It's still fresh, and it's
not stale, as we'll often

1466
01:04:03,730 --> 01:04:05,590
call it with regards to caching.

1467
01:04:05,590 --> 01:04:08,290
And the result of that is the
response can happen quickly.

1468
01:04:08,290 --> 01:04:10,150
The server doesn't
have to get too loaded.

1469
01:04:10,150 --> 01:04:12,499
And the client can know
the ETag is the same.

1470
01:04:12,499 --> 01:04:13,540
The resource is the same.

1471
01:04:13,540 --> 01:04:15,640
I can just use the version in the cache.

1472
01:04:15,640 --> 01:04:17,860
Of course, on the flip
side of things, if the user

1473
01:04:17,860 --> 01:04:21,110
sends along an ETag with the request
saying, I'm requesting this page,

1474
01:04:21,110 --> 01:04:24,220
here's the ETag of the last
time that I visited this page,

1475
01:04:24,220 --> 01:04:27,640
if the page has changed and the server
detects that, OK, wait a minute,

1476
01:04:27,640 --> 01:04:31,630
this ETag is different from the ETag
of the latest version of the resource,

1477
01:04:31,630 --> 01:04:34,720
now the server knows
that we need to give

1478
01:04:34,720 --> 01:04:36,520
the user a fresh copy of that resource.

1479
01:04:36,520 --> 01:04:39,490
And the server can now do that
processing, get the resource,

1480
01:04:39,490 --> 01:04:40,880
and deliver it to the user.
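[Editor's sketch] The ETag check described above could look roughly like this on the server. It is a minimal illustration, not the course's code: `make_etag` and `respond` are hypothetical names, the ETag is assumed to be a hash of the content, and `if_none_match` stands for the value of the standard `If-None-Match` request header that the browser sends.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive an ETag from the content, so any change to the body changes the tag."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]):
    """Return (status, headers, payload) for a request that may carry an ETag."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's copy matches the latest version: 304 Not Modified, no body.
        return 304, {"ETag": etag}, b""
    # No ETag was sent, or it is out of date: send the full resource and its tag.
    return 200, {"ETag": etag}, body
```

The first visit gets a 200 with the full body and an ETag; a repeat visit that sends the same ETag gets an empty 304, and a visit after the content changes gets a fresh 200.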

1481
01:04:40,880 --> 01:04:43,540
And so this client-side caching
serves two benefits, really.

1482
01:04:43,540 --> 01:04:48,370
Number one is that it's faster for the
user, that from the user perspective

1483
01:04:48,370 --> 01:04:50,429
they can often see the
resource load faster

1484
01:04:50,429 --> 01:04:52,720
because it's loading from
their own computer as opposed

1485
01:04:52,720 --> 01:04:56,045
to having to be transferred over the
internet from one server to the client.

1486
01:04:56,045 --> 01:04:58,420
And on the other side, it
helps from the load perspective

1487
01:04:58,420 --> 01:05:01,544
that if you have hundreds of users who
are all trying to access your server

1488
01:05:01,544 --> 01:05:03,320
and access your database
at the same time,

1489
01:05:03,320 --> 01:05:05,920
any time you can tell some
subset of those users,

1490
01:05:05,920 --> 01:05:08,470
you don't really need to access
the server or the database,

1491
01:05:08,470 --> 01:05:10,910
you can just use a version
of the site you already have,

1492
01:05:10,910 --> 01:05:12,040
that's going to be a benefit to you.

1493
01:05:12,040 --> 01:05:13,930
That's going to be less
load on your website.

1494
01:05:13,930 --> 01:05:17,710
That's going to help you as you
think about scaling your website.

1495
01:05:17,710 --> 01:05:19,960
Questions about client-side caching?

1496
01:05:19,960 --> 01:05:22,960


1497
01:05:22,960 --> 01:05:23,460
All right.

1498
01:05:23,460 --> 01:05:25,500
So let's talk about
server-side caching, which

1499
01:05:25,500 --> 01:05:27,120
is another place where caches can be.

1500
01:05:27,120 --> 01:05:29,850
And caches can exist all
throughout this entire process,

1501
01:05:29,850 --> 01:05:31,290
whether they're large or small.

1502
01:05:31,290 --> 01:05:33,020
They can be located in different places.

1503
01:05:33,020 --> 01:05:37,830
And one thing we didn't mention
was that if you have a cache that

1504
01:05:37,830 --> 01:05:40,230
maybe works for your entire network--

1505
01:05:40,230 --> 01:05:42,550
actually, we'll talk
about that one more time.

1506
01:05:42,550 --> 01:05:47,394
So imagine that you have some cache
that's working for your local network,

1507
01:05:47,394 --> 01:05:49,560
for instance, your computer
and other computers that

1508
01:05:49,560 --> 01:05:52,830
are all connected to the same network.

1509
01:05:52,830 --> 01:05:58,080
What could go wrong with something
like cache-control or the ETag

1510
01:05:58,080 --> 01:06:02,628
where you might not want
for the page to be cached?

1511
01:06:02,628 --> 01:06:03,765
AUDIENCE: Security.

1512
01:06:03,765 --> 01:06:05,140
BRIAN YU: Security reasons, sure.

1513
01:06:05,140 --> 01:06:08,200
You might imagine that there's a
difference between public websites

1514
01:06:08,200 --> 01:06:12,910
and private websites or private pages,
that if facebook.com, for instance,

1515
01:06:12,910 --> 01:06:15,130
were something that were
just consistently cached,

1516
01:06:15,130 --> 01:06:19,870
then if I visited Facebook and
saw my news feed and it was cached,

1517
01:06:19,870 --> 01:06:23,260
then I wouldn't want it for-- if someone
else on my network or someone else

1518
01:06:23,260 --> 01:06:26,469
using my computer were to also
go to Facebook, if they were to--

1519
01:06:26,469 --> 01:06:29,260
on their account, if they were to
also see the same content of what

1520
01:06:29,260 --> 01:06:31,301
I just saw because it was
pulling from the cache.

1521
01:06:31,301 --> 01:06:34,150
And so inside this cache-control
header, you additionally

1522
01:06:34,150 --> 01:06:36,969
have the option of specifying do
I want this to be a public cache,

1523
01:06:36,969 --> 01:06:39,260
meaning a page that anyone
can see, or a private cache,

1524
01:06:39,260 --> 01:06:40,600
which should be authenticated.

1525
01:06:40,600 --> 01:06:43,780
And so there are additional settings
inside the cache-control that you

1526
01:06:43,780 --> 01:06:46,270
can set in order to make sure
the cache is behaving the way

1527
01:06:46,270 --> 01:06:47,170
that you want it to behave.

1528
01:06:47,170 --> 01:06:49,086
We won't go into too
many of the details here.

1529
01:06:49,086 --> 01:06:52,570
But know that you have that kind of
control and flexibility over the cache

1530
01:06:52,570 --> 01:06:55,870
just by setting it inside of
the headers of the HTTP response

1531
01:06:55,870 --> 01:06:57,930
that you're sending back to the user.
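[Editor's sketch] As a rough illustration of the public/private setting, the header value could be built like this. The helper name `cache_control` and its parameters are hypothetical; `public`, `private`, and `max-age` are standard Cache-Control directives.

```python
def cache_control(max_age_seconds: int, shared: bool) -> str:
    """Build a Cache-Control header value.

    shared=True  -> "public": any cache, including a network-wide one, may store it.
    shared=False -> "private": only the individual user's browser should store it.
    """
    visibility = "public" if shared else "private"
    return f"{visibility}, max-age={max_age_seconds}"


# A page everyone sees identically versus a personalized news feed.
home_page_headers = {"Cache-Control": cache_control(86400, shared=True)}
news_feed_headers = {"Cache-Control": cache_control(60, shared=False)}
```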

1532
01:06:57,930 --> 01:06:59,820
But back to server-side caching.

1533
01:06:59,820 --> 01:07:01,570
So the idea of server-side
caching now is,

1534
01:07:01,570 --> 01:07:04,300
instead of having the
cache stored locally

1535
01:07:04,300 --> 01:07:07,090
on the computer of the user
inside of Chrome or Safari

1536
01:07:07,090 --> 01:07:10,090
or whatever web browser they're
using, we can add to our model

1537
01:07:10,090 --> 01:07:12,760
here, where in addition to
having servers that are all

1538
01:07:12,760 --> 01:07:15,040
talking to the database,
have all the servers that

1539
01:07:15,040 --> 01:07:16,446
are now connected to the cache.

1540
01:07:16,446 --> 01:07:19,570
Now, of course, we have a whole bunch
of new single points of failure here.

1541
01:07:19,570 --> 01:07:23,200
Our database is a single point
of failure, and so is our cache.

1542
01:07:23,200 --> 01:07:26,812
But why might we want to
add a cache to this image?

1543
01:07:26,812 --> 01:07:28,270
It certainly complicates the image.

1544
01:07:28,270 --> 01:07:30,890
But what benefit do we get from it?

1545
01:07:30,890 --> 01:07:31,544
Yeah, sure.

1546
01:07:31,544 --> 01:07:34,508
AUDIENCE: So if something is common,
then you still write to the cache

1547
01:07:34,508 --> 01:07:37,142
so you don't have to hit the
database because the-- well,

1548
01:07:37,142 --> 01:07:39,584
the cache is faster than the database.

1549
01:07:39,584 --> 01:07:40,250
BRIAN YU: Great.

1550
01:07:40,250 --> 01:07:42,110
So the cache is likely
going to be faster

1551
01:07:42,110 --> 01:07:44,030
than the database in
certain respects, usually

1552
01:07:44,030 --> 01:07:46,760
because if we're trying to
render something complicated,

1553
01:07:46,760 --> 01:07:49,430
like the front page
of The New York Times

1554
01:07:49,430 --> 01:07:53,870
or you imagine Amazon has a page
that shows the most popular books,

1555
01:07:53,870 --> 01:07:57,260
it might take a fair amount of
energy and computational resources

1556
01:07:57,260 --> 01:07:59,550
to query the most popular
books from the database.

1557
01:07:59,550 --> 01:08:00,050
Right?

1558
01:08:00,050 --> 01:08:01,883
There's some algorithm
involved whereby it's

1559
01:08:01,883 --> 01:08:04,764
going to look at books that have
been purchased frequently recently.

1560
01:08:04,764 --> 01:08:07,430
So it might need to look for
orders and look at different books,

1561
01:08:07,430 --> 01:08:09,138
and it might be multiple
different tables

1562
01:08:09,138 --> 01:08:12,890
that have to be queried in order to get
what are the top 10 most popular books.

1563
01:08:12,890 --> 01:08:15,500
And that's going to be an
expensive database operation.

1564
01:08:15,500 --> 01:08:19,119
Whereas once you've gotten those 10
most popular books the first time,

1565
01:08:19,119 --> 01:08:21,410
one thing you can do is just
take all that information,

1566
01:08:21,410 --> 01:08:25,189
put it inside the cache, and store it
there, whereby on future instances,

1567
01:08:25,189 --> 01:08:28,596
if a user comes by within the
next couple seconds and says,

1568
01:08:28,596 --> 01:08:30,470
I want to see the Amazon
home page and I want

1569
01:08:30,470 --> 01:08:32,729
to see what the 10
most popular books are,

1570
01:08:32,729 --> 01:08:35,479
rather than repeat those queries
again and go back to the database

1571
01:08:35,479 --> 01:08:37,939
and query that information,
we can just look to the cache

1572
01:08:37,939 --> 01:08:41,840
where we've stored potentially in a file
somewhere here are the 10 most popular

1573
01:08:41,840 --> 01:08:42,620
books.

1574
01:08:42,620 --> 01:08:44,720
And we can just use
that cached information

1575
01:08:44,720 --> 01:08:47,977
to display that information
back to the user.

1576
01:08:47,977 --> 01:08:49,310
So what drawbacks come up there?

1577
01:08:49,310 --> 01:08:51,309
What are the trade-offs
we face when we do that?

1578
01:08:51,309 --> 01:08:54,542


1579
01:08:54,542 --> 01:08:57,084
AUDIENCE: You need to take care
of when to update the cache.

1580
01:08:57,084 --> 01:08:57,750
BRIAN YU: Great.

1581
01:08:57,750 --> 01:08:59,609
So any time we're
dealing with the cache,

1582
01:08:59,609 --> 01:09:01,979
we always have these issues
of cache invalidation.

1583
01:09:01,979 --> 01:09:05,430
What happens when data
inside of the database

1584
01:09:05,430 --> 01:09:07,979
is more recent than data
inside of the cache?

1585
01:09:07,979 --> 01:09:09,670
How do we deal with that type of thing?

1586
01:09:09,670 --> 01:09:11,350
And so multiple ways
that we could do that.

1587
01:09:11,350 --> 01:09:12,933
How could we deal with that situation?

1588
01:09:12,933 --> 01:09:16,439
What are some ideas for
how to deal with a problem

1589
01:09:16,439 --> 01:09:20,069
where maybe the 10 most
recent, most popular books

1590
01:09:20,069 --> 01:09:23,880
are no longer valid because a bunch
of people bought book number 11,

1591
01:09:23,880 --> 01:09:26,340
and now that's the new
10th most popular book?

1592
01:09:26,340 --> 01:09:30,170
And so what's in the
cache is no longer valid.

1593
01:09:30,170 --> 01:09:30,670
Strategies?

1594
01:09:30,670 --> 01:09:31,420
There are multiple.

1595
01:09:31,420 --> 01:09:31,920
Yeah.

1596
01:09:31,920 --> 01:09:34,636
AUDIENCE: So you don't care
about the recent database.

1597
01:09:34,636 --> 01:09:36,594
But if someone writes in
the database, then you

1598
01:09:36,594 --> 01:09:38,170
can update the cache [INAUDIBLE].

1599
01:09:38,170 --> 01:09:38,439
BRIAN YU: Great.

1600
01:09:38,439 --> 01:09:41,630
So we could add logic that says that
when we're writing to the database,

1601
01:09:41,630 --> 01:09:45,490
if we place a new order, then we should
also make sure that the cache gets

1602
01:09:45,490 --> 01:09:48,850
updated, that we invalidate any old
information in the cache, get rid of it

1603
01:09:48,850 --> 01:09:51,850
such that the next time the user makes
a request, we're doing that anew.

1604
01:09:51,850 --> 01:09:54,460


1605
01:09:54,460 --> 01:09:57,247
So depending on the system
that you have and depending

1606
01:09:57,247 --> 01:10:00,580
on what types of reads and writes you're
doing, that may or may not be feasible.

1607
01:10:00,580 --> 01:10:02,699
In the case of 10 most
popular books, you probably

1608
01:10:02,699 --> 01:10:04,740
don't want it such that
any time anyone purchases

1609
01:10:04,740 --> 01:10:07,939
a book that we're invalidating the
cache of what the 10 most popular are.

1610
01:10:07,939 --> 01:10:09,730
But, certainly, you
can think of heuristics

1611
01:10:09,730 --> 01:10:12,910
that we might employ in order to
help make that process easier.

1612
01:10:12,910 --> 01:10:16,120
How else might we implement
cache invalidation, this idea

1613
01:10:16,120 --> 01:10:20,499
that if we have data in the cache,
then at some point that data

1614
01:10:20,499 --> 01:10:21,790
is no longer going to be valid?

1615
01:10:21,790 --> 01:10:26,263


1616
01:10:26,263 --> 01:10:29,742
AUDIENCE: Couldn't you do a
similar thing with the ID?

1617
01:10:29,742 --> 01:10:33,718
Like, the cache could store an
ID for a particular set of data.

1618
01:10:33,718 --> 01:10:37,694
And so then when somebody
requests that data from the cache,

1619
01:10:37,694 --> 01:10:41,774
it checks the database to see if
it needs to get an updated version.

1620
01:10:41,774 --> 01:10:42,440
BRIAN YU: Great.

1621
01:10:42,440 --> 01:10:46,282
So we could have some mechanism via
which the cache is checking the data--

1622
01:10:46,282 --> 01:10:47,990
or we have something
in the server that's

1623
01:10:47,990 --> 01:10:51,320
checking the database to see do we
actually need to check the database.

1624
01:10:51,320 --> 01:10:54,529
Or can we just go from the cache by
having some identifier that updates,

1625
01:10:54,529 --> 01:10:55,070
for instance?

1626
01:10:55,070 --> 01:10:58,230
And maybe that operation is slightly
less expensive on the database.

1627
01:10:58,230 --> 01:11:00,200
So it's not needing to
perform that full query,

1628
01:11:00,200 --> 01:11:01,670
but we still do a quick
check to see if there's

1629
01:11:01,670 --> 01:11:03,150
anything we might need to change.

1630
01:11:03,150 --> 01:11:04,816
Otherwise, we can still go to the cache.

1631
01:11:04,816 --> 01:11:06,108
That's certainly an option too.

1632
01:11:06,108 --> 01:11:08,607
And there are many other ways
that we could potentially deal

1633
01:11:08,607 --> 01:11:10,430
with the problem of cache invalidation.

1634
01:11:10,430 --> 01:11:13,160
One common way is just to
effectively ignore the problem.

1635
01:11:13,160 --> 01:11:16,370
Set an expiration time on the cache
and say, the 10 most popular books,

1636
01:11:16,370 --> 01:11:19,090
this will expire after
12 hours, for instance.

1637
01:11:19,090 --> 01:11:22,910
And it's probably not a big deal
if a new book comes in the top 10,

1638
01:11:22,910 --> 01:11:24,260
and it's not updated right away.

1639
01:11:24,260 --> 01:11:26,870
If you don't care that much and
you're OK with the cache being

1640
01:11:26,870 --> 01:11:29,060
a little bit out of date,
then that's OK so long

1641
01:11:29,060 --> 01:11:31,880
as you have some sort of expiration
time on the cache to say,

1642
01:11:31,880 --> 01:11:34,370
after X number of
minutes or hours or days,

1643
01:11:34,370 --> 01:11:37,430
then we should invalidate it and
then check the database again

1644
01:11:37,430 --> 01:11:39,260
to see what the latest information is.
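[Editor's sketch] The two strategies discussed here, expiring entries after a fixed time and invalidating on write, can be combined in one small in-memory cache. This is a sketch under assumptions: `TTLCache`, `get_or_compute`, and `invalidate` are hypothetical names, and `compute` stands in for the expensive database query.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.store = {}     # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        """Return a fresh cached value, or run compute() and cache its result."""
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database work needed
        value = compute()    # missing or expired: run the expensive query
        self.store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry early, e.g. right after a write that changes the result."""
        self.store.pop(key, None)
```

With a 12-hour expiration, `TTLCache(12 * 60 * 60)` would rerun the top-10 query at most twice a day, while code that places an order could call `invalidate("top10")` when staleness matters more.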

1645
01:11:39,260 --> 01:11:42,390
And so the big takeaway
from all of this is--

1646
01:11:42,390 --> 01:11:46,790
whether we're talking about caching or
whether we're talking about database

1647
01:11:46,790 --> 01:11:51,350
scalability in terms of partitioning it
or replicating it into different places

1648
01:11:51,350 --> 01:11:54,450
or we're thinking about how we're
going to load balance our servers,

1649
01:11:54,450 --> 01:11:58,550
whether we're using vertical, whether
we're expanding them vertically

1650
01:11:58,550 --> 01:12:01,592
or scaling them horizontally or some
combination of all of these things--

1651
01:12:01,592 --> 01:12:03,883
that there are trade-offs
with each of those decisions.

1652
01:12:03,883 --> 01:12:07,190
And we have to decide whether, based
on the needs of our particular system,

1653
01:12:07,190 --> 01:12:09,740
based on the needs of our particular
web application, whether there are more

1654
01:12:09,740 --> 01:12:12,073
writes than there are reads
and what sorts of operations

1655
01:12:12,073 --> 01:12:14,920
are commonly happening, that we
need to make these design decisions.

1656
01:12:14,920 --> 01:12:18,800
And so one of the goals today
was really to get across the idea

1657
01:12:18,800 --> 01:12:21,290
that, with all of these
different moving parts,

1658
01:12:21,290 --> 01:12:24,680
we can be thinking critically about
what design decisions we make,

1659
01:12:24,680 --> 01:12:26,930
how we choose to design
our system in a way such

1660
01:12:26,930 --> 01:12:31,280
that it is scalable based on the
specific needs of our system or our web

1661
01:12:31,280 --> 01:12:32,510
application.

1662
01:12:32,510 --> 01:12:35,152
Questions about any of
those things so far?

1663
01:12:35,152 --> 01:12:36,015
Yeah.

1664
01:12:36,015 --> 01:12:36,640
AUDIENCE: Yeah.

1665
01:12:36,640 --> 01:12:38,624
I have a question about,
basically, the cache.

1666
01:12:38,624 --> 01:12:41,104
Shouldn't the cache be
memory on the server?

1667
01:12:41,104 --> 01:12:44,149
Or is it actually its
own hardware [INAUDIBLE]?

1668
01:12:44,149 --> 01:12:45,190
BRIAN YU: Great question.

1669
01:12:45,190 --> 01:12:47,190
So what form does the
cache actually take?

1670
01:12:47,190 --> 01:12:49,010
So, certainly, there are
a bunch of different forms

1671
01:12:49,010 --> 01:12:50,051
that that cache can take.

1672
01:12:50,051 --> 01:12:52,289
And oftentimes we might
have a smaller cache

1673
01:12:52,289 --> 01:12:55,080
that's actually physically located
on the server, where we wouldn't

1674
01:12:55,080 --> 01:12:57,270
need to talk to something external.

1675
01:12:57,270 --> 01:13:00,850
But there are other cases where
you might want an external cache.

1676
01:13:00,850 --> 01:13:04,530
Well, what's one benefit that an
external cache does give you instead

1677
01:13:04,530 --> 01:13:07,990
of just storing a cache on one server?

1678
01:13:07,990 --> 01:13:09,607
AUDIENCE: More space [INAUDIBLE].

1679
01:13:09,607 --> 01:13:10,940
BRIAN YU: More space, certainly.

1680
01:13:10,940 --> 01:13:14,100
So a cache might be able to
store large amounts of data.

1681
01:13:14,100 --> 01:13:17,090
And usually a cache is just
going to be, basically, hard

1682
01:13:17,090 --> 01:13:21,750
disk storage where you can just easily
access it, access it very quickly.

1683
01:13:21,750 --> 01:13:23,810
Amazon Web Services,
for instance, offers

1684
01:13:23,810 --> 01:13:25,691
S3, which is effectively
a service that's

1685
01:13:25,691 --> 01:13:28,190
just a big hard drive in the
cloud where you can store files

1686
01:13:28,190 --> 01:13:30,710
and is often used for caching purposes.

1687
01:13:30,710 --> 01:13:34,970
What's another benefit of just storing,
of using an external cache located

1688
01:13:34,970 --> 01:13:37,490
on some separate hard
drive somewhere that's not

1689
01:13:37,490 --> 01:13:40,939
within any one of the servers
but that all the servers talk to?

1690
01:13:40,939 --> 01:13:42,980
So we talked about the
drawback, which is that it

1691
01:13:42,980 --> 01:13:44,390
takes longer to talk to the cache.

1692
01:13:44,390 --> 01:13:45,264
But what's a benefit?

1693
01:13:45,264 --> 01:13:46,975
AUDIENCE: One primary source.

1694
01:13:46,975 --> 01:13:47,600
BRIAN YU: Yeah.

1695
01:13:47,600 --> 01:13:49,474
It's a primary source
that all of the servers

1696
01:13:49,474 --> 01:13:53,000
have access to, that if server
number one has cached the 10 most

1697
01:13:53,000 --> 01:13:56,060
popular books, that if someone's on
server two and tries to access 10

1698
01:13:56,060 --> 01:13:59,030
most popular books, they can also
access that same cache as opposed

1699
01:13:59,030 --> 01:14:02,350
to in the case where if all the
caches are only stored on the servers,

1700
01:14:02,350 --> 01:14:05,660
now each of those servers needs to
independently generate it and maintain

1701
01:14:05,660 --> 01:14:06,620
its own cache.

1702
01:14:06,620 --> 01:14:08,030
And it has those issues as well.

1703
01:14:08,030 --> 01:14:10,782
In reality, though,
most web applications

1704
01:14:10,782 --> 01:14:13,490
that begin to scale larger and
larger have many different caches.

1705
01:14:13,490 --> 01:14:16,615
They will have caches on the server
for quicker things that need to happen.

1706
01:14:16,615 --> 01:14:20,150
And maybe you're using a
separate cache in order

1707
01:14:20,150 --> 01:14:23,030
to deal with larger files that
need to be cached, for instance.

1708
01:14:23,030 --> 01:14:25,790
And they'll use some combination
of all these caching techniques

1709
01:14:25,790 --> 01:14:28,430
in order to get the best
of both worlds, ideally,

1710
01:14:28,430 --> 01:14:30,680
to try and have quick access
on the server to things

1711
01:14:30,680 --> 01:14:35,570
that we need access to quickly, to
store on the off-site cache information

1712
01:14:35,570 --> 01:14:37,880
that we want all the servers
to be able to access.

1713
01:14:37,880 --> 01:14:40,588
And so usually you'll see some
combination of all of these things

1714
01:14:40,588 --> 01:14:44,000
in practice as real web
applications begin to scale.

1715
01:14:44,000 --> 01:14:44,500
Yep.

1716
01:14:44,500 --> 01:14:45,583
AUDIENCE: Question online.

1717
01:14:45,583 --> 01:14:48,820
What would be a good way for estimating
the number of servers, databases,

1718
01:14:48,820 --> 01:14:51,629
load balancers, and caches that
you would need for an application?

1719
01:14:51,629 --> 01:14:52,670
BRIAN YU: Great question.

1720
01:14:52,670 --> 01:14:56,030
So what's a good way to estimate the
amount that you would actually need?

1721
01:14:56,030 --> 01:14:58,155
And so we talked a little
bit at the very beginning

1722
01:14:58,155 --> 01:15:00,260
about benchmarking, about
the process of trying

1723
01:15:00,260 --> 01:15:03,410
to test to see how much load
the server can actually take.

1724
01:15:03,410 --> 01:15:05,960
And so there are a number of
different pieces of software

1725
01:15:05,960 --> 01:15:08,600
that you can use in order to
perform that benchmarking.

1726
01:15:08,600 --> 01:15:13,310
I know ApacheBench, I believe, is
one common piece of load balancing,

1727
01:15:13,310 --> 01:15:15,890
of benchmarking software
that you can use.

1728
01:15:15,890 --> 01:15:18,680
And, also, one good strategy is,
if you're using cloud computing,

1729
01:15:18,680 --> 01:15:22,040
look to the documentation of wherever
you're getting those servers from.

1730
01:15:22,040 --> 01:15:25,190
And they'll likely include
information about the processing power

1731
01:15:25,190 --> 01:15:26,840
and the memory of those computers.

1732
01:15:26,840 --> 01:15:31,880
And so in the AWS case, for instance,
one of the more popular server tools

1733
01:15:31,880 --> 01:15:35,090
is EC2, Elastic Compute Cloud,
which is just the service

1734
01:15:35,090 --> 01:15:39,050
that AWS offers that lets you
effectively rent servers in the cloud

1735
01:15:39,050 --> 01:15:42,170
and run servers like these that might
be connected to a load balancer,

1736
01:15:42,170 --> 01:15:43,000
for instance.

1737
01:15:43,000 --> 01:15:45,420
And they come in different sizes.

1738
01:15:45,420 --> 01:15:47,930
They have different names,
different letters and numbers,

1739
01:15:47,930 --> 01:15:49,388
like smalls and mediums and larges.

1740
01:15:49,388 --> 01:15:52,880
And you can look on their website as
to what each one of those servers means

1741
01:15:52,880 --> 01:15:54,670
and how much computing power it has.

1742
01:15:54,670 --> 01:15:57,432
And using that, you can begin
to gauge which one you need

1743
01:15:57,432 --> 01:15:59,015
in order for your particular purposes.

1744
01:15:59,015 --> 01:16:01,810


1745
01:16:01,810 --> 01:16:02,310
All right.

1746
01:16:02,310 --> 01:16:04,190
So those were just-- that
was just a brief introduction

1747
01:16:04,190 --> 01:16:06,981
to some of the high-level concepts
that come into play as you think

1748
01:16:06,981 --> 01:16:08,617
about how to scale your application.

1749
01:16:08,617 --> 01:16:11,450
When I come back next time, we'll
be talking about security, about--

1750
01:16:11,450 --> 01:16:12,116
in the same way.

1751
01:16:12,116 --> 01:16:15,972
As we take our applications and
begin to scale them to deploy them

1752
01:16:15,972 --> 01:16:18,180
to the internet and are used
by many different users,

1753
01:16:18,180 --> 01:16:22,186
how do we make sure that our
software is secure from adversaries

1754
01:16:22,186 --> 01:16:24,560
that might be trying to attack
the website, for instance?

1755
01:16:24,560 --> 01:16:26,390
And what considerations go into there?

1756
01:16:26,390 --> 01:16:29,348
And so that's it for today, for Web
Programming with Python and JavaScript.

1757
01:16:29,348 --> 01:16:31,090
Thank you all so much.

1758
01:16:31,090 --> 01:16:32,255