>> All right, so, computational complexity. Just a bit of a warning before we dive in too far: this will probably be among the most math-heavy things we talk about in CS50. Hopefully it won't be too overwhelming, and we'll try to guide you through the process, but fair warning -- there's a little bit of math involved here.

>> All right. In order to make good use of our computational resources in the real world, it's really important to understand algorithms and how they process data. If we have a really efficient algorithm, we can minimize the amount of resources we need to process the data. If we have an algorithm that takes a lot of work to process a really large set of data, it's going to require more and more resources -- which is money, RAM, all that kind of stuff.

>> So, being able to analyze an algorithm using this tool set basically asks the question: how does this algorithm scale as we throw more and more data at it? In CS50, the amount of data we're working with is pretty small. Generally, our programs are going to run in a second or less -- probably a lot less, particularly early on.

>> But think about a company that deals with hundreds of millions of customers and needs to process that customer data. As the number of customers gets bigger and bigger, it's going to require more and more resources. How many more resources? Well, that depends on how we analyze the algorithm, using the tools in this toolbox.

>> When we talk about the complexity of an algorithm -- which you'll sometimes hear referred to as time complexity or space complexity, but which we're just going to call complexity -- we're generally talking about the worst-case scenario. Given the absolute worst pile of data we could throw at it, how is this algorithm going to process or deal with that data? We generally call the worst-case runtime of an algorithm big-O. So an algorithm might be said to run in O of n, or O of n squared. More about what those mean in a second.

>> Sometimes, though, we do care about the best-case scenario. If the data were everything we wanted it to be -- absolutely perfect -- and we sent that perfect set of data through our algorithm, how would it handle that situation? We sometimes refer to that as big-Omega. So in contrast with big-O, we have big-Omega: big-Omega for the best-case scenario, big-O for the worst-case scenario. Generally, when we talk about the complexity of an algorithm, we're talking about the worst-case scenario. So keep that in mind.

>> And in this class, we're generally going to leave the rigorous analysis aside. There are sciences and fields devoted to this kind of stuff. When we reason through algorithms, which we'll do piece by piece for many of the algorithms we talk about in the class, we're really just reasoning through them with common sense, not with formulas or proofs or anything like that. So don't worry -- this won't turn into a big math class.

>> So I said we care about complexity because it asks the question: how do our algorithms handle larger and larger data sets being thrown at them? Well, what is a data set? What did I mean when I said that? It means whatever makes the most sense in context, to be honest. If we have an algorithm that processes strings, we're probably talking about the size of the string -- that's the data set: the number of characters that make up the string. If we're talking about an algorithm that processes files, we might be talking about how many kilobytes comprise that file, and that's the data set. If we're talking about an algorithm that handles arrays more generally, such as sorting algorithms or searching algorithms, we're probably talking about the number of elements that comprise an array.
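To make the string case concrete, here's a minimal sketch in C (ours, not code from the lecture): the data set is the string, n is its number of characters, and the function does one unit of work per character.

```c
#include <stdio.h>

// Counts the characters in a string, doing one step per character.
// For a string-processing algorithm like this, the "data set" is the
// string itself, and n is its length.
int count_chars(const char *s)
{
    int n = 0;
    while (s[n] != '\0') // one unit of work per character
    {
        n++;
    }
    return n;
}

int main(void)
{
    printf("%d\n", count_chars("computational")); // prints 13
    return 0;
}
```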
>> Now, we can measure an algorithm -- in particular, when I say we can measure an algorithm, I mean we can measure how many resources it takes up, whether those resources are bytes or megabytes of RAM that it uses, or how much time it takes to run. And we can call this measure, arbitrarily, f of n, where n is the number of elements in the data set and f of n is how many somethings -- how many units of resources -- it requires to process that data.

>> Now, we actually don't care about what f of n is exactly. In fact, we very rarely will -- and certainly never in this class -- dive into any really deep analysis of what f of n is. We're just going to talk about what f of n is approximately, or what it tends to. And the tendency of an algorithm is dictated by its highest-order term. We can see what I mean by that by taking a look at a more concrete example.

>> So let's say that we have three different algorithms. The first takes n cubed units of resources to process a data set of size n. The second takes n cubed plus n squared resources to process a data set of size n. And the third takes n cubed minus 8n squared plus 20n units of resources to process a data set of size n.

>> Now again, we really aren't going to get into this level of detail. I really just have these up here as an illustration of a point I'm going to be making in a second, which is that we only really care about the tendency of things as the data sets get bigger. If the data set is small, there's actually a pretty big difference in these algorithms. At a data set of size 1, the third algorithm takes 13 times the amount of resources to run relative to the first one (1 - 8 + 20 = 13 units versus 1 unit).

>> If our data set is size 10 -- bigger, but not necessarily huge -- we can see that there's actually still a bit of a difference, but now it's the third algorithm that's more efficient: it's actually about 60% more efficient, taking only 40% of the time. It takes 400 units of resources to process a data set of size 10, whereas the first algorithm, by contrast, takes 1,000 units. But look what happens as our numbers get even bigger.

>> Now the differences between these algorithms start to become a lot less apparent, and the lower-order terms -- or rather, the terms with lower exponents -- start to become irrelevant. If a data set is of size 1,000, the first algorithm runs in a billion steps, the second runs in a billion and a million steps, and the third runs in just shy of a billion steps. It's pretty much a billion steps either way. Those lower-order terms start to become really irrelevant.

>> And just to really hammer home the point: if the data input is of size a million, all three of these take pretty much one quintillion steps -- if my math is correct -- to process a data input of size a million. That's a lot of steps. And the fact that one of them might take a few trillion steps more or fewer is kind of irrelevant when we're talking about a number that big. They all tend to take approximately n cubed, and so we would actually refer to all of these algorithms as being on the order of n cubed, or big-O of n cubed.
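Since we only care about the tendency, a quick throwaway program (our own sketch, not from the lecture) can tabulate all three cost functions and let you watch the highest-order term take over.

```c
#include <stdio.h>

// Tabulates the three hypothetical cost functions so you can watch the
// n^3 term dominate as n grows. A long long comfortably holds n^3 for
// n up to 1,000,000 (that's 10^18).
int main(void)
{
    long long sizes[] = {1, 10, 1000, 1000000};
    printf("%10s %22s %22s %22s\n", "n", "n^3", "n^3 + n^2", "n^3 - 8n^2 + 20n");
    for (int i = 0; i < 4; i++)
    {
        long long n = sizes[i];
        long long f1 = n * n * n;                      // first algorithm
        long long f2 = n * n * n + n * n;              // second algorithm
        long long f3 = n * n * n - 8 * n * n + 20 * n; // third algorithm
        printf("%10lld %22lld %22lld %22lld\n", n, f1, f2, f3);
    }
    return 0;
}
```

At n = 1, the third function really does cost 13 times the first; by n = 1,000,000, all three print numbers that agree to several leading digits.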
>> Here's a list of some of the more common computational complexity classes that we'll encounter in algorithms generally, and also specifically in CS50. These are ordered from generally fastest at the top to generally slowest at the bottom. Constant time algorithms tend to be the fastest: regardless of the size of the data input you pass in, they always take the same number of operations, or units of resources, to deal with it. It might be 2, it might be 3, it might be 4, but it's a constant number. It doesn't vary.

>> Logarithmic time algorithms are slightly worse, though still very good. A really good example of a logarithmic time algorithm that you've surely seen by now is tearing apart the phone book to find Mike Smith. We cut the problem in half each time, and so as n gets larger and larger -- in fact, every time you double n -- it only takes one more step. That's a lot better than, say, linear time, where if you double n, it takes double the number of steps, and if you triple n, it takes triple the number of steps. One step per unit of input.
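The phone-book trick is, at heart, binary search. As a hedged illustration -- this exact code isn't part of the lecture -- here's a minimal version over a sorted array of ints. Each comparison discards half of what's left, so doubling n costs only about one more step.

```c
#include <stdbool.h>

// Binary search over a sorted array: the phone-book idea in code.
// Each pass discards half of the remaining elements, so the worst case
// takes roughly log2(n) comparisons -- logarithmic time.
bool binary_search(const int values[], int n, int target)
{
    int lo = 0;
    int hi = n - 1;
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        if (values[mid] == target)
        {
            return true;
        }
        else if (values[mid] < target)
        {
            lo = mid + 1; // target can only be in the right half
        }
        else
        {
            hi = mid - 1; // target can only be in the left half
        }
    }
    return false;
}
```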
>> Then things get a little less great from there. You have linearithmic time, sometimes called log-linear time or just n log n. We'll see an example of an algorithm that runs in n log n, which is still better than quadratic time -- n squared. Or polynomial time more generally: n to any power greater than two. Or exponential time, which is even worse: C to the n, some constant number raised to the power of the size of the input. So if the data input is of size 1,000, it would take C to the 1,000th power. That's a lot worse than polynomial time.

>> Factorial time is even worse. And in fact, there really do exist infinite time algorithms, such as the so-called stupid sort, whose job is to randomly shuffle an array and then check to see whether it's sorted. If it's not, it randomly shuffles the array again and checks whether it's sorted. As you can probably imagine, there's a worst case in which that algorithm never actually sorts the array. It would run forever, and so it would be an infinite time algorithm. Hopefully you won't be writing any factorial or infinite time algorithms in CS50.

>> So, let's take a little more concrete look at some of the simpler computational complexity classes. We have two examples here of constant time algorithms, which always take a constant number of operations in the worst case. The first is a function called four_for_you, which takes an array of size 1,000 but then apparently doesn't actually look at it -- doesn't really care what's inside that array -- and always just returns four. So that algorithm, despite the fact that it takes 1,000 elements, doesn't do anything with them. It just returns four. It's always a single step.

>> In fact, add_two_nums -- which we've seen before as well -- just processes two integers. It's not a single step; it's actually a couple of steps. You get a, you get b, you add them together, and you output the result. So it's four steps. But it's always constant, regardless of a or b: you have to get a, get b, add them together, and output the result. So that's a constant time algorithm.
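In C, those two functions might look roughly like this -- a reconstruction from the narration, so the exact signatures on screen may differ.

```c
// Takes a 1,000-element array but never inspects it: one step,
// no matter what the array contains.
int four_for_you(int array[1000])
{
    return 4;
}

// A handful of steps -- get a, get b, add, return -- but always the
// same handful, regardless of the values of a and b.
int add_two_nums(int a, int b)
{
    int sum = a + b;
    return sum;
}
```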
>> Here's an example of a linear time algorithm: one that takes, possibly, one additional step as your input grows by one. Let's say we're looking for the number 5 inside an array. You might have a situation where you find it fairly early, but you could also have a situation where it's the last element of the array. In an array of size 5, if we're looking for the number 5, it could take 5 steps. And in fact, imagine that 5 isn't anywhere in the array. We still have to look at every single element of the array in order to determine whether or not 5 is there.

>> So in the worst case -- which is that the element is last in the array, or doesn't exist at all -- we still have to look at all n elements. And so this algorithm runs in linear time. You can confirm that by extrapolating a little bit: if we had a 6-element array and we were looking for the number 5, it might take 6 steps. If we have a 7-element array and we're looking for the number 5, it might take 7 steps. As we add one more element to our array, it takes one more step. That's a linear algorithm in the worst case.
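Here's a minimal sketch of that search in C (our own rendering of the idea, not the lecture's slide):

```c
#include <stdbool.h>

// Linear search: scan the array front to back for the target.
// Best case it's the first element (one step); worst case it's the
// last element or absent, and all n elements get examined -- O(n).
bool linear_search(const int values[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (values[i] == target)
        {
            return true; // found it -- possibly early
        }
    }
    return false; // looked at every element and never found it
}
```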
>> Couple of quick questions for you. What's the worst-case runtime of this particular snippet of code? I have a for loop here that runs from j equals 0 all the way up to m, and what I'm seeing here is that the body of the loop runs in constant time. So, using the terminology we've already talked about, what would be the worst-case runtime of this algorithm? Take a second. The inner part of the loop runs in constant time, and the outer part of the loop is going to run m times. So what's the worst-case runtime here? Did you guess big-O of m? You'd be right.

>> How about another one? This time we have a loop inside of a loop. We have an outer loop that runs from zero to p, and an inner loop that runs from zero to p, and inside of that, I state that the body of the loop runs in constant time. So what's the worst-case runtime of this particular snippet of code? Well, again, we have an outer loop that runs p times, and on each iteration of that loop, we have an inner loop that also runs p times. And then inside of that, there's the constant-time little snippet. So if we have an outer loop that runs p times, inside of which is an inner loop that runs p times, what is the worst-case runtime of this snippet of code? Did you guess big-O of p squared?
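Neither snippet is printed in this transcript, so here's a hypothetical reconstruction of both -- the function names, variable names, and loop bodies are our guesses, not the code shown on screen.

```c
// Hypothetical reconstruction of the first snippet: a loop from j = 0
// up to m, with a body that does a constant amount of work.
// m iterations times constant work per iteration is O(m).
int single_loop(int m)
{
    int counter = 0;
    for (int j = 0; j < m; j++)
    {
        counter++; // stands in for the constant-time loop body
    }
    return counter;
}

// Hypothetical reconstruction of the second: an outer loop that runs
// p times, each iteration of which runs an inner loop p times, with a
// constant-time body inside -- p * p units of work, so O(p^2).
int nested_loops(int p)
{
    int counter = 0;
    for (int i = 0; i < p; i++)
    {
        for (int j = 0; j < p; j++)
        {
            counter++; // the constant-time snippet
        }
    }
    return counter;
}
```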
>> I'm Doug Lloyd. This is CS50.