1
00:00:00,000 --> 00:00:02,994
[MUSIC PLAYING]

2
00:00:02,994 --> 00:00:19,217


3
00:00:19,217 --> 00:00:20,800
CARTER ZENKE: Well, hello one and all.

4
00:00:20,800 --> 00:00:23,200
And welcome back to CS50's
Introduction to Programming

5
00:00:23,200 --> 00:00:24,970
with R. My name is Carter Zenke.

6
00:00:24,970 --> 00:00:27,520
And this is our lecture
on testing programs.

7
00:00:27,520 --> 00:00:30,850
We'll see today all the ways
our programs could go wrong,

8
00:00:30,850 --> 00:00:32,860
how to handle these
things called errors,

9
00:00:32,860 --> 00:00:36,970
and see how to test our programs
to ensure they behave as we intend.

10
00:00:36,970 --> 00:00:40,090
So let's jump in and see all
the ways a function I've written

11
00:00:40,090 --> 00:00:41,890
could go a little bit wrong.

12
00:00:41,890 --> 00:00:46,660
I have here in RStudio a function
I've defined called average.

13
00:00:46,660 --> 00:00:51,280
And this function average is defined
in this file called average.R.

14
00:00:51,280 --> 00:00:55,900
And the purpose of this average function
is to take as input a vector of numbers

15
00:00:55,900 --> 00:00:59,440
and return to me the single
average number it finds across all

16
00:00:59,440 --> 00:01:01,330
of those numbers in that vector.

17
00:01:01,330 --> 00:01:06,340
So notice here how I'm using the
built-in functions sum and length.

18
00:01:06,340 --> 00:01:08,890
And you might know if you're
familiar with averages or means

19
00:01:08,890 --> 00:01:12,220
that that's defined as basically
taking the sum of numbers you have

20
00:01:12,220 --> 00:01:14,770
and dividing by the number
of numbers you have.

21
00:01:14,770 --> 00:01:17,770
So it's exactly what I'm doing
here with sum and length.

22
00:01:17,770 --> 00:01:21,260
And let me go ahead and presume that
sum and length are

23
00:01:21,260 --> 00:01:22,190
correctly implemented.

24
00:01:22,190 --> 00:01:27,020
I can rely on these functions just as
well in my own function called average.
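
A minimal sketch of the function described so far, assuming the definition in average.R looks roughly like this (the exact layout on screen is not shown in the transcript):

```r
# Sketch of average.R as described: the sum of the values
# divided by how many values there are.
average <- function(x) {
  sum(x) / length(x)
}

average(c(1, 2, 3))
```

For the vector 1, 2, 3 this evaluates to 6 / 3, which is 2.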

25
00:01:27,020 --> 00:01:30,050
Now, it turns out there is
already a function called mean,

26
00:01:30,050 --> 00:01:33,860
which does this very same thing
built into R. It returns to us the mean

27
00:01:33,860 --> 00:01:35,840
or the average of some set of numbers.

28
00:01:35,840 --> 00:01:38,780
But our goal today is
to write our own version

29
00:01:38,780 --> 00:01:40,730
of that function called average.

30
00:01:40,730 --> 00:01:42,560
So we can kind of see
the design decisions

31
00:01:42,560 --> 00:01:45,230
that went into writing
a function like mean

32
00:01:45,230 --> 00:01:47,990
in R. So here is my average function.

33
00:01:47,990 --> 00:01:51,140
Let's go ahead and try it and think
about what could go wrong, actually.

34
00:01:51,140 --> 00:01:53,330
So I said before that this
function average should

35
00:01:53,330 --> 00:01:56,570
take as input a vector of numbers.

36
00:01:56,570 --> 00:02:01,190
But we've seen some ways a user
could give us not numbers, but text.

37
00:02:01,190 --> 00:02:04,730
If you recall using readline, you
might know that readline by default

38
00:02:04,730 --> 00:02:07,160
takes text as input
and hands back text.

39
00:02:07,160 --> 00:02:10,340
So maybe I might have forgotten
to convert that text to a number.

40
00:02:10,340 --> 00:02:15,350
I could run average in my console here,
first defining it up above on line 1.

41
00:02:15,350 --> 00:02:16,760
I could run average.

42
00:02:16,760 --> 00:02:21,260
And let's say I've forgotten to convert
some input from the user to a number.

43
00:02:21,260 --> 00:02:24,080
And I instead have now a vector
of characters or character

44
00:02:24,080 --> 00:02:26,520
representations of these numbers here.

45
00:02:26,520 --> 00:02:30,043
So I'll pass as input
this vector 1, 2, and 3.

46
00:02:30,043 --> 00:02:31,460
But those are not numbers, per se.

47
00:02:31,460 --> 00:02:33,270
They're actually characters here.

48
00:02:33,270 --> 00:02:34,520
I'll go ahead and run average.

49
00:02:34,520 --> 00:02:37,305
And now, I'll see this error.

50
00:02:37,305 --> 00:02:39,680
This is probably not the first
time you've seen an error.

51
00:02:39,680 --> 00:02:41,690
Probably when you're programming,
you've seen lots and lots of errors.

52
00:02:41,690 --> 00:02:44,520
But let's give these
errors a more formal name.

53
00:02:44,520 --> 00:02:47,655
So these errors are more
formally called exceptions.

54
00:02:47,655 --> 00:02:50,780
And an exception occurs when something
exceptional happens in your program,

55
00:02:50,780 --> 00:02:52,640
but not in a good way.

56
00:02:52,640 --> 00:02:56,280
It happens when our program encounters
some situation, some scenario

57
00:02:56,280 --> 00:02:57,530
it doesn't know how to handle.

58
00:02:57,530 --> 00:03:00,200
And instead, it stops entirely.

59
00:03:00,200 --> 00:03:04,640
So a question then becomes how could we
handle these exceptions or these errors

60
00:03:04,640 --> 00:03:05,690
in our code?

61
00:03:05,690 --> 00:03:09,200
And one way to do so is to
handle them more proactively.

62
00:03:09,200 --> 00:03:10,010
Preempt them.

63
00:03:10,010 --> 00:03:12,560
And do something else
instead of encountering

64
00:03:12,560 --> 00:03:14,510
this error or this exception.

65
00:03:14,510 --> 00:03:18,080
So let's see if we could take that
approach now in our own function here

66
00:03:18,080 --> 00:03:19,130
called average.

67
00:03:19,130 --> 00:03:21,050
I'll come back now to RStudio.

68
00:03:21,050 --> 00:03:24,500
And let's think through what
exactly caused this exception.

69
00:03:24,500 --> 00:03:28,940
Well, if I look at it here, I'll see
that I gave the sum function, it seems,

70
00:03:28,940 --> 00:03:32,060
some invalid type, character, of argument.

71
00:03:32,060 --> 00:03:34,190
So it seems like the
problem was, in fact,

72
00:03:34,190 --> 00:03:38,880
that I gave as input to the average
function this vector of characters.
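
To reproduce the failure without halting a script, here is a small sketch. The tryCatch wrapper is an addition for illustration only and is not part of the lecture's code; it simply captures the message that sum raises when it is handed characters.

```r
# The unguarded version of average, as described so far.
average <- function(x) {
  sum(x) / length(x)
}

# Illustration only: tryCatch() captures the error message
# instead of letting the error stop the script.
result <- tryCatch(
  average(c("1", "2", "3")),
  error = function(e) conditionMessage(e)
)

print(result)
```

The captured text is the message from sum itself, which is why the error mentions sum rather than average.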

73
00:03:38,880 --> 00:03:42,320
So what could I do to check
for this before I maybe pass

74
00:03:42,320 --> 00:03:45,170
this input down into sum and length?

75
00:03:45,170 --> 00:03:48,680
I could probably use something like
a conditional to ask some question.

76
00:03:48,680 --> 00:03:50,630
But what question would I ask?

77
00:03:50,630 --> 00:03:55,372
Well, I probably could ask is
this vector numeric or is it not?

78
00:03:55,372 --> 00:03:57,830
And maybe I would consider the
case where it isn't numeric,

79
00:03:57,830 --> 00:04:00,538
where I might get an exception to
handle that case in particular.

80
00:04:00,538 --> 00:04:05,120
So here before I run line 3 now,
trying to sum up these numbers,

81
00:04:05,120 --> 00:04:07,340
and finding their length,
and dividing therein,

82
00:04:07,340 --> 00:04:09,830
why don't I go ahead and
try to ask the question?

83
00:04:09,830 --> 00:04:14,990
Is, let's say, this vector
x, is it not numeric?

84
00:04:14,990 --> 00:04:16,170
Just like this.

85
00:04:16,170 --> 00:04:20,240
So I'm going to make use of
now this function, is.numeric,

86
00:04:20,240 --> 00:04:24,980
which asks the question, returns me
true or false, is x a vector of numbers

87
00:04:24,980 --> 00:04:26,280
or is it not?

88
00:04:26,280 --> 00:04:29,030
And when I use this exclamation
point here, I'm essentially asking

89
00:04:29,030 --> 00:04:32,780
is the vector x not full of numbers?

90
00:04:32,780 --> 00:04:36,230
And now I have the option here
of handling that error before it

91
00:04:36,230 --> 00:04:39,230
might happen down here on line 5.

92
00:04:39,230 --> 00:04:42,080
So what could I do to handle this error?

93
00:04:42,080 --> 00:04:44,600
Well, a convention
sometimes in the R world

94
00:04:44,600 --> 00:04:48,110
is to return a special
value, one like NA.

95
00:04:48,110 --> 00:04:51,290
So if we give as input
to our function average

96
00:04:51,290 --> 00:04:55,670
a vector that doesn't include numbers,
I could say no, no, let's stop here

97
00:04:55,670 --> 00:04:59,480
and just return NA, instead of
getting this error ultimately.

98
00:04:59,480 --> 00:05:01,670
So I'll go ahead and
do just that on line 3.

99
00:05:01,670 --> 00:05:06,500
I'll say if we find that this vector x
is not full of numbers, is not numeric,

100
00:05:06,500 --> 00:05:09,600
I'll go ahead and return NA instead.

101
00:05:09,600 --> 00:05:14,210
And hopefully I'll now avoid this error
by kind of preempting it and handling

102
00:05:14,210 --> 00:05:15,680
it up above.

103
00:05:15,680 --> 00:05:18,320
Let me go ahead and redefine
my average function now

104
00:05:18,320 --> 00:05:20,637
to update what it has included here.

105
00:05:20,637 --> 00:05:22,720
I'll go ahead and run the
same thing I did before,

106
00:05:22,720 --> 00:05:25,970
giving as input to average
this vector of characters.

107
00:05:25,970 --> 00:05:30,250
And I'll see I'll get back
now just NA and no error.
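
A sketch of this first, silent version of the check, assuming the guard sits directly above the original calculation:

```r
average <- function(x) {
  # Preempt the error: bail out early if x is not numeric.
  if (!is.numeric(x)) {
    return(NA)
  }
  sum(x) / length(x)
}

average(c("1", "2", "3"))  # returns NA, with no error and no notice
```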

108
00:05:30,250 --> 00:05:32,530
Now, we've handled it,
preempted it before it has

109
00:05:32,530 --> 00:05:35,530
had the chance to arise in this case.

110
00:05:35,530 --> 00:05:39,700
But if we're going to do
something a little unexpected

111
00:05:39,700 --> 00:05:43,270
here, like return NA when the user might
have thought they were getting back

112
00:05:43,270 --> 00:05:47,440
a number, it's worth thinking about
how to alert the user to that fact.

113
00:05:47,440 --> 00:05:50,980
Right now, we're handling this
error silently, if you will.

114
00:05:50,980 --> 00:05:53,230
Meaning we're not going to
raise anything to the user.

115
00:05:53,230 --> 00:05:55,255
We're going to hand them back NA.

116
00:05:55,255 --> 00:05:57,130
Unless they looked at
the return value, well,

117
00:05:57,130 --> 00:05:59,710
they wouldn't know anything
in particular was wrong

118
00:05:59,710 --> 00:06:01,468
or that they had done anything wrong.

119
00:06:01,468 --> 00:06:03,760
So let's think through how
we could alert the user here

120
00:06:03,760 --> 00:06:07,180
and let them know what it
is exactly we're doing here.

121
00:06:07,180 --> 00:06:11,530
Now, one way to do that is to make
use of this function built into R

122
00:06:11,530 --> 00:06:12,940
called message.

123
00:06:12,940 --> 00:06:16,270
Message allows you to essentially
send a message to the console

124
00:06:16,270 --> 00:06:17,890
while a function is running.

125
00:06:17,890 --> 00:06:19,650
So let's see if we
could use message here.

126
00:06:19,650 --> 00:06:21,560
I'll go back now to my function.

127
00:06:21,560 --> 00:06:25,220
And maybe before I return
NA, I could let the user

128
00:06:25,220 --> 00:06:27,900
know what it is I'm about to do.

129
00:06:27,900 --> 00:06:31,370
I could decide to send them a
message using the message function.

130
00:06:31,370 --> 00:06:34,250
And it turns out that as input
to this message function,

131
00:06:34,250 --> 00:06:36,530
I can provide the
character string showing

132
00:06:36,530 --> 00:06:39,260
the message I want to tell the user.

133
00:06:39,260 --> 00:06:42,050
I want to tell them what it is
I'm doing and probably tell them

134
00:06:42,050 --> 00:06:43,160
why I'm doing it.

135
00:06:43,160 --> 00:06:47,690
So first, I'll say that maybe this
input x here, our vector, I'll

136
00:06:47,690 --> 00:06:53,180
say that x, this x here
must be a numeric vector.

137
00:06:53,180 --> 00:06:56,060
So this is the cause
of why I'm returning

138
00:06:56,060 --> 00:06:58,640
NA not, let's say, the actual average.

139
00:06:58,640 --> 00:07:00,770
And now, I could say
what I'm doing instead.

140
00:07:00,770 --> 00:07:04,400
I'm going to return NA instead.

141
00:07:04,400 --> 00:07:07,190
Now, if I were to run
this function, I need

142
00:07:07,190 --> 00:07:11,360
to first redefine it, go back down to
my console, and provide the same input.

143
00:07:11,360 --> 00:07:13,140
And now, let's see what happens.

144
00:07:13,140 --> 00:07:14,450
I'll see that message.

145
00:07:14,450 --> 00:07:16,820
So now, we're not being silent anymore.

146
00:07:16,820 --> 00:07:20,555
The user who's run this function, they
would get back NA as a return value.

147
00:07:20,555 --> 00:07:21,680
But now they would know it.

148
00:07:21,680 --> 00:07:26,285
They would say-- it would say x must be
a numeric vector returning NA instead.
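
A sketch of the message version. The exact wording and punctuation of the message string are an assumption based on what is read aloud:

```r
average <- function(x) {
  if (!is.numeric(x)) {
    # Tell the user what is happening and why, then return NA.
    message("x must be a numeric vector. Returning NA instead.")
    return(NA)
  }
  sum(x) / length(x)
}

average(c("1", "2", "3"))
```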

149
00:07:26,285 --> 00:07:28,160
So we've kind of gone
away from being silent.

150
00:07:28,160 --> 00:07:31,920
And now, the user knows exactly what has
gone wrong, perhaps, in this function.

151
00:07:31,920 --> 00:07:34,170
So a little more intuitive now.

152
00:07:34,170 --> 00:07:37,310
But it turns out that,
by convention, message

153
00:07:37,310 --> 00:07:40,130
is often used when things
are going just smoothly.

154
00:07:40,130 --> 00:07:42,620
We're trying to tell the
user exactly what's going on.

155
00:07:42,620 --> 00:07:44,875
It kind of tells them a bit
of a progress indicator.

156
00:07:44,875 --> 00:07:47,000
Gives them an idea of what
their function is doing.

157
00:07:47,000 --> 00:07:50,630
It's meant to be used in cases
where something has not gone wrong.

158
00:07:50,630 --> 00:07:54,530
But I'd argue, in this case,
something has gone wrong.

159
00:07:54,530 --> 00:07:59,000
We gave as input to the average function
an input it should not have been given.

160
00:07:59,000 --> 00:08:03,020
So there are ways to take a message
and to escalate it, if you will,

161
00:08:03,020 --> 00:08:03,598
in severity.

162
00:08:03,598 --> 00:08:06,140
To let the user know that
actually, something has gone wrong.

163
00:08:06,140 --> 00:08:08,540
There might be a potential issue here.

164
00:08:08,540 --> 00:08:10,850
Now, if we were to
escalate this message,

165
00:08:10,850 --> 00:08:14,600
we could instead convert it
into something called a warning.

166
00:08:14,600 --> 00:08:17,120
Now a warning is good when
your function encounters

167
00:08:17,120 --> 00:08:18,950
something that is a potential issue.

168
00:08:18,950 --> 00:08:21,260
It's a bit similar to
if you've driven a car

169
00:08:21,260 --> 00:08:23,032
and your check engine light pops up.

170
00:08:23,032 --> 00:08:26,240
While you're driving that car, you'll
know that, well, your car could still

171
00:08:26,240 --> 00:08:27,440
continue running.

172
00:08:27,440 --> 00:08:28,610
But you might want to
check under the hood

173
00:08:28,610 --> 00:08:30,860
and make sure everything is
going as you expect it to.

174
00:08:30,860 --> 00:08:33,440
So a warning tells a user
that something has gone wrong

175
00:08:33,440 --> 00:08:35,302
that could be a potential issue.

176
00:08:35,302 --> 00:08:37,760
And I think this is more in
line with what's happened here.

177
00:08:37,760 --> 00:08:41,299
The user has given us a value that
they really shouldn't have given us.

178
00:08:41,299 --> 00:08:43,640
So let's, instead of
messaging them, warn them.

179
00:08:43,640 --> 00:08:45,590
And tell them that, look, you
should not have done this.

180
00:08:45,590 --> 00:08:48,132
You should make sure this is
exactly what you want in the end

181
00:08:48,132 --> 00:08:49,590
as far as the return value.

182
00:08:49,590 --> 00:08:54,427
So I'll convert message here to a
warning instead, just like this.

183
00:08:54,427 --> 00:08:57,260
And now that means that the user
will not get a regular old message.

184
00:08:57,260 --> 00:09:01,220
They'll get a warning
indicating some potential issue.

185
00:09:01,220 --> 00:09:02,660
I'll go ahead and go back to line 1.

186
00:09:02,660 --> 00:09:04,370
And I'll redefine this function.

187
00:09:04,370 --> 00:09:08,000
And now what will happen if I run
it again with that same input,

188
00:09:08,000 --> 00:09:12,050
I'll see I get not a message,
but a warning message.

189
00:09:12,050 --> 00:09:15,740
In this case, I see warning
message in my function,

190
00:09:15,740 --> 00:09:19,160
and its input here, the exact
message I typed on line 3,

191
00:09:19,160 --> 00:09:23,022
x must be a numeric vector,
returning NA instead.
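
A sketch of the same function with message swapped for warning. The message wording is again an assumption based on what is read aloud:

```r
average <- function(x) {
  if (!is.numeric(x)) {
    # A warning flags a potential issue but lets the function continue.
    warning("x must be a numeric vector. Returning NA instead.")
    return(NA)
  }
  sum(x) / length(x)
}

average(c("1", "2", "3"))  # returns NA and raises a warning
```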

192
00:09:23,022 --> 00:09:24,980
So this is a way of
alerting the user that they

193
00:09:24,980 --> 00:09:28,610
might get some value they
didn't expect because there

194
00:09:28,610 --> 00:09:31,040
was a potential issue,
which is they didn't give us

195
00:09:31,040 --> 00:09:33,830
an actual numeric vector.

196
00:09:33,830 --> 00:09:38,075
So a warning then is good for some
potential issue in your function

197
00:09:38,075 --> 00:09:40,070
that it could still recover from.

198
00:09:40,070 --> 00:09:42,290
We could still return NA here.

199
00:09:42,290 --> 00:09:46,430
But there is one more level of
severity, going from a warning

200
00:09:46,430 --> 00:09:48,440
to a full-fledged error.

201
00:09:48,440 --> 00:09:51,800
I think we could here have a discussion
of whether a warning or an error

202
00:09:51,800 --> 00:09:53,660
is best for this scenario.

203
00:09:53,660 --> 00:09:56,960
On the one hand, I could argue
that this function average

204
00:09:56,960 --> 00:10:01,340
is supposed to fundamentally take a
vector of numbers and return to me

205
00:10:01,340 --> 00:10:02,150
a number.

206
00:10:02,150 --> 00:10:06,470
If I haven't done that, my function
cannot accomplish its goal at all.

207
00:10:06,470 --> 00:10:11,060
In that case, I might not want to
just warn the user and return NA,

208
00:10:11,060 --> 00:10:14,547
I might want to just stop entirely and
say, look, you've given me an input.

209
00:10:14,547 --> 00:10:16,130
And I have no idea what to do with it.

210
00:10:16,130 --> 00:10:17,390
I can't handle it at all.

211
00:10:17,390 --> 00:10:19,710
I'm going to stop my
function in its entirety.

212
00:10:19,710 --> 00:10:23,130
So let's see what it would look like
if we actually not just warn the user,

213
00:10:23,130 --> 00:10:25,150
but stop the function entirely.

214
00:10:25,150 --> 00:10:27,210
Now, it just so
turns out that R has

215
00:10:27,210 --> 00:10:33,300
this function called stop that allows
us to raise or to throw this error.

216
00:10:33,300 --> 00:10:38,850
So let's upgrade now our warning
to a full-fledged error using stop,

217
00:10:38,850 --> 00:10:42,540
letting the user know that we simply
cannot proceed with the input they have

218
00:10:42,540 --> 00:10:43,350
given us.

219
00:10:43,350 --> 00:10:45,330
I'll go back now to my average function.

220
00:10:45,330 --> 00:10:50,370
And let me go ahead and use stop,
much like I used message and warning,

221
00:10:50,370 --> 00:10:53,460
I'll give it an error
message in this case.

222
00:10:53,460 --> 00:10:56,730
But now what I should
do is not return NA.

223
00:10:56,730 --> 00:10:59,370
In fact, if I were to run
this code, I would never

224
00:10:59,370 --> 00:11:04,710
get to line 4 because as stop implies,
my function will stop on line 3.

225
00:11:04,710 --> 00:11:05,910
It will not continue.

226
00:11:05,910 --> 00:11:09,270
It will not return any
kind of value in this case.

227
00:11:09,270 --> 00:11:12,300
Why don't I go ahead
and remove line 4 now?

228
00:11:12,300 --> 00:11:14,010
And now, what will happen is this.

229
00:11:14,010 --> 00:11:16,800
We're going to ask the
question, is this input numeric?

230
00:11:16,800 --> 00:11:17,700
Or is it not?

231
00:11:17,700 --> 00:11:21,100
If it's not numeric, well, I
will throw or raise this error

232
00:11:21,100 --> 00:11:23,170
that the user will now see.

233
00:11:23,170 --> 00:11:26,230
Let me go ahead and run or
redefine this average function.

234
00:11:26,230 --> 00:11:28,480
Pass as input, the same thing
we've been doing so far.

235
00:11:28,480 --> 00:11:31,420
And now, I'll see, well, an error.

236
00:11:31,420 --> 00:11:34,400
And we're kind of back where we
started, giving an error now.

237
00:11:34,400 --> 00:11:36,010
But this one is more precise.

238
00:11:36,010 --> 00:11:38,500
It's one that we've raised
or thrown ourselves.

239
00:11:38,500 --> 00:11:41,560
And it tells us exactly what has
happened and why it has happened.

240
00:11:41,560 --> 00:11:45,760
Here we see error in our
function average. x, the input,

241
00:11:45,760 --> 00:11:49,172
must be a numeric vector returning--
oops-- returning NA instead.

242
00:11:49,172 --> 00:11:51,130
And actually, that's
probably not true anymore.

243
00:11:51,130 --> 00:11:54,460
So this stop on line 3 doesn't
seem to return us anything,

244
00:11:54,460 --> 00:11:55,482
at least not NA now.

245
00:11:55,482 --> 00:11:57,940
So let's go ahead
and remove that message here

246
00:11:57,940 --> 00:12:00,970
to make sure that the user
doesn't anticipate an NA.

247
00:12:00,970 --> 00:12:04,930
Why don't we just say x
must be a numeric vector?

248
00:12:04,930 --> 00:12:06,500
I'll go ahead and redefine it.

249
00:12:06,500 --> 00:12:10,210
And now rerun it, and we
should see exactly the error

250
00:12:10,210 --> 00:12:12,350
we were hoping to see here.
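
A sketch of the stop version with the corrected message. Because stop ends the function immediately, there is no return(NA) after it. The tryCatch call below is an addition for illustration, so the error can be inspected without halting a script:

```r
average <- function(x) {
  if (!is.numeric(x)) {
    # stop() raises an error; nothing after this line runs.
    stop("x must be a numeric vector.")
  }
  sum(x) / length(x)
}

# Illustration only: capture the error message rather than halting.
err <- tryCatch(
  average(c("1", "2", "3")),
  error = function(e) conditionMessage(e)
)
print(err)
```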

251
00:12:12,350 --> 00:12:15,700
So we've seen now how to
talk to the user and message

252
00:12:15,700 --> 00:12:19,030
them about these kinds of potential
issues in their functions.

253
00:12:19,030 --> 00:12:20,070
We've seen message.

254
00:12:20,070 --> 00:12:21,180
We've seen warning.

255
00:12:21,180 --> 00:12:22,440
We've seen stop.

256
00:12:22,440 --> 00:12:27,030
Let me ask now what questions we have
about any of these functions so far,

257
00:12:27,030 --> 00:12:30,060
and how we convey or communicate
about these errors that

258
00:12:30,060 --> 00:12:32,880
could happen in our functions.

259
00:12:32,880 --> 00:12:36,930
AUDIENCE: What's the difference between
using message versus print versus cat

260
00:12:36,930 --> 00:12:38,490
to display an error message?

261
00:12:38,490 --> 00:12:39,865
CARTER ZENKE: So a good question.

262
00:12:39,865 --> 00:12:44,070
We've seen so far these functions
like print, and cat, and now this one

263
00:12:44,070 --> 00:12:45,000
called message.

264
00:12:45,000 --> 00:12:47,790
They all seem to show us
some text in the console.

265
00:12:47,790 --> 00:12:50,700
Well, a message is a more
special kind of text output

266
00:12:50,700 --> 00:12:53,010
that we could later
on choose to suppress.

267
00:12:53,010 --> 00:12:56,100
So you've probably seen
so far, suppressWarnings.

268
00:12:56,100 --> 00:12:58,620
There's also a function
called suppressMessages

269
00:12:58,620 --> 00:13:01,590
you could use to hide those
messages as they come up.

270
00:13:01,590 --> 00:13:04,650
There is no such feature though
for print or for cat.

271
00:13:04,650 --> 00:13:07,380
A message is more particular
and exclusive to showing

272
00:13:07,380 --> 00:13:11,940
the user a message they could either
view or decline to view later on.
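
A small sketch of that difference. The helper name notify is hypothetical and exists only for this illustration:

```r
# Hypothetical helper that emits one line with message() and one with cat().
notify <- function() {
  message("sent with message()")
  cat("sent with cat()\n")
}

notify()                    # both lines appear in the console
suppressMessages(notify())  # only the cat() line appears
```

Wrapping a call in suppressMessages hides what message emits, while text produced by cat still shows up.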

273
00:13:11,940 --> 00:13:13,600
Good question.

274
00:13:13,600 --> 00:13:15,750
OK, so let's consider
other scenarios here

275
00:13:15,750 --> 00:13:18,240
that we could try to
address in our function.

276
00:13:18,240 --> 00:13:20,590
We've considered so
far what happens if we

277
00:13:20,590 --> 00:13:22,900
don't get the type of
input we're expecting,

278
00:13:22,900 --> 00:13:25,390
in this case, a non-numeric input.

279
00:13:25,390 --> 00:13:29,080
But there are other scenarios we should
probably consider and anticipate.

280
00:13:29,080 --> 00:13:33,700
And one of them might be
if our input has NAs in it.

281
00:13:33,700 --> 00:13:36,340
So we've seen that the
mean function, if it's

282
00:13:36,340 --> 00:13:41,360
given some input that has NA,
well, it returns to us NA instead.

283
00:13:41,360 --> 00:13:44,650
So if we want our function
to do the very same thing,

284
00:13:44,650 --> 00:13:46,480
maybe we could have a check here.

285
00:13:46,480 --> 00:13:49,750
Maybe after I check to see
if the input is numeric,

286
00:13:49,750 --> 00:13:51,730
I could ask another question.

287
00:13:51,730 --> 00:13:54,940
I could ask this one
here, if any, let's say,

288
00:13:54,940 --> 00:14:00,730
if any of the values in
this x vector are NA,

289
00:14:00,730 --> 00:14:04,270
just like this, why don't we
go ahead and do something else

290
00:14:04,270 --> 00:14:07,240
before we run, now, line 8?

291
00:14:07,240 --> 00:14:12,670
But what is it I should do if
any of these numbers are NA?

292
00:14:12,670 --> 00:14:15,890
Well, I could, of course, return
NA, like we decided to do earlier.

293
00:14:15,890 --> 00:14:17,620
But now there's a question here.

294
00:14:17,620 --> 00:14:20,740
I don't want to silently return NA.

295
00:14:20,740 --> 00:14:22,480
And I have three options.

296
00:14:22,480 --> 00:14:24,190
I could either message the user.

297
00:14:24,190 --> 00:14:25,630
I could warn them.

298
00:14:25,630 --> 00:14:28,480
Or I could throw an error using stop.

299
00:14:28,480 --> 00:14:30,160
Let me actually ask our audience here.

300
00:14:30,160 --> 00:14:31,420
What would you use?

301
00:14:31,420 --> 00:14:35,800
Would you use message, or
warning, or stop in this case?

302
00:14:35,800 --> 00:14:40,120
How might you try to handle
this particular input?

303
00:14:40,120 --> 00:14:44,650
AUDIENCE: I think it because is that--
if something or an error happened,

304
00:14:44,650 --> 00:14:45,730
it will--

305
00:14:45,730 --> 00:14:47,022
the user [INAUDIBLE] the error.

306
00:14:47,022 --> 00:14:48,147
CARTER ZENKE: A good point.

307
00:14:48,147 --> 00:14:51,070
So we've seen that message is good
for just conveying information,

308
00:14:51,070 --> 00:14:53,140
when there's nothing
really wrong going on.

309
00:14:53,140 --> 00:14:55,930
Whereas, a warning is good
to mention a potential issue

310
00:14:55,930 --> 00:14:57,790
the user should take a closer look at.

311
00:14:57,790 --> 00:15:01,300
And I'd argue that in this case,
a warning is probably better.

312
00:15:01,300 --> 00:15:03,370
We're doing something
a little unexpected.

313
00:15:03,370 --> 00:15:06,910
We're returning NA, as opposed to in
this case, the average the user might

314
00:15:06,910 --> 00:15:09,130
have been expecting,
so I think a warning

315
00:15:09,130 --> 00:15:11,088
would be best here to
alert the user that there

316
00:15:11,088 --> 00:15:12,800
might be some potential issue here.

317
00:15:12,800 --> 00:15:14,110
I'll come back to RStudio.

318
00:15:14,110 --> 00:15:16,870
And let's go ahead and actually
implement now this warning.

319
00:15:16,870 --> 00:15:20,840
I'll do it the same way I did before
with warning here before I return NA.

320
00:15:20,840 --> 00:15:21,860
I'll give a warning.

321
00:15:21,860 --> 00:15:27,800
And in this case, I'll say as follows,
that x, the input now, contains

322
00:15:27,800 --> 00:15:32,130
one or more NA values, just like this.

323
00:15:32,130 --> 00:15:35,060
So now, my function is
looking a little bit better

324
00:15:35,060 --> 00:15:38,810
at handling these kinds of cases
I might not have anticipated.

325
00:15:38,810 --> 00:15:42,800
First, if we get a non-numeric input,
we're going to go ahead and stop.

326
00:15:42,800 --> 00:15:46,970
We fundamentally cannot continue
with a non-numeric input.

327
00:15:46,970 --> 00:15:51,350
Then we're going to ask the question,
if any of the values in our vector x

328
00:15:51,350 --> 00:15:54,380
are NA, we're going to
warn the user and tell them

329
00:15:54,380 --> 00:15:56,532
that x contains one or more NA values.

330
00:15:56,532 --> 00:15:57,740
They might not know that yet.

331
00:15:57,740 --> 00:16:01,820
And we're going to ourselves
return NA by convention here.

332
00:16:01,820 --> 00:16:03,880
If though, none of these
conditions are true,

333
00:16:03,880 --> 00:16:05,630
we're going to go down
to the bottom here.

334
00:16:05,630 --> 00:16:09,890
And we're going to return the
average, just as we would otherwise.

335
00:16:09,890 --> 00:16:14,060
So let me go ahead and run this
definition for the average function.

336
00:16:14,060 --> 00:16:18,020
I'll go ahead and give it, let's say,
some faulty input, like we saw before.

337
00:16:18,020 --> 00:16:20,070
I'll get the error, like we see.

338
00:16:20,070 --> 00:16:23,340
Why don't I try now giving
it an input with some NAs?

339
00:16:23,340 --> 00:16:28,980
I'll give it 1, the number, 2,
the number, and, third, NA. Let's see.

340
00:16:28,980 --> 00:16:32,760
Now, I get NA back and this
warning message down below.

341
00:16:32,760 --> 00:16:35,430
Well, let's try a normal
input to average, and I'll

342
00:16:35,430 --> 00:16:37,710
give it 1, 2, and 3, all numbers.

343
00:16:37,710 --> 00:16:41,050
Here we'll see the average
of those numbers, 2.
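
Putting both checks together, a sketch of the function at this point in the lecture. The message strings are assumptions based on what is read aloud:

```r
average <- function(x) {
  # A non-numeric input cannot be averaged at all: raise an error.
  if (!is.numeric(x)) {
    stop("x must be a numeric vector.")
  }
  # NA values are a potential issue: warn, then return NA by convention.
  if (any(is.na(x))) {
    warning("x contains one or more NA values.")
    return(NA)
  }
  sum(x) / length(x)
}

average(c(1, 2, 3))  # 2
# average(c(1, 2, NA))       would return NA with a warning
# average(c("1", "2", "3"))  would stop with an error
```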

344
00:16:41,050 --> 00:16:43,290
So I think our function now
is much better designed.

345
00:16:43,290 --> 00:16:46,570
We're able to handle these edge cases
that the user might have given us.

346
00:16:46,570 --> 00:16:48,570
And now, we can alert
them to exactly what we're

347
00:16:48,570 --> 00:16:51,540
going to do to handle those cases.

348
00:16:51,540 --> 00:16:54,810
Now, when we come back, we'll
see how to actually test

349
00:16:54,810 --> 00:16:56,940
this program, this function
in particular, and make

350
00:16:56,940 --> 00:16:58,890
sure it's behaving like we intend.

351
00:16:58,890 --> 00:17:02,670
We'll come back in five and see
how to test programs like these.

352
00:17:02,670 --> 00:17:03,990
Well, we're back.

353
00:17:03,990 --> 00:17:07,680
And we've seen so far how to preempt
a few potential errors in functions

354
00:17:07,680 --> 00:17:08,430
we've written.

355
00:17:08,430 --> 00:17:12,960
What's next is to actually test our code
and make sure it behaves as we intend.

356
00:17:12,960 --> 00:17:16,530
And we'll do so by writing
what we'll call unit tests.

357
00:17:16,530 --> 00:17:19,650
Now, a unit test is some
code that we write ourselves

358
00:17:19,650 --> 00:17:22,440
to test some unit of our program.

359
00:17:22,440 --> 00:17:24,390
But what are those units?

360
00:17:24,390 --> 00:17:26,700
Well, functions-- or sorry--
programs are composed

361
00:17:26,700 --> 00:17:28,830
of individual units called functions.

362
00:17:28,830 --> 00:17:34,290
So unit tests now are code we can write
to test individual functions inside

363
00:17:34,290 --> 00:17:35,208
of our programs.

364
00:17:35,208 --> 00:17:37,500
And we'll go ahead and write
some unit tests of our own

365
00:17:37,500 --> 00:17:39,882
by now testing our average function.

366
00:17:39,882 --> 00:17:41,340
So let's go ahead and do just that.

367
00:17:41,340 --> 00:17:42,690
I'll go back to RStudio.

368
00:17:42,690 --> 00:17:46,440
And I will, by convention, to
test this function average,

369
00:17:46,440 --> 00:17:52,080
create a new file called test-average.R.
So I will go ahead down here.

370
00:17:52,080 --> 00:17:56,880
And I'll say, I want to create a
new file called test-average.R.

371
00:17:56,880 --> 00:17:58,920
And I'll see that file
was created for me.

372
00:17:58,920 --> 00:18:04,050
If I now go to my File Explorer over
here, I can open up test-average.

373
00:18:04,050 --> 00:18:06,810
And I'll see a blank
page, in which I can write

374
00:18:06,810 --> 00:18:09,870
my tests for this average function.

375
00:18:09,870 --> 00:18:13,830
Well again, by convention, what
I'll do is write a function

376
00:18:13,830 --> 00:18:18,030
to test this code that I've written
now in average.R, in particular,

377
00:18:18,030 --> 00:18:19,590
this function average.

378
00:18:19,590 --> 00:18:23,740
I can call it test_average,
just like this.

379
00:18:23,740 --> 00:18:25,810
And I'll make sure it is a function.

380
00:18:25,810 --> 00:18:27,520
Doesn't take any inputs for now.

381
00:18:27,520 --> 00:18:30,600
But within this test
here that I've written,

382
00:18:30,600 --> 00:18:32,790
this function I can use
to test my code, I'll

383
00:18:32,790 --> 00:18:35,910
then provide one or more test cases.

384
00:18:35,910 --> 00:18:40,830
Now, a test case is some representative
scenario our function might encounter.

385
00:18:40,830 --> 00:18:43,350
And we want to ask the
question, did our function

386
00:18:43,350 --> 00:18:47,250
return the right value for
this particular scenario?

387
00:18:47,250 --> 00:18:50,400
So if I want to ask a question, I
could do that using a conditional,

388
00:18:50,400 --> 00:18:51,000
as we've seen.

389
00:18:51,000 --> 00:18:52,530
So maybe I'll ask the question here.

390
00:18:52,530 --> 00:18:57,630
If, let's say, average, let me call
our function, and give it as input

391
00:18:57,630 --> 00:19:01,980
this vector we saw earlier,
1, 2, and 3, all numbers.

392
00:19:01,980 --> 00:19:07,170
If the return value of average,
given this input, is equal to 2,

393
00:19:07,170 --> 00:19:08,520
well what should I say?

394
00:19:08,520 --> 00:19:11,430
I could probably say, in
this case, that average,

395
00:19:11,430 --> 00:19:14,280
my function average passed the test.

396
00:19:14,280 --> 00:19:16,560
And I'll give it a little
smiley face, just for fun.

397
00:19:16,560 --> 00:19:19,620
Otherwise, though, if I
don't get that value back,

398
00:19:19,620 --> 00:19:22,860
what should I show to the user or
to myself here, the programmer?

399
00:19:22,860 --> 00:19:26,760
I could probably say something
like, well, average failed the test.

400
00:19:26,760 --> 00:19:29,920
And give me a little sad face, to
make sure I know what's going on.

401
00:19:29,920 --> 00:19:34,920
So this is my first test
for this function average.

402
00:19:34,920 --> 00:19:37,410
Notice how I've defined one test case.

403
00:19:37,410 --> 00:19:42,480
My test case is when I give average
the input 1, 2, and 3 as a vector,

404
00:19:42,480 --> 00:19:45,420
I then expect that I'll
get back the value 2.

405
00:19:45,420 --> 00:19:47,970
And if I do, I'll say the
average passed the test.

406
00:19:47,970 --> 00:19:50,333
If not, I'll go ahead and
say we failed the test.

407
00:19:50,333 --> 00:19:52,250
And now, for cleanliness
here, let me go ahead

408
00:19:52,250 --> 00:19:56,100
and say backslash n to add a new
line to each of these messages here.

409
00:19:56,100 --> 00:20:00,450
And why don't we go ahead and try
to run this test_average function?

410
00:20:00,450 --> 00:20:03,240
Well, before we do so, we probably
want to know a few things.

411
00:20:03,240 --> 00:20:06,450
One is I've only defined the
test_average function here.

412
00:20:06,450 --> 00:20:09,180
I've defined it as having
this test case here.

413
00:20:09,180 --> 00:20:13,020
But if I want to run it, I should
still call this function, probably

414
00:20:13,020 --> 00:20:14,940
down at the bottom of my file down here.

415
00:20:14,940 --> 00:20:17,757
I'll say let's run test_average.

416
00:20:17,757 --> 00:20:20,840
And one thing you might notice if
you're being particularly observant here

417
00:20:20,840 --> 00:20:24,260
is that I'm calling
the average function.

418
00:20:24,260 --> 00:20:28,070
But at least within this
file here, test-average.R,

419
00:20:28,070 --> 00:20:30,760
well I don't see average defined.

420
00:20:30,760 --> 00:20:33,920
What we don't want to do is this, I
don't want to go over to average.R,

421
00:20:33,920 --> 00:20:37,340
copy and paste this, and put
it over in test-average.R.

422
00:20:37,340 --> 00:20:42,560
What I can do more simply is
run source within this file.

423
00:20:42,560 --> 00:20:46,010
I could say source, and
then the name of the file

424
00:20:46,010 --> 00:20:50,930
I want to run before I run the rest
of the code now in this program here.

425
00:20:50,930 --> 00:20:55,220
I'm going to run now the code in
average.R, which will give me access

426
00:20:55,220 --> 00:20:56,840
to this average function.

427
00:20:56,840 --> 00:21:00,260
And I can then later on
call it in this file.

428
00:21:00,260 --> 00:21:04,880
So we're kind of, if you will, importing
this function into this file here.

429
00:21:04,880 --> 00:21:09,140
We're not-- we're now able to use any
function we've defined in average.R

430
00:21:09,140 --> 00:21:12,260
in this new file, test-average.R
because we've sourced it.

431
00:21:12,260 --> 00:21:15,990
We've run it before we've run
any code in this file here.
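
A small sketch of what `source` does. In the lecture the call is simply `source("average.R")` at the top of test-average.R. The temporary file and the simplified function body below are assumptions, there only so the example runs on its own.

```r
# Write a stand-in average.R into a temporary directory (illustrative only).
path <- file.path(tempdir(), "average.R")
writeLines(c(
  "average <- function(x) {",
  "  sum(x) / length(x)",
  "}"
), path)

# source() runs the file top to bottom, which defines average() in this session.
source(path)

# The function is now available, as if it had been defined in this file.
average(c(1, 2, 3))
```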

432
00:21:15,990 --> 00:21:20,100
So I think this is all we'll need
to test our average function.

433
00:21:20,100 --> 00:21:22,750
Why don't I go ahead
and run this test here?

434
00:21:22,750 --> 00:21:25,590
I'll go ahead and click on
source to run this file now.

435
00:21:25,590 --> 00:21:28,290
And we'll see, average passed the test.

436
00:21:28,290 --> 00:21:32,820
So it seems like in this case, if I
give my average function the input 1, 2,

437
00:21:32,820 --> 00:21:35,820
and 3, it will return to me the value 2.

438
00:21:35,820 --> 00:21:39,570
Well, what are some other
test cases we could think of?

439
00:21:39,570 --> 00:21:43,230
Like ideally, we'd think through some
representative cases that we should

440
00:21:43,230 --> 00:21:46,050
know how to handle, but
also ones that kind of

441
00:21:46,050 --> 00:21:48,210
cover a broad range of scenarios.

442
00:21:48,210 --> 00:21:51,150
Here, I've been testing
positive numbers.

443
00:21:51,150 --> 00:21:55,520
But it would be worthwhile to test if
average can work with negative numbers

444
00:21:55,520 --> 00:21:56,020
too.

445
00:21:56,020 --> 00:21:58,590
So let's add a new test
case, I might just kind of

446
00:21:58,590 --> 00:22:00,090
copy and paste this for now.

447
00:22:00,090 --> 00:22:03,570
I'll take my test case here,
and add a new one down below.

448
00:22:03,570 --> 00:22:07,140
And why don't I change now the
input to the average function?

449
00:22:07,140 --> 00:22:10,260
I'll give it now some negative
numbers, more representative examples

450
00:22:10,260 --> 00:22:15,150
here, negative 1, negative 2,
and negative 3.

451
00:22:15,150 --> 00:22:16,560
And what should I get back?

452
00:22:16,560 --> 00:22:21,330
Well, negative 2 is the average of
negative 1, negative 2, and negative 3.

453
00:22:21,330 --> 00:22:23,340
That's what I should expect.

454
00:22:23,340 --> 00:22:27,150
Here, I've tested now both
positive and negative numbers.

455
00:22:27,150 --> 00:22:29,700
But it's probably
worth testing zero too,

456
00:22:29,700 --> 00:22:31,950
which is neither positive nor negative.

457
00:22:31,950 --> 00:22:35,610
I'm trying to think of scenarios
that might go beyond the usual cases

458
00:22:35,610 --> 00:22:38,520
but are still important for me to
be able to handle appropriately.

459
00:22:38,520 --> 00:22:40,680
So why don't I make a new test case?

460
00:22:40,680 --> 00:22:42,030
One that involves zero?

461
00:22:42,030 --> 00:22:43,800
I'll go down here and add a new one.

462
00:22:43,800 --> 00:22:49,530
Maybe I'll do negative 1 and 0
to use that number and then 1.

463
00:22:49,530 --> 00:22:52,410
So now we're going between
negative and positive,

464
00:22:52,410 --> 00:22:54,150
and neither negative or positive.

465
00:22:54,150 --> 00:22:57,780
And the average here
should be, well, zero.

466
00:22:57,780 --> 00:23:03,060
So here, I have three test
cases in this one test function.

467
00:23:03,060 --> 00:23:05,220
First, I'll test positive numbers.

468
00:23:05,220 --> 00:23:06,960
Then I'll test negative numbers.

469
00:23:06,960 --> 00:23:10,680
Then I'll test positive, negative,
and neither positive nor negative

470
00:23:10,680 --> 00:23:14,640
numbers, hoping for the
right output in each case.

471
00:23:14,640 --> 00:23:17,340
I'll still run my test
average function down below.

472
00:23:17,340 --> 00:23:19,980
Let me clear my console,
click on source.

473
00:23:19,980 --> 00:23:24,240
And now, I'll see average seems
to have passed all three tests.

474
00:23:24,240 --> 00:23:27,180
So my code seems to be doing pretty well here.

475
00:23:27,180 --> 00:23:31,800
But if we wanted to keep going and
adding more test cases to this,

476
00:23:31,800 --> 00:23:34,740
I'd argue that we'd get
pretty bored pretty quickly.

477
00:23:34,740 --> 00:23:36,570
And it would be a lot
of copy/paste.

478
00:23:36,570 --> 00:23:39,660
Like I've already written
here 21 lines of code

479
00:23:39,660 --> 00:23:42,660
to test a function that
was 10 lines of code.

480
00:23:42,660 --> 00:23:45,630
And if we wanted to test our
programs, and every time,

481
00:23:45,630 --> 00:23:50,310
had to write three, four times the
amount of code to test that function,

482
00:23:50,310 --> 00:23:52,540
well, nobody would test their code.

483
00:23:52,540 --> 00:23:54,340
And we want people to test their code.

484
00:23:54,340 --> 00:23:56,880
So thankfully, people who
are in the R community

485
00:23:56,880 --> 00:23:59,910
have developed their own
package to allow us to make

486
00:23:59,910 --> 00:24:03,090
testing easier, and arguably, more fun.

487
00:24:03,090 --> 00:24:06,270
So a package that is canonical in the
R community to test your programs,

488
00:24:06,270 --> 00:24:07,875
is called testthat.

489
00:24:07,875 --> 00:24:11,610
It allows you to test that your
function behaves as you might expect.

490
00:24:11,610 --> 00:24:16,530
So let's go ahead and use testthat
now to improve the design of our tests

491
00:24:16,530 --> 00:24:20,190
and make it easier to write
test cases like these.

492
00:24:20,190 --> 00:24:23,550
Now, testthat comes
with a function called

493
00:24:23,550 --> 00:24:28,380
test_that, which allows me to
make a new test for my code.

494
00:24:28,380 --> 00:24:31,078
But before I can use it, I, of
course, need to install testthat.

495
00:24:31,078 --> 00:24:33,870
So if you haven't already, let me
go down to your console down here

496
00:24:33,870 --> 00:24:38,280
and say install.packages,
and install testthat.

497
00:24:38,280 --> 00:24:41,410
Once you've installed it,
you then need to load it.

498
00:24:41,410 --> 00:24:44,280
So I'll go ahead and load
testthat, just like this,

499
00:24:44,280 --> 00:24:47,015
by doing library followed by testthat.

500
00:24:47,015 --> 00:24:48,390
Now I'll go ahead and Enter here.

501
00:24:48,390 --> 00:24:51,330
And I'll see that I've
now loaded testthat.

502
00:24:51,330 --> 00:24:55,240
I now have access to
functions like test_that.
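
The two console steps above, sketched as code. The `requireNamespace` guard is an addition that was not typed in the lecture. It only skips the install when the package is already present.

```r
# Install testthat only if it is not already available.
if (!requireNamespace("testthat", quietly = TRUE)) {
  install.packages("testthat")
}

# Load the package so that functions like test_that() can be used.
library(testthat)
```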

503
00:24:55,240 --> 00:24:58,848
So I think what I've written
here so far is pretty good.

504
00:24:58,848 --> 00:25:00,390
It at least has some good test cases.

505
00:25:00,390 --> 00:25:03,147
But I don't need any of this
to use testthat anymore.

506
00:25:03,147 --> 00:25:04,980
I'm going to go ahead
and delete most of it,

507
00:25:04,980 --> 00:25:09,390
but still include now my
average.R import, if you will.

508
00:25:09,390 --> 00:25:12,790
I'm taking whatever I've written in
average.R and making it able to make

509
00:25:12,790 --> 00:25:13,290
me--

510
00:25:13,290 --> 00:25:17,160
making myself able to use it
now in test-average.R. Now,

511
00:25:17,160 --> 00:25:22,150
we said before that testthat comes
with a function called test_that.

512
00:25:22,150 --> 00:25:27,520
And we use this function to define a
new test for some function that we have.

513
00:25:27,520 --> 00:25:30,580
So I'll go ahead and go
back to test-average.R.

514
00:25:30,580 --> 00:25:34,030
And I'll go ahead and use test_that.

515
00:25:34,030 --> 00:25:39,110
And the first input to test_that is a
description of the test I want to run.

516
00:25:39,110 --> 00:25:41,770
So here I'll say I want to test that--

517
00:25:41,770 --> 00:25:43,270
I want to test that--

518
00:25:43,270 --> 00:25:47,350
oops-- that average, let's
say, the average function here,

519
00:25:47,350 --> 00:25:51,290
calculates the mean, or in this
case, the average of these numbers.

520
00:25:51,290 --> 00:25:54,070
So this is kind of an
English sentence now.

521
00:25:54,070 --> 00:25:58,100
I'm going to test that
average calculates mean.

522
00:25:58,100 --> 00:26:01,240
Well, the next argument
is the set of test cases

523
00:26:01,240 --> 00:26:05,680
I want to run to ensure that testthat--
or to ensure that average calculates

524
00:26:05,680 --> 00:26:07,510
the mean appropriately.

525
00:26:07,510 --> 00:26:12,760
By convention, I'll put these test
cases inside of these curly braces

526
00:26:12,760 --> 00:26:14,240
as a second argument now.

527
00:26:14,240 --> 00:26:19,010
And I can now provide several test
cases inside of this one function

528
00:26:19,010 --> 00:26:21,050
that I've decided to create here.

529
00:26:21,050 --> 00:26:23,030
Now, how could I say--

530
00:26:23,030 --> 00:26:24,860
or express a test case?

531
00:26:24,860 --> 00:26:28,370
Well, testthat comes with
some functions we can use.

532
00:26:28,370 --> 00:26:30,950
And we really use them
by expecting-- or saying

533
00:26:30,950 --> 00:26:34,220
what we expect to happen when
our function returns some value.

534
00:26:34,220 --> 00:26:37,730
One of these functions
here is expect_equal.

535
00:26:37,730 --> 00:26:40,700
We could expect that
when our function is run,

536
00:26:40,700 --> 00:26:44,510
we should get back a return value
that is equal to some other value,

537
00:26:44,510 --> 00:26:47,040
much like we just did with
our conditionals earlier.

538
00:26:47,040 --> 00:26:49,010
But now, I'll use expect_equal.

539
00:26:49,010 --> 00:26:50,990
Let me go back now to my code.

540
00:26:50,990 --> 00:26:56,810
And inside of my test here, I'll go
ahead and define a few test cases.

541
00:26:56,810 --> 00:27:02,030
The first one will be I want to
expect equality between the return

542
00:27:02,030 --> 00:27:07,850
value of the average function when given
1, 2, and 3 as input, and this value 2

543
00:27:07,850 --> 00:27:09,360
on the right hand side.

544
00:27:09,360 --> 00:27:12,770
So to be clear here, the
first input to expect_equal

545
00:27:12,770 --> 00:27:15,950
is the argument, the
value we'll get back from,

546
00:27:15,950 --> 00:27:18,020
in this case, our average function.

547
00:27:18,020 --> 00:27:20,810
And the next argument
is the value we expect

548
00:27:20,810 --> 00:27:23,330
to find as the return value of average.

549
00:27:23,330 --> 00:27:25,790
I'm going to expect those are now equal.

550
00:27:25,790 --> 00:27:27,470
And this is our test case.

551
00:27:27,470 --> 00:27:29,527
There are no conditionals,
no nothing else.

552
00:27:29,527 --> 00:27:31,610
We're going to go ahead
and just use this to test,

553
00:27:31,610 --> 00:27:37,280
did average return to us 2 when we gave
it as input a vector of 1, 2, and 3?

554
00:27:37,280 --> 00:27:39,470
Well, let's now add
our other test cases.

555
00:27:39,470 --> 00:27:42,170
I could copy/paste this
and change the input.

556
00:27:42,170 --> 00:27:45,860
I'll do negative 1,
negative 2, and negative 3

557
00:27:45,860 --> 00:27:48,080
to test now for negative values.

558
00:27:48,080 --> 00:27:50,390
The expected value is negative 2 now.

559
00:27:50,390 --> 00:27:51,570
I'll do the same now.

560
00:27:51,570 --> 00:27:54,440
But for negative 1, 0, and 1.

561
00:27:54,440 --> 00:27:57,130
The expected value now
is going to be zero.

562
00:27:57,130 --> 00:27:59,630
And why don't we go ahead and
just add some more test cases?

563
00:27:59,630 --> 00:28:01,190
Now it's just so easy for us.

564
00:28:01,190 --> 00:28:06,410
One thing I could do is test maybe
more than an odd number of numbers.

565
00:28:06,410 --> 00:28:07,970
I've always been testing three here.

566
00:28:07,970 --> 00:28:10,580
Maybe I'll test four
as another scenario.

567
00:28:10,580 --> 00:28:15,620
I'll go ahead and do, let's say,
negative 2, negative 1, 1, and 2.

568
00:28:15,620 --> 00:28:18,950
So now we test both positive
and negative numbers.

569
00:28:18,950 --> 00:28:21,680
But now, we're giving an even
number of numbers as input.

570
00:28:21,680 --> 00:28:25,410
And we should get back, of
course, zero in the end.

571
00:28:25,410 --> 00:28:28,193
So this, then, is our test
of our average function.
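
A sketch of how test-average.R might look at this point. The inline `average` is a stand-in for `source("average.R")`, and the exact wording of the test description is an assumption based on what is said aloud.

```r
library(testthat)

# Stand-in for source("average.R"); the lecture's version also validates input.
average <- function(x) {
  sum(x) / length(x)
}

# One test, several test cases: each expect_equal() compares the value
# average() returns (first argument) with the value we expect (second argument).
test_that("`average` calculates mean", {
  expect_equal(average(c(1, 2, 3)), 2)
  expect_equal(average(c(-1, -2, -3)), -2)
  expect_equal(average(c(-1, 0, 1)), 0)
  expect_equal(average(c(-2, -1, 1, 2)), 0)
})
```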

572
00:28:28,193 --> 00:28:30,110
Let's go ahead and see
what could happen here.

573
00:28:30,110 --> 00:28:34,760
Notice how at the top of RStudio, I
now see a button called Run Tests.

574
00:28:34,760 --> 00:28:38,720
This means we're going to run
every test we see in this file.

575
00:28:38,720 --> 00:28:42,200
I could, alternatively though, go down
to the bottom of my console and just

576
00:28:42,200 --> 00:28:43,890
source this file to run it.

577
00:28:43,890 --> 00:28:48,635
I could say source test-average.R and
let's see what kind of output we get.

578
00:28:48,635 --> 00:28:49,760
I'll go ahead and run this.

579
00:28:49,760 --> 00:28:51,980
And oh, test passed.

580
00:28:51,980 --> 00:28:55,430
We have a little gold medal
here to say our function worked

581
00:28:55,430 --> 00:28:56,720
as we intended it to.

582
00:28:56,720 --> 00:29:00,680
Here, I can see that average will
return to me all these values for each

583
00:29:00,680 --> 00:29:03,350
of these test cases,
making it much easier

584
00:29:03,350 --> 00:29:07,940
now to write test cases like
these, thanks to testthat.

585
00:29:07,940 --> 00:29:12,800
Let me ask now, what questions do
we have on defining these test cases

586
00:29:12,800 --> 00:29:16,433
and using a package like testthat?

587
00:29:16,433 --> 00:29:18,350
AUDIENCE: Don't we have
to source average.R

588
00:29:18,350 --> 00:29:23,120
because if you have a huge file, and
if you only want to test one function,

589
00:29:23,120 --> 00:29:25,880
then it won't be like a good
idea to source the entire file?

590
00:29:25,880 --> 00:29:27,130
CARTER ZENKE: A good question.

591
00:29:27,130 --> 00:29:30,560
So notice here how I actually ran the
source test-average.R because my goal

592
00:29:30,560 --> 00:29:33,320
was to run these tests
here top to bottom.

593
00:29:33,320 --> 00:29:37,820
Test-average.R already sources
or runs, if you will, average.R,

594
00:29:37,820 --> 00:29:42,710
giving me access to any functions
inside of that file here.

595
00:29:42,710 --> 00:29:45,650
When we come back another
time, next lecture,

596
00:29:45,650 --> 00:29:47,750
we'll see how to make
packages of our code.

597
00:29:47,750 --> 00:29:51,487
And we'll see how to write tests that
don't require us to put source up top.

598
00:29:51,487 --> 00:29:54,320
But so long as we're not running
packages and just testing our code,

599
00:29:54,320 --> 00:29:58,550
we're going to need to include source
average.R up top to give us access

600
00:29:58,550 --> 00:29:59,660
to average.

601
00:29:59,660 --> 00:30:02,900
So we can run it inside
this test file here.

602
00:30:02,900 --> 00:30:08,120
What other questions do we have on
testthat or testing our code so far?

603
00:30:08,120 --> 00:30:11,553
AUDIENCE: When is an
appropriate time to write tests?

604
00:30:11,553 --> 00:30:13,970
CARTER ZENKE: Yeah, when is
it appropriate to write tests?

605
00:30:13,970 --> 00:30:15,803
And what time is
appropriate to write tests?

606
00:30:15,803 --> 00:30:19,170
So there are, I'd say, many
varying philosophies on this.

607
00:30:19,170 --> 00:30:22,440
There is a kind of a movement
or a philosophy called

608
00:30:22,440 --> 00:30:25,950
test-driven development, which
argues you should write tests

609
00:30:25,950 --> 00:30:27,840
before you even write your code.

610
00:30:27,840 --> 00:30:30,480
And by writing your tests,
you kind of get your mind

611
00:30:30,480 --> 00:30:32,190
around what you want your code to do.

612
00:30:32,190 --> 00:30:34,740
And then you write code
to pass those tests.

613
00:30:34,740 --> 00:30:36,810
On the other hand, folks
might say, well, I just

614
00:30:36,810 --> 00:30:39,870
want to get something done, I'll
write the code, and then I'll test it.

615
00:30:39,870 --> 00:30:42,210
There's arguments on
both sides to be made.

616
00:30:42,210 --> 00:30:45,270
It's going to be up to you and your team
to decide when you want to test

617
00:30:45,270 --> 00:30:46,620
and how you want to test.

618
00:30:46,620 --> 00:30:50,800
This is telling us how we could test
now using packages like testthat.

619
00:30:50,800 --> 00:30:54,970
But good question on
when to test as well.

620
00:30:54,970 --> 00:30:58,240
OK, so here, we've written
a pretty good test case.

621
00:30:58,240 --> 00:31:00,660
There are many test cases
here for our average function.

622
00:31:00,660 --> 00:31:03,060
But there are still
other scenarios to test.

623
00:31:03,060 --> 00:31:06,870
And in particular, we saw what
could happen if we gave average

624
00:31:06,870 --> 00:31:09,840
some input that included NA values.

625
00:31:09,840 --> 00:31:12,990
Well, we could just as well
test the result of average

626
00:31:12,990 --> 00:31:17,070
when it's given some NA values as we
could some regular values like these.

627
00:31:17,070 --> 00:31:23,100
So let's go back now and add some new
tests and test cases to our file here.

628
00:31:23,100 --> 00:31:28,320
Now, if I want to test what average
does, when it's given some input that

629
00:31:28,320 --> 00:31:32,130
includes NA values, well, I
could keep adding test cases here

630
00:31:32,130 --> 00:31:35,820
to my single function, or my single
test, average calculates mean.

631
00:31:35,820 --> 00:31:39,630
But if I were to keep going
and adding more and more tests,

632
00:31:39,630 --> 00:31:42,480
this function would
become quite, quite long.

633
00:31:42,480 --> 00:31:48,120
So ideally, what I want to do instead is
maybe divide up my test, my test cases,

634
00:31:48,120 --> 00:31:50,490
into a way that makes logical sense.

635
00:31:50,490 --> 00:31:54,960
Here, I argue, I'm going to have all
my test cases that are giving average

636
00:31:54,960 --> 00:31:59,250
some pretty typical inputs, numbers,
I'm going to find the average of them,

637
00:31:59,250 --> 00:32:01,320
and get back and check for equality.

638
00:32:01,320 --> 00:32:05,220
But if I'm going to give average
some new type of input, like inputs

639
00:32:05,220 --> 00:32:09,330
that include NAs, well, maybe I
should make a new test for that.

640
00:32:09,330 --> 00:32:14,280
And I can do that by including more than
one instance of this test_that function.

641
00:32:14,280 --> 00:32:16,440
I could say I want to test that.

642
00:32:16,440 --> 00:32:20,700
Now, average, let's say,
how do I want to word this?

643
00:32:20,700 --> 00:32:27,240
I want to say test_that average
warns about NAs in input.

644
00:32:27,240 --> 00:32:31,760
So we saw before that our goal,
when we wrote the average function,

645
00:32:31,760 --> 00:32:36,590
was to test and to make sure that
it gave us a warning when the input

646
00:32:36,590 --> 00:32:38,670
x included NA values.

647
00:32:38,670 --> 00:32:41,210
So we could write some
test cases to make sure

648
00:32:41,210 --> 00:32:44,460
that that is what is happening
with our average function.

649
00:32:44,460 --> 00:32:46,460
So here's my description of this test.

650
00:32:46,460 --> 00:32:49,100
I'll go ahead and give myself
some space now for test cases.

651
00:32:49,100 --> 00:32:55,580
And I want to test that average
raises or throws a warning here.

652
00:32:55,580 --> 00:32:59,480
And it seems like expecting
equality might not work because this

653
00:32:59,480 --> 00:33:01,730
allows me to test two distinct values.

654
00:33:01,730 --> 00:33:04,430
But a warning is
something else entirely.

655
00:33:04,430 --> 00:33:09,530
Well, thankfully, in testthat, we
have access to other expectations

656
00:33:09,530 --> 00:33:14,540
we can say, one including test warning,
or expect_warning in this case.

657
00:33:14,540 --> 00:33:17,840
We can say we want to expect
a warning from this function

658
00:33:17,840 --> 00:33:20,650
or expect no warning at all.

659
00:33:20,650 --> 00:33:23,760
So let me go over here
and say I want to expect

660
00:33:23,760 --> 00:33:27,720
that I'll get a warning
from the average function

661
00:33:27,720 --> 00:33:32,580
when I give it some input
like this, maybe 1, NA, and 3.

662
00:33:32,580 --> 00:33:34,710
So some input that involves an NA.

663
00:33:34,710 --> 00:33:36,270
I can do the same thing down below.

664
00:33:36,270 --> 00:33:39,780
And I could say, why don't I
give it maybe all NAs here,

665
00:33:39,780 --> 00:33:41,430
a vector of three NAs?

666
00:33:41,430 --> 00:33:45,420
And now I could expect that
when I run average in this way,

667
00:33:45,420 --> 00:33:47,460
I expect I'll get a warning.

668
00:33:47,460 --> 00:33:50,280
So let's go ahead and
rerun our tests now,

669
00:33:50,280 --> 00:33:54,622
testing for both a calculation of
the mean and a warning from average.

670
00:33:54,622 --> 00:33:55,830
Let me go ahead and run this.

671
00:33:55,830 --> 00:33:58,950
And oh, what do we see?

672
00:33:58,950 --> 00:34:00,107
Test failed.

673
00:34:00,107 --> 00:34:01,440
So let's see what happened here.

674
00:34:01,440 --> 00:34:05,490
If I scroll back up, I'll
see that one test passed.

675
00:34:05,490 --> 00:34:08,040
That seems to be my first
one here, average still

676
00:34:08,040 --> 00:34:09,300
seems to calculate the mean.

677
00:34:09,300 --> 00:34:13,440
But if I look down below,
my next test is that average

678
00:34:13,440 --> 00:34:15,780
warns about NAs in the input.

679
00:34:15,780 --> 00:34:19,260
And in fact, what I've gotten
it seems, from average,

680
00:34:19,260 --> 00:34:23,760
is not a warning, but an
error, error in average.

681
00:34:23,760 --> 00:34:27,989
When I gave it NA, NA, NA,
x must be a numeric vector.

682
00:34:27,989 --> 00:34:32,130
So although I expected
a warning on line 12,

683
00:34:32,130 --> 00:34:35,560
it seems like I got an error instead.

684
00:34:35,560 --> 00:34:38,040
So that's probably cause for
me to go back to my code,

685
00:34:38,040 --> 00:34:39,960
and see what could
happen, so I could fix it,

686
00:34:39,960 --> 00:34:43,920
and make sure it adheres to these
expectations of my function.

687
00:34:43,920 --> 00:34:46,620
Let me come back now
to average.R and think

688
00:34:46,620 --> 00:34:48,969
through what could be going wrong here.

689
00:34:48,969 --> 00:34:52,650
Well, we got, it seems,
this error, that x

690
00:34:52,650 --> 00:34:56,100
must be a numeric vector,
when we wanted, it seems,

691
00:34:56,100 --> 00:34:58,480
this warning down below.

692
00:34:58,480 --> 00:35:04,080
So maybe what happened is that when
I gave it a vector of all NAs, maybe

693
00:35:04,080 --> 00:35:08,290
it found that that vector is not
numeric, which it might well have done.

694
00:35:08,290 --> 00:35:10,840
So let me go ahead and get on
my console here and test this.

695
00:35:10,840 --> 00:35:16,140
I could say is.numeric, and give
as input NA, NA, NA, and ask

696
00:35:16,140 --> 00:35:17,920
is that numeric or not?

697
00:35:17,920 --> 00:35:19,870
Hmm, so it's not.
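
A quick console check of why this happens: a bare `NA` is a logical value in R, so a vector made only of `NA`s is a logical vector, not a numeric one. The `NA_real_` line is an extra illustration that was not shown in the lecture.

```r
x <- c(NA, NA, NA)

class(x)                            # "logical"
is.numeric(x)                       # FALSE

# Mixing NA with numbers keeps the vector numeric.
is.numeric(c(1, NA, 3))             # TRUE

# NA_real_ is the numeric flavor of NA.
is.numeric(c(NA_real_, NA_real_))   # TRUE
```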

698
00:35:19,870 --> 00:35:22,840
So because this vector
of NAs is not numeric,

699
00:35:22,840 --> 00:35:28,270
I would first throw my error
that x must be a numeric vector.

700
00:35:28,270 --> 00:35:33,070
But what I really want to do is
return NA if I get a vector of NAs.

701
00:35:33,070 --> 00:35:36,580
So I think I should probably reorder
here this handling of my errors.

702
00:35:36,580 --> 00:35:40,220
Let me go ahead and reorder
this and put this up top first.

703
00:35:40,220 --> 00:35:43,300
So now, what we'll do is first check.

704
00:35:43,300 --> 00:35:48,460
Is NA-- or is the vector-- is the
vector we got as input to average here,

705
00:35:48,460 --> 00:35:49,960
does it include any values?

706
00:35:49,960 --> 00:35:52,480
If so, we'll raise a
warning and return an NA.

707
00:35:52,480 --> 00:35:54,760
And then we'll check if it's numeric.

708
00:35:54,760 --> 00:35:57,380
I think this might help
us solve our problem here.

709
00:35:57,380 --> 00:36:01,000
Let me go back to test-average.R. Let
me rerun these tests with our updated

710
00:36:01,000 --> 00:36:02,830
version of average.R.

711
00:36:02,830 --> 00:36:05,920
And we'll see all the tests passed.
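
A sketch of the reordered function together with the warning test. The function body is reconstructed from what is described aloud (the NA check first, then the numeric check), so the exact message strings are assumptions.

```r
library(testthat)

# Reordered checks: look for NA values before checking for a numeric vector.
average <- function(x) {
  if (any(is.na(x))) {
    warning("`x` contains one or more NA values.")
    return(NA)
  }
  if (!is.numeric(x)) {
    stop("`x` must be a numeric vector.")
  }
  sum(x) / length(x)
}

# With the NA check first, an all-NA (logical) vector now warns
# instead of raising the "must be a numeric vector" error.
test_that("`average` warns about NAs in input", {
  expect_warning(average(c(1, NA, 3)))
  expect_warning(average(c(NA, NA, NA)))
})
```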

712
00:36:05,920 --> 00:36:07,690
We have a little
confetti, some rainbows.

713
00:36:07,690 --> 00:36:10,700
We seem to be moving along pretty well.

714
00:36:10,700 --> 00:36:14,920
So what questions do we have on
this new version of our test?

715
00:36:14,920 --> 00:36:17,690
We've expected how average
will calculate the mean.

716
00:36:17,690 --> 00:36:19,820
And will we get a warning, now?

717
00:36:19,820 --> 00:36:22,010
What should we do next?

718
00:36:22,010 --> 00:36:26,060
And what questions do we
have before we move on?

719
00:36:26,060 --> 00:36:28,310
AUDIENCE: When you get--
when the user gets a warning,

720
00:36:28,310 --> 00:36:34,090
can we use pass to get the user to
rerun the input to have a warning in--

721
00:36:34,090 --> 00:36:34,828
or an error?

722
00:36:34,828 --> 00:36:38,120
CARTER ZENKE: So I'm hearing a question
about handling these errors or warnings

723
00:36:38,120 --> 00:36:39,920
as they come up in our code.

724
00:36:39,920 --> 00:36:42,770
And it turns out that R actually
has a function called try

725
00:36:42,770 --> 00:36:46,880
and a function called tryCatch that let
us handle errors and warnings as they

726
00:36:46,880 --> 00:36:47,970
arise in our code.

727
00:36:47,970 --> 00:36:50,720
We won't focus on those today, but
certainly learn more about them

728
00:36:50,720 --> 00:36:52,718
if you're curious about them too.
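
For the curious, a minimal sketch of `tryCatch`, which the lecture mentions but does not cover. The helper name `safe_average` is hypothetical, and the `average` body is reconstructed from the lecture's description.

```r
# Reconstructed from the lecture's description; message strings are assumptions.
average <- function(x) {
  if (any(is.na(x))) {
    warning("`x` contains one or more NA values.")
    return(NA)
  }
  if (!is.numeric(x)) {
    stop("`x` must be a numeric vector.")
  }
  sum(x) / length(x)
}

# Hypothetical helper: report a warning or error, then return NA.
safe_average <- function(x) {
  tryCatch(
    average(x),
    warning = function(w) {
      message("Warning caught: ", conditionMessage(w))
      NA
    },
    error = function(e) {
      message("Error caught: ", conditionMessage(e))
      NA
    }
  )
}

safe_average(c(1, 2, 3))   # no condition raised, returns the average
safe_average(c(1, NA, 3))  # the warning handler runs
safe_average("text")       # the error handler runs
```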

729
00:36:52,718 --> 00:36:55,010
Let's keep going here and
see what else we should test.

730
00:36:55,010 --> 00:36:59,090
So I argue that we've tested that
average returns us the right value

731
00:36:59,090 --> 00:37:02,930
to calculate the mean, that it
also warns about NAs in input.

732
00:37:02,930 --> 00:37:04,610
But what else could we test?

733
00:37:04,610 --> 00:37:08,990
Well, it seems like we should also
test that average actually returns

734
00:37:08,990 --> 00:37:12,260
to us NA if we give it NA values.

735
00:37:12,260 --> 00:37:14,660
Here, we're only expecting a warning.

736
00:37:14,660 --> 00:37:18,510
And we're not so much testing if we're
getting back the right return value.

737
00:37:18,510 --> 00:37:19,820
So let me do just that.

738
00:37:19,820 --> 00:37:26,000
I'll go ahead and add a new test case,
one that tests that average returns NA.

739
00:37:26,000 --> 00:37:27,320
I'll test that here.

740
00:37:27,320 --> 00:37:34,260
Average returns NA with NAs
in our input, just like this.

741
00:37:34,260 --> 00:37:37,550
And I'll go ahead and
add some more test cases.

742
00:37:37,550 --> 00:37:41,330
Well here, it seems like
expect_equal might work for me.

743
00:37:41,330 --> 00:37:44,240
I'm going to test that the
return value of average

744
00:37:44,240 --> 00:37:46,680
will be equal to an NA value.

745
00:37:46,680 --> 00:37:49,280
So I'll go ahead and expect_equal.

746
00:37:49,280 --> 00:37:51,230
I'll go ahead and give
the same kind of input

747
00:37:51,230 --> 00:37:54,140
I gave down below here, 1, NA, and 3.

748
00:37:54,140 --> 00:37:57,620
I'll expect that to be equal now to NA.

749
00:37:57,620 --> 00:38:02,750
Let me go down over here and say
I expect_equal between this vector

750
00:38:02,750 --> 00:38:05,900
of just all NA values
and the NA value itself.

751
00:38:05,900 --> 00:38:09,800
Let me go ahead and actually make
this a vector, just like that.

752
00:38:09,800 --> 00:38:11,370
Make this a vector as well.

753
00:38:11,370 --> 00:38:14,990
And now we're passing into average
those same inputs down below.

754
00:38:14,990 --> 00:38:18,710
But now, I'm testing to see
if the return value is NA.

755
00:38:18,710 --> 00:38:22,400
I'll go back to my console now and
run these tests that I've just added.

756
00:38:22,400 --> 00:38:26,690
And we'll see, hmm, something
a little bit curious.

757
00:38:26,690 --> 00:38:29,210
I'll see test passed.

758
00:38:29,210 --> 00:38:30,890
And I'll see test passed.

759
00:38:30,890 --> 00:38:34,010
But I'll see a warning, it
seems, in my second test.

760
00:38:34,010 --> 00:38:36,980
That average returns
NA with NA as an input.

761
00:38:36,980 --> 00:38:42,560
And I get back a warning that x
contains one or more NA values.

762
00:38:42,560 --> 00:38:46,700
Well, that is kind of expected because
if we look at our average function,

763
00:38:46,700 --> 00:38:50,390
we'll see that if we do
give our function NA values,

764
00:38:50,390 --> 00:38:52,490
we're going to throw this warning.

765
00:38:52,490 --> 00:38:55,550
But what we're testing
in this test is not

766
00:38:55,550 --> 00:38:58,910
so much that we get a warning or
not, we already did that down below.

767
00:38:58,910 --> 00:39:03,080
We're testing if the return
value is equal to NA.

768
00:39:03,080 --> 00:39:07,142
So this would be a good chance for us
to use a function like suppressWarnings.

769
00:39:07,142 --> 00:39:09,350
We're saying, we don't really
care about the warning

770
00:39:09,350 --> 00:39:12,840
we get; we just want to test
the return value in this case.

771
00:39:12,840 --> 00:39:19,460
So I'll wrap the average function, in
this case, inside of suppressWarnings--

772
00:39:19,460 --> 00:39:22,150
suppressWarning--

773
00:39:22,150 --> 00:39:25,480
suppressWarnings, sorry, let
me make it plural up above.

774
00:39:25,480 --> 00:39:28,330
And now, I think we should
probably solve our problem here.

775
00:39:28,330 --> 00:39:29,830
I'll go ahead and rerun these tests.

776
00:39:29,830 --> 00:39:33,530
And now, we'll see, I have
three tests passing overall.
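A minimal sketch of the test just described, assuming the testthat package is installed. The body of average shown here is an assumption reconstructed from the lecture's description (it stops on non-numeric input, and warns and returns NA when the input contains NA values); the actual average.R may differ.

```r
library(testthat)

# Assumed stand-in for the lecture's average.R; the real file may differ.
average <- function(x) {
  if (!is.numeric(x)) {
    stop("`x` must be a numeric vector")
  }
  if (any(is.na(x))) {
    warning("`x` contains one or more NA values")
    return(NA)
  }
  sum(x) / length(x)
}

# The warning itself is checked in a separate test, so it is silenced here
# and only the return value is compared.
test_that("`average` returns NA with NAs in input", {
  expect_equal(suppressWarnings(average(c(1, NA, 3))), NA)
})
```

Wrapping only the call to average in suppressWarnings keeps the expectation focused on the return value, while the separate expect_warning test still covers the warning.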

777
00:39:33,530 --> 00:39:34,780
Well, what else could we test?

778
00:39:34,780 --> 00:39:38,650
We saw before in average.R
that we also want to stop.

779
00:39:38,650 --> 00:39:41,020
We want to end our
function, throw an error

780
00:39:41,020 --> 00:39:43,660
if we give it some non-numeric input.

781
00:39:43,660 --> 00:39:45,790
We could just as well test for that.

782
00:39:45,790 --> 00:39:48,670
I have this other set of
functions, thanks to testthat.

783
00:39:48,670 --> 00:39:52,120
They are called expect_error
and expect_no_error,

784
00:39:52,120 --> 00:39:56,390
and they test if my function has
stopped given some input.

785
00:39:56,390 --> 00:39:58,310
So I could use now expect_error.

786
00:39:58,310 --> 00:40:01,450
We come back to test-average,
and go down below and say,

787
00:40:01,450 --> 00:40:07,150
I'll test that maybe average
stops if x, our input,

788
00:40:07,150 --> 00:40:09,618
is non-numeric, just like this.

789
00:40:09,618 --> 00:40:11,410
And now, I'll go ahead
and expect that I'll

790
00:40:11,410 --> 00:40:15,940
get back some error if I give,
as input to the average function,

791
00:40:15,940 --> 00:40:19,180
some value that is not
numeric, maybe something like--

792
00:40:19,180 --> 00:40:23,260
something like quack, just like
this, or something like that test

793
00:40:23,260 --> 00:40:28,870
we saw before, 1 as a character, 2
as a character, and 3 as a character.

794
00:40:28,870 --> 00:40:30,760
So this is non-numeric input.

795
00:40:30,760 --> 00:40:33,250
Let's see if we get back
the error from average.

796
00:40:33,250 --> 00:40:34,270
I'll go ahead and run.

797
00:40:34,270 --> 00:40:35,470
I'll first save my file.

798
00:40:35,470 --> 00:40:37,510
I'll then run test-average.

799
00:40:37,510 --> 00:40:40,360
And now, I'll see four
tests passing too.
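The error test described above could look roughly like this. It is a sketch that assumes testthat is installed; the definition of average is a hypothetical stand-in for the lecture's average.R, included only so the example runs on its own.

```r
library(testthat)

# Assumed stand-in for the lecture's average.R; the real file may differ.
average <- function(x) {
  if (!is.numeric(x)) {
    stop("`x` must be a numeric vector")
  }
  if (any(is.na(x))) {
    warning("`x` contains one or more NA values")
    return(NA)
  }
  sum(x) / length(x)
}

# expect_error() passes only if the call signals an error, that is,
# if the function stops.
test_that("`average` stops if `x` is non-numeric", {
  expect_error(average("quack"))
  expect_error(average(c("1", "2", "3")))
})
```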

800
00:40:40,360 --> 00:40:44,530
So here, we've seen how to write
test cases for our code thanks

801
00:40:44,530 --> 00:40:49,060
to expect_warning,
expect_equal, and expect_error.

802
00:40:49,060 --> 00:40:51,880
What other questions
do we have on testing

803
00:40:51,880 --> 00:40:57,610
our code using these kinds of test
cases and testthat more generally?

804
00:40:57,610 --> 00:41:01,090
AUDIENCE: One thing which comes
up very much in computer science

805
00:41:01,090 --> 00:41:03,550
is floating point inaccuracies, right?

806
00:41:03,550 --> 00:41:05,908
So can we account for that?

807
00:41:05,908 --> 00:41:07,450
CARTER ZENKE: A really good question.

808
00:41:07,450 --> 00:41:10,117
Actually, an excellent segue I
was just going to talk about now.

809
00:41:10,117 --> 00:41:14,560
So it seems like in our code,
we are testing integer numbers.

810
00:41:14,560 --> 00:41:16,420
I have here, 1, 2, and 3.

811
00:41:16,420 --> 00:41:18,650
And we get back a whole number, like 2.

812
00:41:18,650 --> 00:41:20,300
Same for all these other test cases.

813
00:41:20,300 --> 00:41:23,540
But to your point, we've
missed an important kind

814
00:41:23,540 --> 00:41:27,847
of test case, which involves these
floating point or decimal numbers.

815
00:41:27,847 --> 00:41:29,930
And due to what you've
said about decimal numbers,

816
00:41:29,930 --> 00:41:31,970
and the way they're represented,
there are some special considerations

817
00:41:31,970 --> 00:41:34,980
we take into account before we
can test those kinds of numbers.

818
00:41:34,980 --> 00:41:37,820
So let's see what we should take
into account before we test,

819
00:41:37,820 --> 00:41:40,590
in this case, floating
point or decimal values.

820
00:41:40,590 --> 00:41:45,530
So let's actually go ahead over here
and think through how we could test it,

821
00:41:45,530 --> 00:41:46,850
at least hypothetically.

822
00:41:46,850 --> 00:41:50,030
I could use here expect_equal still.

823
00:41:50,030 --> 00:41:53,150
And give now as input
the average function.

824
00:41:53,150 --> 00:41:55,880
But I'll give it some
floating point values.

825
00:41:55,880 --> 00:42:01,820
Maybe in this case, I will try
to test these, 0.1 and 0.5,

826
00:42:01,820 --> 00:42:05,900
taking the average of
this, which will be 0.3.

827
00:42:05,900 --> 00:42:11,210
Now, it seems to us, as humans, that if
we were to do this kind of calculation,

828
00:42:11,210 --> 00:42:13,310
we would get the
following kind of answer.

829
00:42:13,310 --> 00:42:20,810
That 0.1 plus 0.5 divided by
2 is equal to exactly 0.3.

830
00:42:20,810 --> 00:42:25,310
This is the average of
these numbers, 0.1 and 0.5.

831
00:42:25,310 --> 00:42:30,140
But it turns out that computers
can't do math exactly like this.

832
00:42:30,140 --> 00:42:33,740
That there are actually an infinite
number of real numbers,

833
00:42:33,740 --> 00:42:36,470
of decimal numbers, and
only a finite number of bits

834
00:42:36,470 --> 00:42:38,030
we can use to represent them.

835
00:42:38,030 --> 00:42:41,990
Which leads to this problem known
as floating-point imprecision.

836
00:42:41,990 --> 00:42:44,210
We have so many decimal
numbers to represent

837
00:42:44,210 --> 00:42:48,020
and only so few bits to represent them
that we can't represent all of them

838
00:42:48,020 --> 00:42:49,130
precisely.

839
00:42:49,130 --> 00:42:52,040
And in fact, a computer,
even one running R,

840
00:42:52,040 --> 00:42:55,760
might perform that same
calculation and arrive at this,

841
00:42:55,760 --> 00:43:02,870
0.1 plus 0.5 divided by
2 is equal to 0.299999--

842
00:43:02,870 --> 00:43:05,790
lots of nines, then a lot
of other values after that.

843
00:43:05,790 --> 00:43:07,880
So not exactly 0.3.

844
00:43:07,880 --> 00:43:11,690
And so because of this, we need
to allow for some tolerance

845
00:43:11,690 --> 00:43:14,690
when we test our floating point values.

846
00:43:14,690 --> 00:43:17,040
But before we do that, let
me kind of prove to you

847
00:43:17,040 --> 00:43:19,100
that this is what's
happening in R, even.

848
00:43:19,100 --> 00:43:21,260
I'll come back to my computer here.

849
00:43:21,260 --> 00:43:23,750
And let's get an idea
of this imprecision

850
00:43:23,750 --> 00:43:26,090
that happens in floating point values.

851
00:43:26,090 --> 00:43:28,080
Let's go back down to my console here.

852
00:43:28,080 --> 00:43:34,820
And if I were to print, let's say,
this value 0.3, I'll get back 0.3.

853
00:43:34,820 --> 00:43:40,820
But if I push R just a little bit
and I ask it what really is 0.3?

854
00:43:40,820 --> 00:43:48,230
I could say print 0.3 and show me now
17 digits after the decimal point.

855
00:43:48,230 --> 00:43:51,530
Let's see what we get.

856
00:43:51,530 --> 00:43:56,570
0.29999999999-- so it
turns out that 0.3 is

857
00:43:56,570 --> 00:43:58,970
just one of those decimal
numbers we can't properly

858
00:43:58,970 --> 00:44:03,530
represent due to having so many
floating point values and so few bits

859
00:44:03,530 --> 00:44:04,670
to represent them.
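The console experiment above can be reproduced with base R alone. The comparison of 0.1 + 0.2 with 0.3 is an extra illustration that is not part of the lecture; it is included because it shows the imprecision changing the result of an equality check.

```r
# Default printing shows 7 significant digits, which hides the error.
print(0.3)
#> [1] 0.3

# Asking for 17 significant digits reveals the value actually stored.
print(0.3, digits = 17)
#> [1] 0.29999999999999999

# Extra example (not from the lecture): exact comparison can fail
# even though the two sides are mathematically equal.
0.1 + 0.2 == 0.3
#> [1] FALSE

# Base R's all.equal() compares within a small tolerance instead.
isTRUE(all.equal(0.1 + 0.2, 0.3))
#> [1] TRUE
```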

860
00:44:04,670 --> 00:44:07,040
Now, because of this
reality, like we said,

861
00:44:07,040 --> 00:44:09,770
we do need to allow for
something called tolerance

862
00:44:09,770 --> 00:44:11,970
when we're testing code like this.

863
00:44:11,970 --> 00:44:16,850
Now, tolerance is the range of values I
will accept above or below my expected

864
00:44:16,850 --> 00:44:20,150
value as being equal
to that expected value.

865
00:44:20,150 --> 00:44:24,740
Mathematically, we could say this,
maybe my expected value is 0.3,

866
00:44:24,740 --> 00:44:27,380
but in terms of numbers
being equal to 0.3,

867
00:44:27,380 --> 00:44:30,590
I'll say any number between
this range here, plus or minus

868
00:44:30,590 --> 00:44:35,000
let's say 1 times 10 to the
negative 8, some small number here.

869
00:44:35,000 --> 00:44:37,760
Now, this number is our tolerance.

870
00:44:37,760 --> 00:44:40,310
And we can change it, if
we wanted to too, depending

871
00:44:40,310 --> 00:44:44,390
on our needs for precision, or how
we're presenting these numbers here.

872
00:44:44,390 --> 00:44:47,900
I'm gonna come back now to RStudio and
show you that expect_equal actually

873
00:44:47,900 --> 00:44:52,140
has a parameter called tolerance that
we could use to change this value here.

874
00:44:52,140 --> 00:44:53,330
I'll come back over.

875
00:44:53,330 --> 00:44:55,940
And let's look now at expect_equal.

876
00:44:55,940 --> 00:45:01,160
If I clear my console and go here,
and say, in this particular test,

877
00:45:01,160 --> 00:45:02,840
I want to set some tolerance here.

878
00:45:02,840 --> 00:45:05,780
I could say as another
parameter, tolerance,

879
00:45:05,780 --> 00:45:10,130
and set it equal to some small
value, let's say 1 times 10

880
00:45:10,130 --> 00:45:14,810
to the negative 8, which I represent
now as 1e-8 here.

881
00:45:14,810 --> 00:45:18,027
Now, it turns out that
testthat gives you some default

882
00:45:18,027 --> 00:45:19,610
tolerance that's already been decided.

883
00:45:19,610 --> 00:45:22,652
So I'm actually going to rely on them
to choose my tolerance for me here.

884
00:45:22,652 --> 00:45:24,980
But you could, if you
wanted to, override it

885
00:45:24,980 --> 00:45:27,950
with this parameter called tolerance.
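A short sketch of both options, assuming testthat is installed. The one-line average is a simplified stand-in for the lecture's function, and the test description is a hypothetical name chosen for illustration.

```r
library(testthat)

# Simplified stand-in for the lecture's average function.
average <- function(x) {
  sum(x) / length(x)
}

test_that("`average` calculates the mean of decimal values", {
  # Rely on the default tolerance that expect_equal() already applies.
  expect_equal(average(c(0.1, 0.5)), 0.3)

  # Or override it explicitly with the tolerance parameter.
  expect_equal(average(c(0.1, 0.5)), 0.3, tolerance = 1e-8)
})
```

A smaller tolerance makes the comparison stricter; a larger one accepts results further from the expected value.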

886
00:45:27,950 --> 00:45:31,030
So we've seen now this idea
of floating-point imprecision,

887
00:45:31,030 --> 00:45:34,280
and this idea of tolerance, which we can
use to better test our floating point

888
00:45:34,280 --> 00:45:36,410
values inside of our tests.

889
00:45:36,410 --> 00:45:40,610
Let me ask now what questions we
have on either of these topics here.

890
00:45:40,610 --> 00:45:44,270
AUDIENCE: Why spend so much time
writing code to test other code,

891
00:45:44,270 --> 00:45:47,328
when you can go to the code that
you wrote and simply test it?

892
00:45:47,328 --> 00:45:48,870
CARTER ZENKE: A really good question.

893
00:45:48,870 --> 00:45:51,950
So maybe you're in the habit of kind of
just testing your code in the console,

894
00:45:51,950 --> 00:45:53,658
for instance, kind of
like I did earlier.

895
00:45:53,658 --> 00:45:56,900
If I come back over here, maybe
I wrote the average function

896
00:45:56,900 --> 00:45:59,750
and I decided I'm pretty
confident this will work.

897
00:45:59,750 --> 00:46:01,940
I'll assign it to be the average here.

898
00:46:01,940 --> 00:46:06,110
And I'll just run a few test cases,
like average c, 1, 2, 3 here.

899
00:46:06,110 --> 00:46:07,220
OK, that seems to work.

900
00:46:07,220 --> 00:46:12,680
Maybe I'll do average, and then maybe
c negative 1, negative 2, negative 3.

901
00:46:12,680 --> 00:46:14,460
That seems to work as well.

902
00:46:14,460 --> 00:46:17,820
Now, the reason you might
not do this and instead spend

903
00:46:17,820 --> 00:46:22,320
more time to write your own test is
simply just robustness of your tests.

904
00:46:22,320 --> 00:46:26,760
Notice, I can very quickly test for a
lot of different scenarios in my tests

905
00:46:26,760 --> 00:46:30,720
here and make sure that my
function works as I expect.

906
00:46:30,720 --> 00:46:34,500
It's also very useful if you're
collaborating with others.

907
00:46:34,500 --> 00:46:36,570
Let's say you're both--

908
00:46:36,570 --> 00:46:39,180
somebody else and you are both
writing the average function.

909
00:46:39,180 --> 00:46:41,640
Well, you could
collectively decide on what

910
00:46:41,640 --> 00:46:45,630
tests you will run to make sure that
the average function is correct.

911
00:46:45,630 --> 00:46:49,680
And if somebody later were to go into
the average function and make a change,

912
00:46:49,680 --> 00:46:52,260
they could test to make sure
that their change did not

913
00:46:52,260 --> 00:46:53,950
break the code altogether.

914
00:46:53,950 --> 00:46:56,160
So this is a good way to
standardize what it means

915
00:46:56,160 --> 00:46:58,390
for your code to be correct as well.

916
00:46:58,390 --> 00:47:03,540
But a good question on why would I even
spend time writing tests like these.

917
00:47:03,540 --> 00:47:08,100
OK, so I think we've so far seen a
lot of good tests for our code now.

918
00:47:08,100 --> 00:47:10,800
When we come back, we'll
see how to test not just

919
00:47:10,800 --> 00:47:14,500
numbers, like we did here,
but also strings as well

920
00:47:14,500 --> 00:47:18,720
and focus on these two philosophies, one
called test-driven development and one

921
00:47:18,720 --> 00:47:20,400
called behavior-driven development.

922
00:47:20,400 --> 00:47:22,290
We'll see you all in a few.

923
00:47:22,290 --> 00:47:23,550
Well, we're back.

924
00:47:23,550 --> 00:47:26,520
And so we're going to next focus
on testing these return values that

925
00:47:26,520 --> 00:47:30,180
will be strings, as well as focus
on a few philosophies of testing,

926
00:47:30,180 --> 00:47:33,510
namely test-driven development
and behavior-driven development.

927
00:47:33,510 --> 00:47:35,880
Now, what is test-driven development?

928
00:47:35,880 --> 00:47:40,170
Well, it is an answer to when and
how we should write our tests.

929
00:47:40,170 --> 00:47:42,390
And central to this
philosophy is that tests

930
00:47:42,390 --> 00:47:44,910
should be at the heart of
your development process.

931
00:47:44,910 --> 00:47:48,480
In fact, it even argues you
should probably be writing tests

932
00:47:48,480 --> 00:47:50,700
before you write the code.

933
00:47:50,700 --> 00:47:53,730
Now, let's consider here, I
want to write a function that

934
00:47:53,730 --> 00:47:56,670
says hello to a user, one like greet.

935
00:47:56,670 --> 00:47:58,920
Well, to make that
happen, I should probably

936
00:47:58,920 --> 00:48:03,130
first write the tests for that code
and then write the code itself.

937
00:48:03,130 --> 00:48:06,540
So let's do just that
now over in RStudio.

938
00:48:06,540 --> 00:48:09,300
Come back over here,
and create a file that

939
00:48:09,300 --> 00:48:13,590
will test this function called
greet that doesn't yet exist.

940
00:48:13,590 --> 00:48:15,840
I'll do file.create.

941
00:48:15,840 --> 00:48:20,670
And then I'll say I want to create
this file called test-greet.R.

942
00:48:20,670 --> 00:48:22,870
And I'll say-- it was created here--

943
00:48:22,870 --> 00:48:26,490
I'll go ahead and go to my File
Explorer and open up test-greet.R.

944
00:48:26,490 --> 00:48:30,870
And now I could start defining
some tests to test this code that

945
00:48:30,870 --> 00:48:32,850
doesn't even exist yet.

946
00:48:32,850 --> 00:48:34,870
But why would I do that?

947
00:48:34,870 --> 00:48:39,300
Well, by writing tests, I make it
much clearer to me and to others

948
00:48:39,300 --> 00:48:42,030
what it is I want this code to do.

949
00:48:42,030 --> 00:48:45,000
And once I have in mind
what I want the code to do,

950
00:48:45,000 --> 00:48:48,580
I'm better able to
write that code itself.

951
00:48:48,580 --> 00:48:52,440
So the very first thing I probably
want to test here is what that test--

952
00:48:52,440 --> 00:48:58,140
the greet function that it can
say, let's say, hello to a user,

953
00:48:58,140 --> 00:48:59,070
just like this.

954
00:48:59,070 --> 00:49:02,310
That's the core part of this
greet function I'm going to write,

955
00:49:02,310 --> 00:49:04,740
that it says hello to a user.

956
00:49:04,740 --> 00:49:08,400
And now, I could define some test cases
to make sure that this is the reality.

957
00:49:08,400 --> 00:49:10,440
Why don't I go ahead
and define at least one?

958
00:49:10,440 --> 00:49:15,570
And I'll say I'm going to expect that
if I were to run greet and give it

959
00:49:15,570 --> 00:49:19,320
as input, Carter, I would
get back as the return value

960
00:49:19,320 --> 00:49:22,870
now hello, comma, space, Carter.

961
00:49:22,870 --> 00:49:27,510
So here is my very first test and
test case for this function greet

962
00:49:27,510 --> 00:49:29,460
that doesn't yet exist.

963
00:49:29,460 --> 00:49:33,090
I'm going to say that I want to
be able to use greet in a way

964
00:49:33,090 --> 00:49:37,260
that I pass in a user's name and it
returns to me then the user's name,

965
00:49:37,260 --> 00:49:40,530
but with a prefix of
hello, comma, space.

966
00:49:40,530 --> 00:49:42,750
So that is our very first test.

967
00:49:42,750 --> 00:49:46,380
Let me go ahead and make sure
I include this eventual file

968
00:49:46,380 --> 00:49:51,300
that I will create called greet.R,
in which I'll define greet itself.

969
00:49:51,300 --> 00:49:55,050
And once I've done this, well, I
could probably start developing.

970
00:49:55,050 --> 00:49:57,923
Let me go ahead and now go
back to my File Explorer.

971
00:49:57,923 --> 00:50:00,840
And I could either create a new file
by hitting this plus button here.

972
00:50:00,840 --> 00:50:03,600
Or I could go ahead and do file.create.

973
00:50:03,600 --> 00:50:08,370
I'll use greet.R. And then
I'll go ahead and go to File Explorer

974
00:50:08,370 --> 00:50:12,210
here and open up greet, this
blank canvas for me here.

975
00:50:12,210 --> 00:50:14,880
And now, I could define
my greet function

976
00:50:14,880 --> 00:50:18,950
to do exactly what I see
it should do in my test.

977
00:50:18,950 --> 00:50:21,740
Well, I'll say I have this
function here called greet.

978
00:50:21,740 --> 00:50:24,620
And I'll define it as a
function that takes some input.

979
00:50:24,620 --> 00:50:28,100
Maybe in this case, the input is
called to-- because we're going

980
00:50:28,100 --> 00:50:30,740
to say hello to someone in this case.

981
00:50:30,740 --> 00:50:34,730
I'll go ahead and say that this
function should return some value.

982
00:50:34,730 --> 00:50:38,810
And we've seen this function called
paste before that concatenates strings.

983
00:50:38,810 --> 00:50:40,610
I bet that's what we might need here.

984
00:50:40,610 --> 00:50:42,750
I'll use paste just like this.

985
00:50:42,750 --> 00:50:46,280
And I'll paste together
hello, comma, and then

986
00:50:46,280 --> 00:50:50,600
whatever value is supplied as input
to this function under the argument

987
00:50:50,600 --> 00:50:52,280
or parameter to.

988
00:50:52,280 --> 00:50:56,360
And notice how paste here will take
care of the space between the comma

989
00:50:56,360 --> 00:50:59,090
and whoever we're saying hello to.

990
00:50:59,090 --> 00:51:03,800
So now, thanks to my test, I have a very
clear idea of what this code should do.

991
00:51:03,800 --> 00:51:08,330
And if I were to run
this now, test-greet.R,

992
00:51:08,330 --> 00:51:10,850
we'll see that the test passed.

993
00:51:10,850 --> 00:51:13,970
So it seems like now,
greet is working for me.

994
00:51:13,970 --> 00:51:16,130
I could go ahead and
add more test cases.

995
00:51:16,130 --> 00:51:18,270
In fact, this is an iterative process.

996
00:51:18,270 --> 00:51:21,390
I might write some tests, write
some code, write some tests,

997
00:51:21,390 --> 00:51:22,230
write some code.

998
00:51:22,230 --> 00:51:24,000
Here now, I could test other names.

999
00:51:24,000 --> 00:51:29,910
Maybe I'll say expect_equal, I'll greet
Mario, and hope to see hello, Mario.

1000
00:51:29,910 --> 00:51:32,380
I'll do maybe Peach as well.

1001
00:51:32,380 --> 00:51:33,780
And hello to Peach.

1002
00:51:33,780 --> 00:51:35,610
I'm just choosing some
representative names

1003
00:51:35,610 --> 00:51:38,910
I might get now and pass
into this greet function.

1004
00:51:38,910 --> 00:51:41,350
Let's do Bowser as well.

1005
00:51:41,350 --> 00:51:46,170
So now, with these expanded test cases,
I'll go ahead and test my code again.

1006
00:51:46,170 --> 00:51:49,110
And I'll see that it
still seems to be working.

1007
00:51:49,110 --> 00:51:52,740
And now, if I were to modify
greet in my greet.R file,

1008
00:51:52,740 --> 00:51:57,030
I could very quickly test to make sure
I didn't break it with any changes

1009
00:51:57,030 --> 00:51:58,950
that I had made.
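The test-first workflow above might end up looking like the following sketch, assuming testthat is installed. In the lecture the function lives in greet.R and the tests in test-greet.R; both are combined here so the example runs on its own. The expected string for Bowser is an assumption that follows the same pattern as the other names.

```r
library(testthat)

# In the lecture this is defined in greet.R and loaded into the test file;
# it is written inline here so the sketch is self-contained.
greet <- function(to) {
  # paste() joins its arguments with a single space by default,
  # which supplies the space after the comma.
  return(paste("hello,", to))
}

# Written first, these cases spell out what greet is expected to do.
test_that("`greet` says hello to a user", {
  expect_equal(greet("Carter"), "hello, Carter")
  expect_equal(greet("Mario"), "hello, Mario")
  expect_equal(greet("Peach"), "hello, Peach")
  expect_equal(greet("Bowser"), "hello, Bowser")
})
```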

1010
00:51:58,950 --> 00:52:03,780
So this is one philosophy of
development, test-driven development.

1011
00:52:03,780 --> 00:52:07,440
But there is a related philosophy that
is still interesting to learn about,

1012
00:52:07,440 --> 00:52:10,290
one called behavior-driven development.

1013
00:52:10,290 --> 00:52:14,190
So test-driven development focuses
on designing these test cases

1014
00:52:14,190 --> 00:52:16,320
for our code, giving
it representative cases

1015
00:52:16,320 --> 00:52:20,130
and seeing if it actually follows
through on the expected values.

1016
00:52:20,130 --> 00:52:24,210
Behavior-driven development is
slightly different in that it requires

1017
00:52:24,210 --> 00:52:27,300
us to first define what
it is we want our function

1018
00:52:27,300 --> 00:52:30,420
to do and describe its behavior.

1019
00:52:30,420 --> 00:52:34,347
Now, testthat allows us to use
behavior-driven development

1020
00:52:34,347 --> 00:52:36,180
and actually gives us
a few functions we can

1021
00:52:36,180 --> 00:52:38,490
use that kind of operate a
bit like an English language

1022
00:52:38,490 --> 00:52:42,063
to define what it is our
function behavior should be.

1023
00:52:42,063 --> 00:52:43,230
Let me show you what I mean.

1024
00:52:43,230 --> 00:52:44,560
So I'll come back over here.

1025
00:52:44,560 --> 00:52:47,100
And the two functions
we have in testthat

1026
00:52:47,100 --> 00:52:53,310
to engage in behavior-driven
development are these, describe and it.

1027
00:52:53,310 --> 00:52:57,870
Where describe is a way of describing
what it is we want our function to do.

1028
00:52:57,870 --> 00:53:01,320
And it is a way of saying that
our function should do something

1029
00:53:01,320 --> 00:53:02,040
in particular.

1030
00:53:02,040 --> 00:53:06,330
So let's go ahead and switch now
to this philosophy of testing.

1031
00:53:06,330 --> 00:53:08,400
I'll avoid using test_that.

1032
00:53:08,400 --> 00:53:10,800
And now, I'll start using describe.

1033
00:53:10,800 --> 00:53:14,800
So in particular, describe
lets me do something like this.

1034
00:53:14,800 --> 00:53:18,840
I want to describe
now my greet function.

1035
00:53:18,840 --> 00:53:23,850
And by convention, stylistically, I'll
include these empty curly braces here.

1036
00:53:23,850 --> 00:53:27,990
And now I can say inside
these curly braces,

1037
00:53:27,990 --> 00:53:30,840
what it is I want my function to do.

1038
00:53:30,840 --> 00:53:33,960
Inside these curly braces, I'll
describe what my function now

1039
00:53:33,960 --> 00:53:36,900
should do using this it function.

1040
00:53:36,900 --> 00:53:41,910
I could say that it, in this
case, can say hello to a user.

1041
00:53:41,910 --> 00:53:45,060
And as a second argument
now to it, I'll provide

1042
00:53:45,060 --> 00:53:50,440
some test cases that show examples of it
saying hello to the user, namely this.

1043
00:53:50,440 --> 00:53:54,930
I could say maybe I have
this object called name,

1044
00:53:54,930 --> 00:53:56,850
I'll set it equal to Carter.

1045
00:53:56,850 --> 00:54:02,280
And I'll expect that when I run
greet and pass as input name,

1046
00:54:02,280 --> 00:54:04,500
I should see hello, Carter.

1047
00:54:04,500 --> 00:54:06,150
Similar now to before.

1048
00:54:06,150 --> 00:54:08,130
But notice what it is we've done.

1049
00:54:08,130 --> 00:54:10,410
We've kind of used a
bit of English language

1050
00:54:10,410 --> 00:54:12,060
in the form of these functions.

1051
00:54:12,060 --> 00:54:15,210
Here, I'm going to
describe my greet function.

1052
00:54:15,210 --> 00:54:16,410
Well, what should it do?

1053
00:54:16,410 --> 00:54:18,450
It can say hello to a user.

1054
00:54:18,450 --> 00:54:21,280
And here's an example
of it doing just that.

1055
00:54:21,280 --> 00:54:22,620
So pretty cool.

1056
00:54:22,620 --> 00:54:26,550
Let me go ahead and now run test-greet,
using this other philosophy here,

1057
00:54:26,550 --> 00:54:31,620
test-greet.R. And I'll
see that test has passed.

1058
00:54:31,620 --> 00:54:33,630
What else could our function do?

1059
00:54:33,630 --> 00:54:36,060
What kind of behavior do
we want it to exhibit?

1060
00:54:36,060 --> 00:54:40,350
Well, maybe I could also say
that it, describing greet now,

1061
00:54:40,350 --> 00:54:44,350
can say hello to the
world, just like this.

1062
00:54:44,350 --> 00:54:47,520
And now, provide an example of
it saying hello to the world.

1063
00:54:47,520 --> 00:54:53,490
I'll expect_equal that when I run
greet, without any input at all,

1064
00:54:53,490 --> 00:54:55,350
I'll say hello to the world.

1065
00:54:55,350 --> 00:54:56,130
I get that back--

1066
00:54:56,130 --> 00:54:59,340
I'll get that back out
as a return value now.

1067
00:54:59,340 --> 00:55:03,690
So here, we see a fuller version
of a description of greet.

1068
00:55:03,690 --> 00:55:05,730
First, it can say hello to a user.

1069
00:55:05,730 --> 00:55:07,920
And it can say hello to the world.

1070
00:55:07,920 --> 00:55:12,420
And by convention, I'm using
this can say, can say here

1071
00:55:12,420 --> 00:55:16,170
because when we use it, it reads more
like English to say it can do this,

1072
00:55:16,170 --> 00:55:17,140
it can do that.

1073
00:55:17,140 --> 00:55:19,300
So by convention here,
we're just using these--

1074
00:55:19,300 --> 00:55:20,230
that grammar here.

1075
00:55:20,230 --> 00:55:24,320
But I could use any kind of text
inside this it function here.

1076
00:55:24,320 --> 00:55:25,820
Let me go ahead and try to run this.

1077
00:55:25,820 --> 00:55:28,810
I'll say source
test-greet.R, and we get--

1078
00:55:28,810 --> 00:55:34,090
oop-- seems like, if I scroll up now,
that one of our tests has failed.

1079
00:55:34,090 --> 00:55:37,210
So here we see error in greet.

1080
00:55:37,210 --> 00:55:38,740
Greet can say hello to the world.

1081
00:55:38,740 --> 00:55:41,800
But we actually get back an error,
and not, in this case, hello, world.

1082
00:55:41,800 --> 00:55:48,010
Error in greet: argument "to"
is missing, with no default.

1083
00:55:48,010 --> 00:55:52,600
So let's look now at greet.R and
see what could have happened here.

1084
00:55:52,600 --> 00:55:56,230
I ran greet with no input.

1085
00:55:56,230 --> 00:55:59,422
And if I go look at greet
itself, well, now it

1086
00:55:59,422 --> 00:56:01,630
kind of makes sense because
I didn't supply a default

1087
00:56:01,630 --> 00:56:04,630
value for to if none is supplied.

1088
00:56:04,630 --> 00:56:05,590
So what should I do?

1089
00:56:05,590 --> 00:56:07,360
Maybe supply a default value here?

1090
00:56:07,360 --> 00:56:11,500
I could go ahead and say that
to has a default value of world.

1091
00:56:11,500 --> 00:56:15,100
And fix this code after
I have described it how--

1092
00:56:15,100 --> 00:56:17,230
described how it should work already.

1093
00:56:17,230 --> 00:56:19,270
Let me go ahead and
now rerun these tests

1094
00:56:19,270 --> 00:56:21,610
with this updated version of greet.

1095
00:56:21,610 --> 00:56:24,370
And we'll see that both
of my tests have passed.

1096
00:56:24,370 --> 00:56:26,290
That greet can say hello to a user.

1097
00:56:26,290 --> 00:56:29,840
And it can say hello to the world.
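Put together, the behavior-driven version might look like this sketch, assuming testthat is installed. The greet function is defined inline (in the lecture it lives in greet.R), and the label passed to describe is a hypothetical choice.

```r
library(testthat)

# Updated greet with a default value for `to`, so that greet() with
# no input greets the world.
greet <- function(to = "world") {
  return(paste("hello,", to))
}

# describe() names the thing under test; each it() states one behavior
# and gives an example of that behavior.
describe("greet()", {
  it("can say hello to a user", {
    name <- "Carter"
    expect_equal(greet(name), "hello, Carter")
  })
  it("can say hello to the world", {
    expect_equal(greet(), "hello, world")
  })
})
```

Without the default value, the second it block fails with the missing-argument error seen above.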

1098
00:56:29,840 --> 00:56:33,520
So we've seen now how to test
these strings for equality,

1099
00:56:33,520 --> 00:56:36,910
how to use test-driven development
and behavior-driven development.

1100
00:56:36,910 --> 00:56:39,730
So what questions do we have
on test-driven development

1101
00:56:39,730 --> 00:56:43,630
or behavior-driven development?

1102
00:56:43,630 --> 00:56:46,480
Seeing none, so let's focus on
last-- one last topic for today,

1103
00:56:46,480 --> 00:56:48,670
one called test coverage.

1104
00:56:48,670 --> 00:56:51,580
Now, the goal of today has
been to help you write tests

1105
00:56:51,580 --> 00:56:53,560
that systematically test your code.

1106
00:56:53,560 --> 00:56:57,640
And one measure you can use to figure
out how much of the code you're testing

1107
00:56:57,640 --> 00:56:59,770
is one called test coverage.

1108
00:56:59,770 --> 00:57:03,310
When you have programs that are
composed of not just one function or two

1109
00:57:03,310 --> 00:57:06,730
but many, test coverage can tell
you how many of those functions

1110
00:57:06,730 --> 00:57:08,770
you've tested reliably.

1111
00:57:08,770 --> 00:57:13,030
Now, we've seen today how to
spot errors in our programs,

1112
00:57:13,030 --> 00:57:16,840
how to handle those errors, and
how to write tests to test our code

1113
00:57:16,840 --> 00:57:18,952
to ensure it behaves as we intend.

1114
00:57:18,952 --> 00:57:20,660
When we come back,
we'll go ahead and see

1115
00:57:20,660 --> 00:57:23,660
how we can package our code up
and share it with the world.

1116
00:57:23,660 --> 00:57:25,750
We'll see you then.

1117
00:57:25,750 --> 00:57:27,000