1
00:00:00,000 --> 00:00:00,940

2
00:00:00,940 --> 00:00:05,440
>> [MUSIC PLAYING]

3
00:00:05,440 --> 00:00:11,577

4
00:00:11,577 --> 00:00:12,660
DAVID J. MALAN: All right.

5
00:00:12,660 --> 00:00:15,590
This is CS50, and this
is the start of week two.

6
00:00:15,590 --> 00:00:19,120
So let us begin today with a bug.

7
00:00:19,120 --> 00:00:20,974
A bug, of course, is a
mistake in a program,

8
00:00:20,974 --> 00:00:22,890
and you'll get very
familiar with this concept

9
00:00:22,890 --> 00:00:26,050
if you've never programmed
before, in pset0 and now pset1.

10
00:00:26,050 --> 00:00:29,280
But let's consider something
a little simple at first.

11
00:00:29,280 --> 00:00:32,189
This program here that I
threw together in advance,

12
00:00:32,189 --> 00:00:37,280
and I claim that this should print
10 stars on the screen using printf,

13
00:00:37,280 --> 00:00:41,020
but it's apparently buggy in some way.

14
00:00:41,020 --> 00:00:45,370
>> Given that specification that
it should print 10 stars,

15
00:00:45,370 --> 00:00:50,230
but it doesn't apparently, what
would you claim is the bug?

16
00:00:50,230 --> 00:00:52,004
Yeah?

17
00:00:52,004 --> 00:00:54,420
So it's an off by one error,
and what do you mean by that?

18
00:00:54,420 --> 00:01:00,991

19
00:01:00,991 --> 00:01:01,490
OK.

20
00:01:01,490 --> 00:01:09,820

21
00:01:09,820 --> 00:01:10,410
Excellent.

22
00:01:10,410 --> 00:01:13,930
So we've specified a
start value of zero for i,

23
00:01:13,930 --> 00:01:18,399
and we've specified an n value of 10,
but we've used less than or equal to.

24
00:01:18,399 --> 00:01:21,190
And the reason that this is two
characters and not just one symbol,

25
00:01:21,190 --> 00:01:22,630
like in a math book,
is that you don't have

26
00:01:22,630 --> 00:01:24,880
a way of expressing the
one character equivalent.

27
00:01:24,880 --> 00:01:28,450
>> So that means less than or equal to, but
if you start counting at zero,

28
00:01:28,450 --> 00:01:31,690
but you count all the way
up through and equal to 10,

29
00:01:31,690 --> 00:01:34,170
you're of course going to
count 11 things in total.

30
00:01:34,170 --> 00:01:35,900
And so you're going to print 11 stars.

31
00:01:35,900 --> 00:01:37,990
So what might be a fix for this?

32
00:01:37,990 --> 00:01:39,970
Yeah?

33
00:01:39,970 --> 00:01:43,980
>> So just adjust the less than
or equal to just be less than,

34
00:01:43,980 --> 00:01:46,250
and there's, I claim, perhaps
another solution, too.

35
00:01:46,250 --> 00:01:47,210
What else might you do?

36
00:01:47,210 --> 00:01:48,590
Yeah?

37
00:01:48,590 --> 00:01:53,660
>> So start equaling it to 1, and
leave the less than or equal to.

38
00:01:53,660 --> 00:01:56,187
And frankly I would claim
that, for a typical human,

39
00:01:56,187 --> 00:01:57,770
this is probably more straightforward.

40
00:01:57,770 --> 00:02:00,280
Start counting at 1 and
count up through 10.

41
00:02:00,280 --> 00:02:01,690
Essentially do what you mean.

42
00:02:01,690 --> 00:02:04,010
>> But the reality is in
programming, as we've seen,

43
00:02:04,010 --> 00:02:07,598
computer scientists and programmers
generally do start counting at zero.

44
00:02:07,598 --> 00:02:09,389
And so that's fine once
you get used to it.

45
00:02:09,389 --> 00:02:12,640
Your condition will generally
be something like less than.

46
00:02:12,640 --> 00:02:14,910
So simply a logical
error that we could now

47
00:02:14,910 --> 00:02:17,990
fix and ultimately recompile
this and get just 10.

48
00:02:17,990 --> 00:02:19,610
>> Well how about this bug here?

49
00:02:19,610 --> 00:02:24,200
Here, again, I claim that I have
a goal of printing 10 stars--

50
00:02:24,200 --> 00:02:28,140
one per line this time, but it doesn't.

51
00:02:28,140 --> 00:02:30,940
Before we propose what
the fix is, what does this

52
00:02:30,940 --> 00:02:34,640
print visually if I were to compile
and run this program do you think?

53
00:02:34,640 --> 00:02:35,140
Yeah?

54
00:02:35,140 --> 00:02:38,360

55
00:02:38,360 --> 00:02:38,860
>> Star.

56
00:02:38,860 --> 00:02:41,690
So all the stars on the
same line is what I heard,

57
00:02:41,690 --> 00:02:43,391
and then the new line character.

58
00:02:43,391 --> 00:02:44,140
So let's try that.

59
00:02:44,140 --> 00:02:48,710
So make buggy-1, enter,
and I see the clang command

60
00:02:48,710 --> 00:02:50,090
that we talked about last time.

61
00:02:50,090 --> 00:02:55,180
./buggy-1, and indeed I see all 10 stars
on the same line even though I claim

62
00:02:55,180 --> 00:02:58,690
in my specification just a comment atop
the code that I intended to do one per

63
00:02:58,690 --> 00:02:59,230
line.

64
00:02:59,230 --> 00:03:00,580
But this looks right.

65
00:03:00,580 --> 00:03:04,620
>> Now line 15 it looks like I'm
printing a star, and then line 16

66
00:03:04,620 --> 00:03:06,620
it looks like I'm printing
a new line character,

67
00:03:06,620 --> 00:03:09,560
and they're both indented so
I'm inside of the loop clearly.

68
00:03:09,560 --> 00:03:13,610
So shouldn't I be doing star, new
line, star, new line, star, new line?

69
00:03:13,610 --> 00:03:14,110
Yes?

70
00:03:14,110 --> 00:03:18,430

71
00:03:18,430 --> 00:03:21,240
>> Yeah, unlike a language like
Python, if you're familiar,

72
00:03:21,240 --> 00:03:23,540
indentation doesn't
matter to the computer.

73
00:03:23,540 --> 00:03:25,280
It only matters to the human.

74
00:03:25,280 --> 00:03:29,860
So whereas here I've indented lines
15 and 16-- that looks beautiful,

75
00:03:29,860 --> 00:03:31,330
but the computer doesn't care.

76
00:03:31,330 --> 00:03:34,640
The computer cares about
actually having curly braces

77
00:03:34,640 --> 00:03:36,310
around these lines of code.

78
00:03:36,310 --> 00:03:39,520
>> So that it's clear-- just like in
Scratch-- that those two lines of code

79
00:03:39,520 --> 00:03:40,450
should be executed.

80
00:03:40,450 --> 00:03:44,390
Like one of those yellow Scratch puzzle
pieces again and again and again.

81
00:03:44,390 --> 00:03:50,920
>> So now if I re-run this
program-- ./buggy-2-- Hm.

82
00:03:50,920 --> 00:03:51,770
I have an error now.

83
00:03:51,770 --> 00:03:54,212
What did I forget to do?

84
00:03:54,212 --> 00:03:55,420
Yeah, so I didn't compile it.

85
00:03:55,420 --> 00:03:56,740
So make buggy-2.

86
00:03:56,740 --> 00:03:59,840
No such file because I didn't
actually compile the second version.

87
00:03:59,840 --> 00:04:04,860
So now interesting
undeclared variable-- not 2.

88
00:04:04,860 --> 00:04:05,510
We're doing 1.

89
00:04:05,510 --> 00:04:11,050
Make buggy-1-- ./buggy-1-- and now
each of them is on its own line.

90
00:04:11,050 --> 00:04:13,880
>> Now there is an exception to
this supposed claim of mine

91
00:04:13,880 --> 00:04:15,520
that you need these curly braces.

92
00:04:15,520 --> 00:04:20,160
When is it actually OK-- if you've
noticed in section or textbooks--

93
00:04:20,160 --> 00:04:22,130
to omit the curly braces?

94
00:04:22,130 --> 00:04:22,630
Yeah?

95
00:04:22,630 --> 00:04:26,290

96
00:04:26,290 --> 00:04:26,870
>> Exactly.

97
00:04:26,870 --> 00:04:28,940
When there's only one
line of code that you

98
00:04:28,940 --> 00:04:32,830
want to be associated with the
loop as in our first example.

99
00:04:32,830 --> 00:04:36,380
It is perfectly legitimate
to omit the curly braces

100
00:04:36,380 --> 00:04:40,310
just as sort of a convenience
from the compiler to you.

101
00:04:40,310 --> 00:04:40,810
Yeah?

102
00:04:40,810 --> 00:04:43,347

103
00:04:43,347 --> 00:04:43,930
Good question.

104
00:04:43,930 --> 00:04:45,500
Would it be considered a style error?

105
00:04:45,500 --> 00:04:49,340
We would promote-- as in CS50
style guide, the URL for which

106
00:04:49,340 --> 00:04:51,926
is in pset1-- that always
use the curly braces.

107
00:04:51,926 --> 00:04:53,550
Certainly if you're new to programming.

108
00:04:53,550 --> 00:04:56,800
The reality is we're not
going to prohibit you

109
00:04:56,800 --> 00:04:58,680
from doing these conveniences.

110
00:04:58,680 --> 00:05:00,846
But if you're just getting
into the swing of things,

111
00:05:00,846 --> 00:05:04,020
absolutely just always use the curly
braces until you get the hang of it.

112
00:05:04,020 --> 00:05:04,640
Good question.

113
00:05:04,640 --> 00:05:05,320
>> All right.

114
00:05:05,320 --> 00:05:07,660
So that then was a bug.

115
00:05:07,660 --> 00:05:09,190
At least in something fairly simple.

116
00:05:09,190 --> 00:05:11,260
And yet you might think this
is fairly rudimentary, right?

117
00:05:11,260 --> 00:05:13,635
This is sort of the first week
of looking at the language

118
00:05:13,635 --> 00:05:14,890
like C, or bugs therein.

119
00:05:14,890 --> 00:05:17,250
But the reality is these are
actually representative

120
00:05:17,250 --> 00:05:20,310
of some pretty frightening problems
that can arise in the real world.

121
00:05:20,310 --> 00:05:23,530
>> So some of you might recall
if you follow tech news,

122
00:05:23,530 --> 00:05:25,740
or maybe even caught
wind of this in February

123
00:05:25,740 --> 00:05:29,434
of this past year that Apple had
made a bit of a mistake in both iOS,

124
00:05:29,434 --> 00:05:31,350
the operating system on
their phones, and also

125
00:05:31,350 --> 00:05:34,220
Mac OS, the operating system
on their desktops and laptops.

126
00:05:34,220 --> 00:05:36,480
And you saw such headlines as this.

127
00:05:36,480 --> 00:05:41,120
And thereafter, Apple
promised to fix this bug,

128
00:05:41,120 --> 00:05:45,950
and very quickly did fix it in iOS,
but then ultimately fixed it in Mac OS

129
00:05:45,950 --> 00:05:46,810
as well.

130
00:05:46,810 --> 00:05:50,370
>> Now none of these headlines alone really
reveal what the underlying problem was,

131
00:05:50,370 --> 00:05:55,640
but the bug was ultimately reduced to
a bug in SSL, secure sockets layer.

132
00:05:55,640 --> 00:05:57,390
And long story short,
this is the software

133
00:05:57,390 --> 00:06:01,030
that our browsers and other
software use to do what?

134
00:06:01,030 --> 00:06:04,090

135
00:06:04,090 --> 00:06:06,860
>> If I said that SSL is
involved, whenever you

136
00:06:06,860 --> 00:06:13,920
visit a URL that starts with HTTPS,
what then might SSL be related to?

137
00:06:13,920 --> 00:06:14,580
Encryption.

138
00:06:14,580 --> 00:06:16,470
So we'll talk about
this in the coming days.

139
00:06:16,470 --> 00:06:18,750
Encryption, the art of
scrambling information.

140
00:06:18,750 --> 00:06:22,200
>> But long story short, Apple
sometime ago had made a mistake

141
00:06:22,200 --> 00:06:25,970
in their implementation of SSL, the
software that ultimately implements

142
00:06:25,970 --> 00:06:30,120
URLs like HTTPS or Macs'
connections thereto.

143
00:06:30,120 --> 00:06:32,850
The result of which is that your
connections could potentially

144
00:06:32,850 --> 00:06:33,920
be intercepted.

145
00:06:33,920 --> 00:06:37,130
And your connections were
not necessarily encrypted

146
00:06:37,130 --> 00:06:40,350
if you had some bad guy in between
you and the destination website who

147
00:06:40,350 --> 00:06:42,170
knew how to take advantage of this.

148
00:06:42,170 --> 00:06:45,090
>> Now Apple ultimately posted
a fix for this finally,

149
00:06:45,090 --> 00:06:46,920
and the description
of their fix was this.

150
00:06:46,920 --> 00:06:49,878
Secure transport failed to validate
the authenticity of the connection.

151
00:06:49,878 --> 00:06:52,920
The issue was addressed by
restoring missing validation steps.

152
00:06:52,920 --> 00:06:57,250
>> So this is a very hand wavy explanation
for simply saying that we screwed up.

153
00:06:57,250 --> 00:07:00,920
There is literally one
line of code that was buggy

154
00:07:00,920 --> 00:07:05,130
in their implementation of SSL, and
if you go online and search for this

155
00:07:05,130 --> 00:07:07,210
you can actually find
the original source code.

156
00:07:07,210 --> 00:07:11,960
For instance, this is a screen shot of
just a portion of a fairly large file,

157
00:07:11,960 --> 00:07:15,965
but this is a function apparently called
SSLVerifySignedServerKeyExchange.

158
00:07:15,965 --> 00:07:17,840
And it takes a bunch of
arguments and inputs.

159
00:07:17,840 --> 00:07:20,298
And we're not going to focus
too much on the minutiae there,

160
00:07:20,298 --> 00:07:24,390
but if you focus on the code inside
of that topmost function-- let's

161
00:07:24,390 --> 00:07:25,590
zoom in on that.

162
00:07:25,590 --> 00:07:28,140
You might already suspect
what the error might

163
00:07:28,140 --> 00:07:31,230
be even if you have no idea
ultimately what you're looking at.

164
00:07:31,230 --> 00:07:35,924
There's kind of an anomaly
here, which is what?

165
00:07:35,924 --> 00:07:38,940
>> Yeah, I don't really like
the look of two goto fails.

166
00:07:38,940 --> 00:07:42,060
Frankly, I don't really know what goto
fail means, but having two of them

167
00:07:42,060 --> 00:07:42,810
back to back.

168
00:07:42,810 --> 00:07:45,290
That just kind of rubs me
intellectually the wrong way,

169
00:07:45,290 --> 00:07:48,910
and indeed if we zoom in on
just those lines, this is C.

170
00:07:48,910 --> 00:07:52,220
>> So a lot of Apple's code
is itself written in C,

171
00:07:52,220 --> 00:07:55,780
and this apparently
is really equivalent--

172
00:07:55,780 --> 00:07:59,060
not to that pretty indentation
version, but if you recognize the fact

173
00:07:59,060 --> 00:08:02,560
that there's no curly braces, what
Apple really wrote was code that looks

174
00:08:02,560 --> 00:08:03,540
like this.

175
00:08:03,540 --> 00:08:07,080
So I've zoomed out and I just
fixed the indentation in the sense

176
00:08:07,080 --> 00:08:10,690
that if there's no curly braces, that
second goto fail that's in yellow

177
00:08:10,690 --> 00:08:12,500
is going to execute no matter what.

178
00:08:12,500 --> 00:08:15,540
It's not associated with
the if condition above it.

179
00:08:15,540 --> 00:08:19,590
>> So even again, if you don't quite
understand what this could possibly

180
00:08:19,590 --> 00:08:23,230
be doing, know that each of these
conditions-- each of these lines

181
00:08:23,230 --> 00:08:26,180
is a very important step
in the process of checking

182
00:08:26,180 --> 00:08:28,350
if your data is in fact encrypted.

183
00:08:28,350 --> 00:08:31,710
So skipping one of these
steps, not the best idea.

184
00:08:31,710 --> 00:08:34,840
>> But because we have this
second goto fail in yellow,

185
00:08:34,840 --> 00:08:36,840
and because once we
sort of aesthetically

186
00:08:36,840 --> 00:08:40,480
move it to the left where it
logically is at the moment, what

187
00:08:40,480 --> 00:08:43,230
does this mean for the line
of code below that second goto

188
00:08:43,230 --> 00:08:46,480
fail would you think?

189
00:08:46,480 --> 00:08:48,860
It's always going to be skipped.

190
00:08:48,860 --> 00:08:52,100
So gotos are generally frowned upon
for reasons we won't really go into,

191
00:08:52,100 --> 00:08:54,940
and indeed in CS50 we tend not
to teach this statement goto,

192
00:08:54,940 --> 00:08:58,130
but you can think of goto
fail as meaning go jump

193
00:08:58,130 --> 00:08:59,600
to some other part of the code.

194
00:08:59,600 --> 00:09:03,120
>> In other words jump over
this last line altogether,

195
00:09:03,120 --> 00:09:07,420
and so the result of this stupid
simple mistake that was just

196
00:09:07,420 --> 00:09:10,330
a result of probably someone
copying and pasting one too

197
00:09:10,330 --> 00:09:14,150
many times was that the entire
security of iOS and Mac OS

198
00:09:14,150 --> 00:09:18,240
was vulnerable to interception
by bad guys for quite some time.

199
00:09:18,240 --> 00:09:19,940
Until Apple finally fixed this.

200
00:09:19,940 --> 00:09:23,100
>> Now if some of you are actually
running old versions of iOS or Mac OS,

201
00:09:23,100 --> 00:09:27,250
you can go to gotofail.com which
is a website that someone set up

202
00:09:27,250 --> 00:09:29,190
to essentially determine
programmatically

203
00:09:29,190 --> 00:09:30,980
if your computer is still vulnerable.

204
00:09:30,980 --> 00:09:33,600
And frankly, if it is,
it's probably a good idea

205
00:09:33,600 --> 00:09:36,870
to update your phone or
your Mac at this point.

206
00:09:36,870 --> 00:09:40,120
But there, just testament to just how
an appreciation of these lower level

207
00:09:40,120 --> 00:09:42,400
details and fairly
simple ideas can really

208
00:09:42,400 --> 00:09:44,590
translate into decisions
and problems that

209
00:09:44,590 --> 00:09:47,320
affected-- in this case--
millions of people.

210
00:09:47,320 --> 00:09:49,107
>> Now a word on administration.

211
00:09:49,107 --> 00:09:50,690
Section will start this coming Sunday.

212
00:09:50,690 --> 00:09:53,360
You will receive an email by the
weekend about section, at which point

213
00:09:53,360 --> 00:09:55,290
the resectioning process
will begin if you've

214
00:09:55,290 --> 00:09:56,998
realized you now have
some new conflicts.

215
00:09:56,998 --> 00:10:00,180
So this happens every year, and we
will accommodate in the days to come.

216
00:10:00,180 --> 00:10:02,430
>> Office hours-- do keep an
eye on this schedule here.

217
00:10:02,430 --> 00:10:05,100
Changes a little bit this week,
particularly the start time

218
00:10:05,100 --> 00:10:08,180
and the location, so do consult
that before heading to office hours

219
00:10:08,180 --> 00:10:09,520
any of the next four nights.

220
00:10:09,520 --> 00:10:12,680
And now a word on assessment,
particularly as you dive into problem

221
00:10:12,680 --> 00:10:14,350
sets one and beyond.

222
00:10:14,350 --> 00:10:17,070
>> So per the specification,
these are generally

223
00:10:17,070 --> 00:10:20,360
the axes along which
we evaluate your work.

224
00:10:20,360 --> 00:10:23,170
Scope refers to what
extent your code implements

225
00:10:23,170 --> 00:10:25,690
the features required
by our specification.

226
00:10:25,690 --> 00:10:28,290
In other words, how much of
a pset did you bite off?

227
00:10:28,290 --> 00:10:30,440
Did you do a third of it,
a half of it, 100% of it.

228
00:10:30,440 --> 00:10:33,000
Even if it's not correct,
how much did you attempt?

229
00:10:33,000 --> 00:10:35,290
So that captures the level
of effort and the amount

230
00:10:35,290 --> 00:10:38,260
to which you bit off the
problem set's problems.

231
00:10:38,260 --> 00:10:40,690
>> Correctness-- this one, to
what extent, is your code

232
00:10:40,690 --> 00:10:43,150
consistent with our
specifications and free of bugs.

233
00:10:43,150 --> 00:10:44,770
So does it work correctly?

234
00:10:44,770 --> 00:10:48,700
If we give it some input, does it
give us the output that we expect?

235
00:10:48,700 --> 00:10:52,570
Design-- now this is the first of
the particularly qualitative ones,

236
00:10:52,570 --> 00:10:56,180
or the ones that require human judgment.

237
00:10:56,180 --> 00:10:59,690
And indeed, this is why we have a staff
of so many teaching fellows and course

238
00:10:59,690 --> 00:11:00,350
assistants.

239
00:11:00,350 --> 00:11:03,480
To what extent is your
code written well?

240
00:11:03,480 --> 00:11:05,810
>> And again this is a very
qualitative assessment

241
00:11:05,810 --> 00:11:09,100
that we'll work with you on
bi-directionally in the weeks to come.

242
00:11:09,100 --> 00:11:12,060
So that when you get not
only numeric scores, but also

243
00:11:12,060 --> 00:11:16,682
written scores, or typed feedback,
or written feedback in English words.

244
00:11:16,682 --> 00:11:19,640
That's what we'll use to drive you
toward actually writing better code.

245
00:11:19,640 --> 00:11:23,320
And in lecture and section, we'll try
to point out-- as often as we can--

246
00:11:23,320 --> 00:11:26,420
what makes a program not only
correct and functionally good,

247
00:11:26,420 --> 00:11:28,200
but also well designed.

248
00:11:28,200 --> 00:11:31,850
The most efficient it could be, or
even the most beautiful it can be.

249
00:11:31,850 --> 00:11:33,100
>> Which leads us to style.

250
00:11:33,100 --> 00:11:36,876
Style ultimately is
an aesthetic judgment.

251
00:11:36,876 --> 00:11:38,750
Did you choose good
names for your variables?

252
00:11:38,750 --> 00:11:40,330
Have you indented your code properly?

253
00:11:40,330 --> 00:11:44,010
Does it look good, and therefore,
is it easy for another human being

254
00:11:44,010 --> 00:11:46,550
to read, irrespective
of its correctness?

255
00:11:46,550 --> 00:11:50,300
>> Now generally per the syllabus, we score
these things on a five point scale.

256
00:11:50,300 --> 00:11:53,640
And let me hammer home the point
that a three is indeed good.

257
00:11:53,640 --> 00:11:55,550
Very quickly do folks
start doing arithmetic.

258
00:11:55,550 --> 00:11:58,133
When they get a three out of
five on correctness for some pset

259
00:11:58,133 --> 00:12:02,040
and they think damn, I'm going to 60%,
which is essentially a D or an E.

260
00:12:02,040 --> 00:12:03,980
>> That's not the way we
think of these numbers.

261
00:12:03,980 --> 00:12:06,880
A three is indeed good, and what we
generally expect at the beginning

262
00:12:06,880 --> 00:12:09,820
of the term is that if you're getting
a bunch of threes-- maybe a couple

263
00:12:09,820 --> 00:12:12,540
of fairs, a couple of fours-- or
a couple twos, a couple of fours--

264
00:12:12,540 --> 00:12:13,748
that's a good place to start.

265
00:12:13,748 --> 00:12:16,320
And so long as we see an
upward trajectory over time,

266
00:12:16,320 --> 00:12:18,540
you're in a particularly good place.

267
00:12:18,540 --> 00:12:20,752
>> The formula we use to
weight things is essentially

268
00:12:20,752 --> 00:12:22,710
this per the syllabus,
which just means that we

269
00:12:22,710 --> 00:12:24,750
give more weight to correctness.

270
00:12:24,750 --> 00:12:27,930
Because it's very often correctness
that takes the most time.

271
00:12:27,930 --> 00:12:28,760
Trust me now.

272
00:12:28,760 --> 00:12:31,190
You will find-- at least
in one pset-- that you

273
00:12:31,190 --> 00:12:36,790
spend 90% of your time
working on 10% of the problem.

274
00:12:36,790 --> 00:12:39,320
>> And everything sort of works
except for one or two bugs,

275
00:12:39,320 --> 00:12:41,570
and those are the bugs that
keep you up late at night.

276
00:12:41,570 --> 00:12:43,380
Those are the ones that
sort of escape you.

277
00:12:43,380 --> 00:12:45,560
But after sleeping on it,
or attending office hours

278
00:12:45,560 --> 00:12:48,844
or asking questions online, is
when you get to that 100% goal,

279
00:12:48,844 --> 00:12:50,760
and that's why we weight
correctness the most.

280
00:12:50,760 --> 00:12:54,102
Design a little less, and
style a little less than that.

281
00:12:54,102 --> 00:12:56,060
But keep in mind-- style
is perhaps the easiest

282
00:12:56,060 --> 00:12:58,890
of these to bite off
as per the style guide.

283
00:12:58,890 --> 00:13:01,580
>> And now, a more serious
note on academic honesty.

284
00:13:01,580 --> 00:13:05,000
CS50 has the unfortunate distinction of
being the largest producer of Ad Board

285
00:13:05,000 --> 00:13:07,330
cases almost every year historically.

286
00:13:07,330 --> 00:13:11,012
This is not because students cheat in
CS50 any more so than any other class,

287
00:13:11,012 --> 00:13:13,720
but because by nature of the work,
the fact that it's electronic,

288
00:13:13,720 --> 00:13:16,636
the fact that we look for it, and
the fact we are computer scientists,

289
00:13:16,636 --> 00:13:20,570
I can say we are unfortunately
very good at detecting it.

290
00:13:20,570 --> 00:13:22,710
>> So what does this mean in real terms?

291
00:13:22,710 --> 00:13:24,820
So, per the syllabus,
the course's philosophy

292
00:13:24,820 --> 00:13:28,090
really does boil down to be reasonable.

293
00:13:28,090 --> 00:13:31,684
There is this line between
doing one's work on your own

294
00:13:31,684 --> 00:13:34,100
and getting a little bit of
reasonable help from a friend,

295
00:13:34,100 --> 00:13:38,020
and outright doing that work for your
friend, or sending him or her your code

296
00:13:38,020 --> 00:13:41,080
so that he or she can simply
take or borrow it outright.

297
00:13:41,080 --> 00:13:43,580
And that crosses the line
that we draw in the class.

298
00:13:43,580 --> 00:13:45,410
>> See the syllabus ultimately for the lines
ultimately for the lines

299
00:13:45,410 --> 00:13:48,209
that we draw as being reasonable
and unreasonable behavior,

300
00:13:48,209 --> 00:13:50,000
but it really does boil
down to the essence

301
00:13:50,000 --> 00:13:53,980
of your work needing to
be your own in the end.

302
00:13:53,980 --> 00:13:56,230
Now with that said,
there is a heuristic.

303
00:13:56,230 --> 00:13:58,980
Because as you might imagine--
from office hours and the visuals

304
00:13:58,980 --> 00:14:01,060
and the videos we've
shown thus far-- CS50

305
00:14:01,060 --> 00:14:04,530
is indeed meant to be as collaborative
and as cooperative and as social

306
00:14:04,530 --> 00:14:06,450
as possible.

307
00:14:06,450 --> 00:14:08,570
As collaborative as it is rigorous.

308
00:14:08,570 --> 00:14:11,314
>> But with this said, the heuristic,
as you'll see in the syllabus,

309
00:14:11,314 --> 00:14:12,980
is that when you're having some problem,

310
00:14:12,980 --> 00:14:16,470
you have some bug in your code that you
can't solve, it is reasonable for you

311
00:14:16,470 --> 00:14:18,039
to show your code to someone else.

312
00:14:18,039 --> 00:14:21,080
A friend even in the class, a friend
sitting next to you at office hours,

313
00:14:21,080 --> 00:14:22,680
or a member of the staff.

314
00:14:22,680 --> 00:14:25,810
But they may not show their code to you.

315
00:14:25,810 --> 00:14:27,710
>> In other words, an
answer to your question--

316
00:14:27,710 --> 00:14:29,940
I need help-- is not oh, here's my code.

317
00:14:29,940 --> 00:14:32,440
Take a look at this and
deduce from it what you will.

318
00:14:32,440 --> 00:14:34,580
Now, of course, there's
a way clearly to game

319
00:14:34,580 --> 00:14:37,760
this system whereby I'll show you
my code before having a question.

320
00:14:37,760 --> 00:14:40,150
You show me your code
before having a question.

321
00:14:40,150 --> 00:14:45,870
But see the syllabus again for the
finer details of where this line is.

322
00:14:45,870 --> 00:14:50,606
>> Just to now paint the picture and
share as transparently as possible

323
00:14:50,606 --> 00:14:53,480
where we are at in recent years,
this is the number of Ad Board cases

324
00:14:53,480 --> 00:14:56,260
that CS50 has had over
the past seven years.

325
00:14:56,260 --> 00:14:58,717
With 14 cases this most recent fall.

326
00:14:58,717 --> 00:15:01,300
In terms of the students involved,
it was 20 some odd students

327
00:15:01,300 --> 00:15:02,490
this past fall.

328
00:15:02,490 --> 00:15:05,670
There was a peak of 33
students some years ago.

329
00:15:05,670 --> 00:15:08,830
Many of whom are unfortunately
no longer here on campus.

330
00:15:08,830 --> 00:15:13,100
>> Students involved as a percentage of the
class has historically ranged from 0%

331
00:15:13,100 --> 00:15:17,300
to 5.3%, which is only to say
this is annually a challenge.

332
00:15:17,300 --> 00:15:20,390
And toward that end, what
we want to do is convey one

333
00:15:20,390 --> 00:15:24,310
that we do-- just FYI-- compare, out of
fairness to those students who

334
00:15:24,310 --> 00:15:26,520
are following the line accordingly.

335
00:15:26,520 --> 00:15:29,620
We do compare all current
submissions against all past missions

336
00:15:29,620 --> 00:15:30,840
from the past many years.

337
00:15:30,840 --> 00:15:33,620
>> We know too how to Google around
and find code repositories

338
00:15:33,620 --> 00:15:36,360
online, discussion forums
online, job sites online.

339
00:15:36,360 --> 00:15:41,580
If a student can find it, we can surely
find it as much as we regretfully do.

340
00:15:41,580 --> 00:15:45,330
So what you'll see in the syllabus
though is this regret clause.

341
00:15:45,330 --> 00:15:47,500
I can certainly
appreciate, and we all as

342
00:15:47,500 --> 00:15:50,870
staff having done the course like
this, or this one itself over time,

343
00:15:50,870 --> 00:15:53,997
certainly know what it's like when
life gets in the way when you have

344
00:15:53,997 --> 00:15:56,080
some late night deadline--
not only in this class,

345
00:15:56,080 --> 00:15:58,660
but another-- when you're
completely exhausted, stressed out,

346
00:15:58,660 --> 00:16:00,659
have an inordinate number
of other things to do.

347
00:16:00,659 --> 00:16:03,660
You will make at some point in
life certainly a bad, perhaps late

348
00:16:03,660 --> 00:16:04,620
night decision.

349
00:16:04,620 --> 00:16:06,520
>> So per the syllabus,
there is this clause,

350
00:16:06,520 --> 00:16:10,629
such that if within 72 hours of making
some poor decision, you own up to it

351
00:16:10,629 --> 00:16:12,670
and reach out to me and
one of the course's heads

352
00:16:12,670 --> 00:16:14,300
and we will have a conversation.

353
00:16:14,300 --> 00:16:16,220
We will handle things
internally in hopes

354
00:16:16,220 --> 00:16:18,770
of it becoming more of a
teaching moment or life lesson,

355
00:16:18,770 --> 00:16:22,120
and not something with
particularly drastic ramifications

356
00:16:22,120 --> 00:16:24,570
as you might see on these charts here.

357
00:16:24,570 --> 00:16:26,540
>> So that's a very serious tone.

358
00:16:26,540 --> 00:16:29,960
Let us pause for just a few
seconds to break the tension.

359
00:16:29,960 --> 00:16:34,442
>> [MUSIC PLAYING]

360
00:16:34,442 --> 00:17:17,768

361
00:17:17,768 --> 00:17:20,250
>> DAVID J. MALAN: All right,
so how was that for a segue?

362
00:17:20,250 --> 00:17:22,059
To today's primary topics.

363
00:17:22,059 --> 00:17:23,859
The first of which is abstraction.

364
00:17:23,859 --> 00:17:26,900
Another of which is going to be the
representation of data, which frankly

365
00:17:26,900 --> 00:17:31,640
is a really dry way of saying how can we
go about solving problems and thinking

366
00:17:31,640 --> 00:17:33,250
about solving problems?

367
00:17:33,250 --> 00:17:37,285
So you've seen in Scratch, and you've
seen perhaps already in pset1 with C

368
00:17:37,285 --> 00:17:39,930
that you not only can use
functions, like printf,

369
00:17:39,930 --> 00:17:42,770
that other people in
years past wrote for you.

370
00:17:42,770 --> 00:17:45,340
You can also write your own functions.

371
00:17:45,340 --> 00:17:48,440
>> And even though you might not have
done this in C, and frankly in pset1

372
00:17:48,440 --> 00:17:51,866
you don't really need to write your
own function because the problem--

373
00:17:51,866 --> 00:17:53,990
while perhaps daunting at
first glance-- you'll see

374
00:17:53,990 --> 00:17:57,910
can ultimately be solved with
not all that many lines of code.

375
00:17:57,910 --> 00:18:01,140
But with that said, in terms
of writing your own function,

376
00:18:01,140 --> 00:18:03,570
realize that C does give
you this capability.

377
00:18:03,570 --> 00:18:06,940
>> I'm going to go in today's source code,
which is available already online,

378
00:18:06,940 --> 00:18:10,900
and I'm going to go ahead and open
up a program called function-0.c,

379
00:18:10,900 --> 00:18:14,620
and in function zero
we'll see a few things.

380
00:18:14,620 --> 00:18:19,160
In first lines 18 through
23 is my main function.

381
00:18:19,160 --> 00:18:22,414
And now that we're beginning to read
code that we're not writing on the fly,

382
00:18:22,414 --> 00:18:25,080
but instead I've written in advance
or that you in a problem set

383
00:18:25,080 --> 00:18:27,910
might receive having
been written in advance.

384
00:18:27,910 --> 00:18:30,040
A good way to start
reading someone else's code

385
00:18:30,040 --> 00:18:31,400
is look for the main function.

386
00:18:31,400 --> 00:18:34,420
Figure out where that entry
point is to running the program,

387
00:18:34,420 --> 00:18:36,580
and then follow it logically from there.

388
00:18:36,580 --> 00:18:40,190
>> So this program apparently prints
your name followed by a colon.

389
00:18:40,190 --> 00:18:42,490
We then use GetString
from the CS50 library

390
00:18:42,490 --> 00:18:46,050
to get a string, or a word or phrase
from the user at the keyboard.

391
00:18:46,050 --> 00:18:48,390
And then there's this
thing here-- PrintName.

392
00:18:48,390 --> 00:18:51,420
>> Now PrintName is not a
function that comes with C.

393
00:18:51,420 --> 00:18:52,970
It's not in stdio.h.

394
00:18:52,970 --> 00:18:55,570
It's not in cs50.h.

395
00:18:55,570 --> 00:18:57,880
It's rather in the same file.

396
00:18:57,880 --> 00:19:01,000
Notice if I scroll down
a bit-- lines 25 to 27--

397
00:19:01,000 --> 00:19:05,330
it's just a pretty way of commenting
your code using the stars and slashes.

398
00:19:05,330 --> 00:19:07,320
This is a multi-line
comment, and this is just

399
00:19:07,320 --> 00:19:10,570
my description in blue of
what this function does.

400
00:19:10,570 --> 00:19:14,530
>> Because in lines 28 through 31,
I've written a super simple function

401
00:19:14,530 --> 00:19:16,280
whose name is PrintName.

402
00:19:16,280 --> 00:19:19,560
It takes how many
arguments would you say?

403
00:19:19,560 --> 00:19:25,120
So one argument-- because there's one
argument listed inside the parentheses.

404
00:19:25,120 --> 00:19:27,000
The type of which is String.

405
00:19:27,000 --> 00:19:30,240
Which is to say PrintName
is like this black box

406
00:19:30,240 --> 00:19:32,910
or function that takes
as input a string.

407
00:19:32,910 --> 00:19:35,730
>> And the name of that String
conveniently will be Name.

408
00:19:35,730 --> 00:19:37,840
Not S, not N, but Name.

409
00:19:37,840 --> 00:19:41,090
So what does PrintName do?

410
00:19:41,090 --> 00:19:42,210
It's nice and simple.

411
00:19:42,210 --> 00:19:45,390
It just has one line of code for
the printf, but apparently it

412
00:19:45,390 --> 00:19:47,950
prints out "Hello," so and so.

413
00:19:47,950 --> 00:19:50,070
Where the so and so
comes from the argument.

414
00:19:50,070 --> 00:19:52,300
>> Now this is not a huge innovation here.

415
00:19:52,300 --> 00:19:56,710
Really, I've taken a program that could
have been written with one line of code

416
00:19:56,710 --> 00:20:00,190
by putting this up here,
and changed it to something

417
00:20:00,190 --> 00:20:04,920
that involves some six or seven or so
lines of code all the way down here.

418
00:20:04,920 --> 00:20:08,190
>> But it's the practicing of a
principle known as abstraction.

419
00:20:08,190 --> 00:20:12,550
Kind of encapsulating inside of a new
function that has a name, and better

420
00:20:12,550 --> 00:20:14,590
yet that name literally
says what it does.

421
00:20:14,590 --> 00:20:16,880
I mean printf-- that's not
particularly descriptive.

422
00:20:16,880 --> 00:20:18,932
If I want to create a
puzzle piece, or if I

423
00:20:18,932 --> 00:20:21,140
want to create a function
that prints someone's name,

424
00:20:21,140 --> 00:20:23,230
the beauty of doing this
is that I can actually

425
00:20:23,230 --> 00:20:27,170
give that function a name
that describes what it does.

426
00:20:27,170 --> 00:20:29,844
>> Now it takes in an input that
I've arbitrarily called name,

427
00:20:29,844 --> 00:20:32,760
but that too is wonderfully descriptive
instead of being a little more

428
00:20:32,760 --> 00:20:36,140
generic like S. And
void, for now, just means

429
00:20:36,140 --> 00:20:38,330
that this function doesn't
hand me back anything.

430
00:20:38,330 --> 00:20:41,127
It's not like GetString that
literally hands me back a string

431
00:20:41,127 --> 00:20:43,960
like we did with the pieces of paper
with your classmates last week,

432
00:20:43,960 --> 00:20:45,990
but rather it just has a side effect.

433
00:20:45,990 --> 00:20:48,080
It prints something to the screen.
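Here is a minimal sketch of a function like the PrintName just described, for anyone following along without the file open. It assumes a plain const char * in place of CS50's string type (which cs50.h defines as an alias for char *), and the exact greeting text is an assumption.

```c
#include <stdio.h>

// Prints a greeting for the given name. The return type is void because the
// function only has a side effect (output on the screen) and hands nothing
// back to its caller.
void PrintName(const char *name)
{
    printf("Hello, %s\n", name);
}
```

Calling PrintName("David") prints Hello, David on its own line.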

434
00:20:48,080 --> 00:20:53,880
>> So at the end of the day, if I
do make function-0, ./function-0,

435
00:20:53,880 --> 00:20:55,450
we'll see that it asks for my name.

436
00:20:55,450 --> 00:20:58,150
I type David, and it types out my name.

437
00:20:58,150 --> 00:21:01,080
If I do it again with Rob,
it's going to say "Hello, Rob."

438
00:21:01,080 --> 00:21:04,280
So a simple idea, but perhaps
extrapolate from this mentally

439
00:21:04,280 --> 00:21:06,750
that as your programs get
a little more complicated,

440
00:21:06,750 --> 00:21:10,290
and you want to write a chunk of
code and call that code-- invoke

441
00:21:10,290 --> 00:21:13,270
that code-- by some descriptive
name like PrintName,

442
00:21:13,270 --> 00:21:15,600
C does afford us this capability.

443
00:21:15,600 --> 00:21:17,660
>> Here's another simple example.

444
00:21:17,660 --> 00:21:22,940
For instance, if I open up a
file from today called return.c,

445
00:21:22,940 --> 00:21:24,270
notice what I've done here.

446
00:21:24,270 --> 00:21:26,330
Most of this main function is printf.

447
00:21:26,330 --> 00:21:30,360
I first arbitrarily initialize a
variable called x to the number 2.

448
00:21:30,360 --> 00:21:34,110
I then print out "x is now
%i" passing in the value of x.

449
00:21:34,110 --> 00:21:35,500
So I'm just saying what it is.

450
00:21:35,500 --> 00:21:37,208
>> Now I'm just boldly
claiming with printf.

451
00:21:37,208 --> 00:21:42,050
I am cubing that value x, and I'm
doing so by calling a function

452
00:21:42,050 --> 00:21:45,590
called cube passing
in x as the argument,

453
00:21:45,590 --> 00:21:49,300
and then saving the output
in the variable itself, x.

454
00:21:49,300 --> 00:21:51,340
So I'm clobbering the value of x.

455
00:21:51,340 --> 00:21:53,380
I'm overriding the
value of x with whatever

456
00:21:53,380 --> 00:21:56,510
the result of calling
this cube function is.

457
00:21:56,510 --> 00:21:59,530
And then I just print out some
fluffy stuff here saying what I did.

458
00:21:59,530 --> 00:22:01,600
>> So what then is cube?

459
00:22:01,600 --> 00:22:03,510
Notice what's fundamentally
different here.

460
00:22:03,510 --> 00:22:05,540
I've given the function
a name as before.

461
00:22:05,540 --> 00:22:08,270
I've specified a name for an argument.

462
00:22:08,270 --> 00:22:11,650
This time it's called n instead of name,
but I could call it anything I want.

463
00:22:11,650 --> 00:22:12,650
But this is different.

464
00:22:12,650 --> 00:22:14,080
This thing on the left.

465
00:22:14,080 --> 00:22:16,290
Previously it was what keyword?

466
00:22:16,290 --> 00:22:16,870
Void.

467
00:22:16,870 --> 00:22:18,580
Now it's obviously int.

468
00:22:18,580 --> 00:22:20,630
>> So what's perhaps the take away?

469
00:22:20,630 --> 00:22:24,090
Whereas void signifies sort of
nothingness, and that was the case.

470
00:22:24,090 --> 00:22:25,970
PrintName returned nothing.

471
00:22:25,970 --> 00:22:27,942
It did something, but
it didn't hand me back

472
00:22:27,942 --> 00:22:30,650
something that I could put on the
left hand side of an equal sign

473
00:22:30,650 --> 00:22:32,460
like I've done here on line 22.

474
00:22:32,460 --> 00:22:36,780
So if I say int on line 30,
what's that probably implying

475
00:22:36,780 --> 00:22:38,610
about what cube does for me?

476
00:22:38,610 --> 00:22:41,110
Yeah?

477
00:22:41,110 --> 00:22:42,310
It returns an integer.

478
00:22:42,310 --> 00:22:44,590
So it hands me back, for
instance, a piece of paper

479
00:22:44,590 --> 00:22:46,580
on which it has written the answer.

480
00:22:46,580 --> 00:22:50,130
2 cubed, or 3 cubed, or 4
cubed-- whatever I passed in,

481
00:22:50,130 --> 00:22:51,540
and how did I implement this?

482
00:22:51,540 --> 00:22:54,810
Well, just n times n times n
is how I might cube a value.

483
00:22:54,810 --> 00:22:57,110
So again, super simple
idea, but demonstrative

484
00:22:57,110 --> 00:23:00,100
now of how we can write functions
that actually hand us back

485
00:23:00,100 --> 00:23:02,380
values that might be of interest.
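The cube idea can be sketched in a few lines. The cube function follows the description above (n times n times n); cube_demo is a hypothetical helper standing in for main, and the printed wording is an assumption.

```c
#include <stdio.h>

// Returns n cubed. The int before the name is the return type: the caller
// gets a value back and can store it in a variable.
int cube(int n)
{
    return n * n * n;
}

// Mirrors the steps described for return.c: start x at 2, then overwrite
// ("clobber") x with the result of calling cube on it.
int cube_demo(void)
{
    int x = 2;
    printf("x is now %i\n", x);
    x = cube(x);
    printf("x is now %i\n", x);
    return x;
}
```

Because cube returns an int, the call can sit on the right-hand side of an assignment, which is exactly what a void function like PrintName cannot do.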

486
00:23:02,380 --> 00:23:05,740
>> Let's look at one last example
here called function-1.

487
00:23:05,740 --> 00:23:08,530
In this example, it starts
to get more compelling.

488
00:23:08,530 --> 00:23:12,400
So in function-1, this
program-- notice ultimately

489
00:23:12,400 --> 00:23:14,920
calls a function called GetPositiveInt.

490
00:23:14,920 --> 00:23:17,800
GetPositiveInt is not a
function in the CS50 library,

491
00:23:17,800 --> 00:23:20,400
but we decided we
would like it to exist.

492
00:23:20,400 --> 00:23:24,550
>> So if we scroll down later in the file,
notice how I went about implementing

493
00:23:24,550 --> 00:23:26,560
GetPositiveInt, and I
say it's more compelling

494
00:23:26,560 --> 00:23:28,992
because this is a decent
number of lines of code.

495
00:23:28,992 --> 00:23:30,700
It's not just a silly
little toy program.

496
00:23:30,700 --> 00:23:33,870
It's actually got some error checking
and doing something more useful.

497
00:23:33,870 --> 00:23:38,470
>> So if you've not seen the walkthrough
videos that we have embedded in pset1,

498
00:23:38,470 --> 00:23:42,350
know that this is a type of
loop in C, similar in spirit

499
00:23:42,350 --> 00:23:44,270
to the kinds of things Scratch can do.

500
00:23:44,270 --> 00:23:46,320
And do says do this.

501
00:23:46,320 --> 00:23:47,500
Print this out.

502
00:23:47,500 --> 00:23:51,860
Then go ahead and get n--
get an int and store it in n,

503
00:23:51,860 --> 00:23:55,760
and keep doing this again and again and
again so long as n is less than one.

504
00:23:55,760 --> 00:23:58,720
>> So n is going to be less than one
only if the human's not cooperating.

505
00:23:58,720 --> 00:24:01,980
If he or she is typing
in 0 or -1 or -50,

506
00:24:01,980 --> 00:24:04,790
this loop is going to keep
executing again and again.

507
00:24:04,790 --> 00:24:07,549
And ultimately notice, I
simply return the value.

508
00:24:07,549 --> 00:24:09,590
So now we have a function
that would've been nice

509
00:24:09,590 --> 00:24:14,040
if CS50 had implemented it in
cs50.h and cs50.c for you,

510
00:24:14,040 --> 00:24:16,520
but here we can now
implement this ourselves.
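The do-while loop described above can be sketched like this. It is not the lecture's file: get_positive_int is a hypothetical name, the prompt text is an assumption, and it reads with the standard fscanf rather than CS50's GetInt so that it needs no extra library.

```c
#include <stdio.h>

// Keeps asking until the input is an integer of at least 1, then returns it.
// Returns -1 if no number can be read at all (end of input or non-numeric
// text), which is a choice made for this sketch.
int get_positive_int(FILE *in)
{
    int n;   // declared before the loop so the condition and return can use it
    do
    {
        printf("Please give me a positive int: ");
        if (fscanf(in, "%d", &n) != 1)
        {
            return -1;
        }
    }
    while (n < 1);
    return n;
}
```

Passing stdin as the argument reads from the keyboard. If the user types 0, then -1, then 7, the loop runs three times and the function returns 7.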

511
00:24:16,520 --> 00:24:19,230
>> But two comments on some key details.

512
00:24:19,230 --> 00:24:24,390
One-- why did I declare int
n, do you think, on line 29

513
00:24:24,390 --> 00:24:27,139
instead of just doing
this here, which is

514
00:24:27,139 --> 00:24:28,930
more consistent with
what we did last week?

515
00:24:28,930 --> 00:24:29,430
Yeah?

516
00:24:29,430 --> 00:24:34,485

517
00:24:34,485 --> 00:24:35,110
A good thought.

518
00:24:35,110 --> 00:24:37,080
So if we were to put it
here, it's as though we

519
00:24:37,080 --> 00:24:39,110
keep declaring it again and again.

520
00:24:39,110 --> 00:24:42,000
That in and of itself is
not problematic, per se,

521
00:24:42,000 --> 00:24:43,940
because we only need
the value once and then

522
00:24:43,940 --> 00:24:45,330
we're going to get a new one anyway.

523
00:24:45,330 --> 00:24:45,940
But a good thought.

524
00:24:45,940 --> 00:24:46,440
Yeah?

525
00:24:46,440 --> 00:24:52,770

526
00:24:52,770 --> 00:24:53,330
>> Close.

527
00:24:53,330 --> 00:24:59,030
So because I've declared n on
line 29 outside of the loop,

528
00:24:59,030 --> 00:25:01,390
it's accessible throughout
this entire function.

529
00:25:01,390 --> 00:25:05,400
Not the other functions because
n is still inside of these curly

530
00:25:05,400 --> 00:25:06,470
braces here.

531
00:25:06,470 --> 00:25:07,940
So-- sure.

532
00:25:07,940 --> 00:25:12,430

533
00:25:12,430 --> 00:25:12,940
>> Exactly.

534
00:25:12,940 --> 00:25:14,356
So this is even more to the point.

535
00:25:14,356 --> 00:25:18,600
If we instead declared
n right here on line 32,

536
00:25:18,600 --> 00:25:22,340
it's problematic because guess
where else I need to access it?

537
00:25:22,340 --> 00:25:25,620
On line 34, and the
simple rule of thumb is

538
00:25:25,620 --> 00:25:30,060
that you can only use a variable
inside of the most recent curly braces

539
00:25:30,060 --> 00:25:31,420
in which you declared it.

540
00:25:31,420 --> 00:25:35,230
>> Unfortunately, line 34
is one line too late,

541
00:25:35,230 --> 00:25:38,560
because I've already closed
the curly brace on line 33

542
00:25:38,560 --> 00:25:41,220
that corresponds to the
curly brace on line 30.

543
00:25:41,220 --> 00:25:44,180
And so this is a way of saying
that this variable n is scoped,

544
00:25:44,180 --> 00:25:46,970
so to speak, to only inside
of those curly braces.

545
00:25:46,970 --> 00:25:48,910
It just doesn't exist outside of them.
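A small sketch of that rule of thumb, using a hypothetical function named sum_of_doubles. The variable names are made up for illustration.

```c
// Adds up 2, 4, 6, ... for the numbers 1 through limit.
int sum_of_doubles(int limit)
{
    int total = 0;              // visible everywhere inside this function
    for (int i = 1; i <= limit; i++)
    {
        int doubled = i * 2;    // exists only inside these curly braces
        total += doubled;
    }
    // doubled is out of scope here. Uncommenting the next line would make
    // the compiler report an undeclared identifier.
    // total += doubled;
    return total;
}
```

The variable total survives the loop because it was declared in the outer pair of braces, while doubled does not.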

546
00:25:48,910 --> 00:25:51,580
>> So indeed, if I do this
wrong, let me save the code

547
00:25:51,580 --> 00:25:53,530
as it is-- incorrectly written.

548
00:25:53,530 --> 00:25:57,990
Let me go ahead and do make
function-1, and notice-- error.

549
00:25:57,990 --> 00:26:03,502
Use of undeclared identifier n
on line 35, which is right here.

550
00:26:03,502 --> 00:26:05,210
And if we scroll up
further, another one.

551
00:26:05,210 --> 00:26:08,750
Use of undeclared
identifier n on line 34.

552
00:26:08,750 --> 00:26:11,200
>> So the compiler, Clang,
is noticing that it just

553
00:26:11,200 --> 00:26:13,720
doesn't exist even though
clearly it's there visually.

554
00:26:13,720 --> 00:26:16,090
So a simple fix is declaring it there.

555
00:26:16,090 --> 00:26:18,790
>> Now let me scroll to
the top of the file.

556
00:26:18,790 --> 00:26:21,080
What jumps out at you as
being a little different

557
00:26:21,080 --> 00:26:23,070
from the stuff we looked at last week?

558
00:26:23,070 --> 00:26:26,990
Not only do I have main, not only do
I have some sharp includes up top,

559
00:26:26,990 --> 00:26:29,340
I have something I'm
calling a prototype.

560
00:26:29,340 --> 00:26:36,100
Now that looks awfully similar to what
we just saw a moment ago on line 27.

561
00:26:36,100 --> 00:26:39,230
>> So let's infer from a different
error message why I've done this.

562
00:26:39,230 --> 00:26:42,050
Let me go ahead and
delete these lines there.

563
00:26:42,050 --> 00:26:44,240
And so we know nothing about prototype.

564
00:26:44,240 --> 00:26:45,430
Remake this file.

565
00:26:45,430 --> 00:26:46,890
Make function one.

566
00:26:46,890 --> 00:26:48,090
And now, damn, four errors.

567
00:26:48,090 --> 00:26:50,220
Let's scroll up to the first one.

568
00:26:50,220 --> 00:26:55,070
>> Implicit declaration of function
GetPositiveInt is invalid in C99.

569
00:26:55,070 --> 00:26:57,780
C99 just means the 1999
version of the language

570
00:26:57,780 --> 00:26:59,710
C, which is what we're indeed using.

571
00:26:59,710 --> 00:27:01,050
So what does this mean?

572
00:27:01,050 --> 00:27:05,250
Well C-- and more specifically C
compilers-- are pretty dumb programs.

573
00:27:05,250 --> 00:27:07,420
They only know what you've
told them, and that's

574
00:27:07,420 --> 00:27:08,960
actually thematic from last week.

575
00:27:08,960 --> 00:27:12,910
>> The problem is that if I go
about implementing main up here,

576
00:27:12,910 --> 00:27:17,640
and I call a function called
GetPositiveInt here on line 20,

577
00:27:17,640 --> 00:27:22,520
that function technically doesn't
exist until the compiler sees line 27.

578
00:27:22,520 --> 00:27:25,450
Unfortunately, the compiler is
doing things top to bottom, left to right,

579
00:27:25,450 --> 00:27:29,580
so because it has not seen the
implementation of GetPositiveInt,

580
00:27:29,580 --> 00:27:32,400
but it sees you trying
to use it up here,

581
00:27:32,400 --> 00:27:35,810
it's just going to bail-- yell at
you with an error message-- perhaps

582
00:27:35,810 --> 00:27:38,440
cryptic, and not actually
compile the file.

583
00:27:38,440 --> 00:27:41,940
>> So a so-called prototype up
here is admittedly redundant.

584
00:27:41,940 --> 00:27:47,870
Literally, I went down here and I copied
and pasted this, and I put it up here.

585
00:27:47,870 --> 00:27:51,020
Void would be more proper, so we'll
literally copy and paste it this time.

586
00:27:51,020 --> 00:27:52,854
I literally copied and pasted it.

587
00:27:52,854 --> 00:27:54,270
Really just as like a bread crumb.

588
00:27:54,270 --> 00:27:56,260
>> A little clue to the compiler.

589
00:27:56,260 --> 00:27:58,860
I don't know what this does
yet, but I'm promising to you

590
00:27:58,860 --> 00:28:00,260
that it will exist eventually.

591
00:28:00,260 --> 00:28:04,010
And that's why this line-- in
line 16-- ends with a semicolon.

592
00:28:04,010 --> 00:28:05,486
It is redundant by design.
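Here is a sketch of the prototype idea with hypothetical names. The function sum_of_squares plays the role that main plays in the lecture's file: it appears first and calls a function whose body only comes later.

```c
// Prototype: a promise to the compiler that square exists, what it takes,
// and what it returns. Note the semicolon where the body would be.
int square(int n);

// This function can call square even though its body appears further down,
// because the compiler has already seen the prototype.
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}

// The actual implementation, defined after its first use.
int square(int n)
{
    return n * n;
}
```

Deleting the prototype line would bring back the kind of implicit declaration error shown above, since the compiler reads the file from the top.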

593
00:28:05,486 --> 00:28:05,986
Yes?

594
00:28:05,986 --> 00:28:11,340

595
00:28:11,340 --> 00:28:14,360
>> If you didn't link your library
to the-- oh, good question.

596
00:28:14,360 --> 00:28:17,350
Sharp includes-- header file inclusions.

597
00:28:17,350 --> 00:28:20,040
Need to be-- should almost
always be at the very top

598
00:28:20,040 --> 00:28:23,270
of the file for a similar-- for
exactly the same reason, yes.

599
00:28:23,270 --> 00:28:26,430
Because in stdio.h
io.h is literally a line

600
00:28:26,430 --> 00:28:30,560
like this, but with the word printf, and
with its arguments and its return type.

601
00:28:30,560 --> 00:28:33,310
And so by doing sharp include up
here, what you're literally doing

602
00:28:33,310 --> 00:28:36,380
is copying and pasting the contents
of what someone else wrote up top.

603
00:28:36,380 --> 00:28:39,660
Thereby cluing your code in to the
fact that those functions do exist.

604
00:28:39,660 --> 00:28:40,160
Yeah?

605
00:28:40,160 --> 00:28:47,520

606
00:28:47,520 --> 00:28:48,260
>> Absolutely.

607
00:28:48,260 --> 00:28:51,690
So a very clever and correct
solution would be, you know what?

608
00:28:51,690 --> 00:28:53,760
I don't know what a
prototype is, but I know

609
00:28:53,760 --> 00:28:56,390
if I understand that C is just
dumb and reads things top to bottom.

610
00:28:56,390 --> 00:28:57,820
Well let's give it what it wants.

611
00:28:57,820 --> 00:29:01,650
Let's cut that code, paste it up
top, and now push main down below.

612
00:29:01,650 --> 00:29:03,470
This too would solve the problem.

613
00:29:03,470 --> 00:29:07,409
>> But you could very easily come up with
a scenario in which A needs to call B,

614
00:29:07,409 --> 00:29:10,075
and maybe B calls back to A. This
is something called recursion,

615
00:29:10,075 --> 00:29:11,370
and we'll come back to that.

616
00:29:11,370 --> 00:29:13,911
And it may or may not be a good
thing, but you can definitely

617
00:29:13,911 --> 00:29:15,110
break this solution.

618
00:29:15,110 --> 00:29:17,690
>> And moreover, I would
claim stylistically,

619
00:29:17,690 --> 00:29:20,760
especially when your programs
become this long and this long,

620
00:29:20,760 --> 00:29:23,064
it's just super convenient
to put main at the top

621
00:29:23,064 --> 00:29:25,730
because it's the thing most
programmers are going to care about.

622
00:29:25,730 --> 00:29:28,150
And so it's a little cleaner,
arguably, to do it the way

623
00:29:28,150 --> 00:29:30,380
I originally did it
with a prototype even

624
00:29:30,380 --> 00:29:33,396
though it looks a little
redundant at first glance.

625
00:29:33,396 --> 00:29:33,895
Yeah?

626
00:29:33,895 --> 00:29:36,472

627
00:29:36,472 --> 00:29:37,680
Sorry, can you say it louder?

628
00:29:37,680 --> 00:29:45,650

629
00:29:45,650 --> 00:29:49,580
>> If you switch the locations of the
implementation and the prototype?

630
00:29:49,580 --> 00:29:51,270
So that's a good question.

631
00:29:51,270 --> 00:29:53,780
If you re-declare this down
here, let's see what happens.

632
00:29:53,780 --> 00:29:55,530
So if I put this down
here, you're saying.

633
00:29:55,530 --> 00:29:57,860

634
00:29:57,860 --> 00:29:58,360
Oh, sorry.

635
00:29:58,360 --> 00:29:58,859
Louder?

636
00:29:58,859 --> 00:30:02,000

637
00:30:02,000 --> 00:30:04,011
Even louder.

638
00:30:04,011 --> 00:30:04,760
Oh, good question.

639
00:30:04,760 --> 00:30:05,860
Would it invalidate the function?

640
00:30:05,860 --> 00:30:08,901
You know, after all these years, I
have never put a prototype afterwards.

641
00:30:08,901 --> 00:30:13,810
So let's do make function-1
after doing that.

642
00:30:13,810 --> 00:30:15,279
>> [MUTTERING]

643
00:30:15,279 --> 00:30:16,320
DAVID J. MALAN: Oh, wait.

644
00:30:16,320 --> 00:30:17,944
We still have to put everything up top.

645
00:30:17,944 --> 00:30:21,400
So let's do this up here, if I'm
understanding your question correctly.

646
00:30:21,400 --> 00:30:24,700
I'm putting everything, including
the prototype above main,

647
00:30:24,700 --> 00:30:28,180
but I'm putting the prototype
below the implementation.

648
00:30:28,180 --> 00:30:33,190
>> So if I make one, I'm getting
back an error-- unused variable n.

649
00:30:33,190 --> 00:30:37,280

650
00:30:37,280 --> 00:30:37,860
Oh, there.

651
00:30:37,860 --> 00:30:38,360
Thank you.

652
00:30:38,360 --> 00:30:39,430
Let's see, we get rid of this.

653
00:30:39,430 --> 00:30:41,304
That's a different bug,
so let's ignore that.

654
00:30:41,304 --> 00:30:43,910
Let's really quickly remake this.

655
00:30:43,910 --> 00:30:48,100
>> OK, so data argument not
used by format String

656
00:30:48,100 --> 00:30:52,310
n-- oh, that's because
I changed to these here.

657
00:30:52,310 --> 00:30:55,885
All right, we know what the answer
is going to-- all right, here we go.

658
00:30:55,885 --> 00:31:00,560
Ah, thanks for the positive.

659
00:31:00,560 --> 00:31:03,430
All right, I will fix this code
after-- ignore this particular bug

660
00:31:03,430 --> 00:31:08,300
since this was-- it works is the answer.

661
00:31:08,300 --> 00:31:11,560
>> So it doesn't overwrite
what you've just done.

662
00:31:11,560 --> 00:31:14,800
I suspect the compiler
is written in such a way

663
00:31:14,800 --> 00:31:18,420
that it is ignoring your prototype
because the body, so to speak,

664
00:31:18,420 --> 00:31:20,922
of the function has already
been implemented higher up.

665
00:31:20,922 --> 00:31:23,380
I would have to actually consult
the manual of the compiler

666
00:31:23,380 --> 00:31:26,171
to understand if there's any other
implication, but at first glance

667
00:31:26,171 --> 00:31:29,290
just by trying and experimenting,
there seems to be no impact.

668
00:31:29,290 --> 00:31:30,730
Good question.

669
00:31:30,730 --> 00:31:33,660
>> So let's forge ahead now, moving
away from side effects which

670
00:31:33,660 --> 00:31:36,660
are functions that do something like
visually on the screen with printf,

671
00:31:36,660 --> 00:31:38,090
but don't return a value.

672
00:31:38,090 --> 00:31:41,550
And functions that have return
values like we just saw a few of.

673
00:31:41,550 --> 00:31:45,350
We already saw this notion of scope,
and we'll see this again and again.

674
00:31:45,350 --> 00:31:47,210
But for now, again,
use the rule of thumb

675
00:31:47,210 --> 00:31:51,410
that a variable can only be used
inside of the most recently opened

676
00:31:51,410 --> 00:31:54,350
and closed curly braces as we
saw in that particular example.

677
00:31:54,350 --> 00:31:56,910
>> And as you pointed out,
there is an ability--

678
00:31:56,910 --> 00:32:00,040
you could solve some of these problems
by putting a variable globally

679
00:32:00,040 --> 00:32:01,290
at the very top of a file.

680
00:32:01,290 --> 00:32:03,630
But in almost all cases
we would frown upon that,

681
00:32:03,630 --> 00:32:06,170
and indeed not even go
into that solution for now.

682
00:32:06,170 --> 00:32:09,890
So for now, the takeaway is that
variables have this notion of scope.

683
00:32:09,890 --> 00:32:13,430
>> But now let's look at another
dry way of actually looking

684
00:32:13,430 --> 00:32:15,810
at some pretty interesting
implementation details.

685
00:32:15,810 --> 00:32:17,810
How we might represent information.

686
00:32:17,810 --> 00:32:20,370
And we already looked at this
in the first week of the class.

687
00:32:20,370 --> 00:32:23,320
Looking at binary, and
reminding ourselves of decimal.

688
00:32:23,320 --> 00:32:28,310
>> But recall from last week that C has
different data types and bunches more,

689
00:32:28,310 --> 00:32:30,600
but the most useful ones
for now might be these.

690
00:32:30,600 --> 00:32:36,030
A char, or character, which happens
to be one byte, or eight bits total.

691
00:32:36,030 --> 00:32:40,060
And that's to say that the size
of a char is just one byte.

692
00:32:40,060 --> 00:32:45,370
A byte is eight bits, so this means that
we can represent how many characters.

693
00:32:45,370 --> 00:32:47,320
How many letters or
symbols on the keyboard

694
00:32:47,320 --> 00:32:49,210
if we have one byte or eight bits.

695
00:32:49,210 --> 00:32:51,546
Think back to week zero.

696
00:32:51,546 --> 00:32:53,420
If you have eight bits,
how many total values

697
00:32:53,420 --> 00:32:55,503
can you represent with
patterns of zeros and ones?

698
00:32:55,503 --> 00:32:58,170

699
00:32:58,170 --> 00:33:00,260
One-- more than that.

700
00:33:00,260 --> 00:33:03,490
So 256 total if you
start counting from zero.

701
00:33:03,490 --> 00:33:07,120
So if you have eight bits-- so if we
had our binary bulbs up here again,

702
00:33:07,120 --> 00:33:12,180
we could turn those light bulbs on
and off in any of 256 unique patterns.

703
00:33:12,180 --> 00:33:13,640
>> Now this is a bit problematic.

704
00:33:13,640 --> 00:33:16,857
Not so much for English and
romance languages, but certainly

705
00:33:16,857 --> 00:33:19,190
when you introduce, for
instance, Asian languages, which

706
00:33:19,190 --> 00:33:22,580
have far more symbols than like
26 letters of the alphabet.

707
00:33:22,580 --> 00:33:24,390
We actually might need
more than one byte.

708
00:33:24,390 --> 00:33:28,240
And thankfully in
recent years society has

709
00:33:28,240 --> 00:33:31,040
adopted other standards that use
more than one byte per char.

710
00:33:31,040 --> 00:33:34,210
>> But for now in C, the default
is just one byte or eight bits.

711
00:33:34,210 --> 00:33:38,195
An integer, meanwhile, is four
bytes, otherwise known as 32 bits.

712
00:33:38,195 --> 00:33:41,320
Which means what's the largest possible
number we can represent with an int

713
00:33:41,320 --> 00:33:41,820
apparently?

714
00:33:41,820 --> 00:33:44,426

715
00:33:44,426 --> 00:33:45,050
Four billion.

716
00:33:45,050 --> 00:33:46,760
So it's four billion give or take.

717
00:33:46,760 --> 00:33:49,840
2 to the 32nd power, if we
assume no negative numbers

718
00:33:49,840 --> 00:33:52,530
and just use all positive
numbers, it's four billion

719
00:33:52,530 --> 00:33:53,730
give or take possibilities.
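The arithmetic behind 256 and "four billion give or take" can be checked with a short sketch. Both function names are hypothetical.

```c
// Number of distinct patterns of zeros and ones that fit in the given number
// of bits, that is, 2 to the power of bits. Valid for bits from 0 through 63.
unsigned long long patterns(unsigned int bits)
{
    return 1ULL << bits;
}

// Largest value when every pattern is used for a non-negative number,
// counting up from zero.
unsigned long long largest_unsigned(unsigned int bits)
{
    return patterns(bits) - 1;
}
```

With 8 bits there are 256 patterns and the largest value is 255. With 32 bits there are 4,294,967,296 patterns and the largest value is 4,294,967,295.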

720
00:33:53,730 --> 00:33:57,890
A float, meanwhile, is a different type
of data type in C. It's still a number,

721
00:33:57,890 --> 00:33:58,990
but it's a real number.

722
00:33:58,990 --> 00:34:00,660
Something with a decimal point.

723
00:34:00,660 --> 00:34:03,000
And it turns out that
C also uses four bytes

724
00:34:03,000 --> 00:34:05,340
to represent floating point values.

725
00:34:05,340 --> 00:34:09,420
>> Unfortunately how many floating
point values are there in the world?

726
00:34:09,420 --> 00:34:11,582
How many real numbers are there?

727
00:34:11,582 --> 00:34:13,540
There's an infinite
number, and for that matter

728
00:34:13,540 --> 00:34:15,164
there's an infinite number of integers.

729
00:34:15,164 --> 00:34:18,070
So we're already kind of
digging ourselves a hole here.

730
00:34:18,070 --> 00:34:21,780
Whereby apparently computers-- at
least programs written in C on them--

731
00:34:21,780 --> 00:34:24,110
can only count as high as
four billion give or take,

732
00:34:24,110 --> 00:34:26,260
and floating point values
can only apparently

733
00:34:26,260 --> 00:34:28,330
have some finite amount of precision.

734
00:34:28,330 --> 00:34:30,810
Only so many digits after
their decimal point.

735
00:34:30,810 --> 00:34:32,822
>> Because, of course, if
you only have 32 bits,

736
00:34:32,822 --> 00:34:36,030
I don't know how we're going to go about
representing real numbers-- probably

737
00:34:36,030 --> 00:34:37,409
with different types of patterns.

738
00:34:37,409 --> 00:34:40,030
But there's surely a finite
number of such patterns,

739
00:34:40,030 --> 00:34:41,830
so here, too, this is problematic.

740
00:34:41,830 --> 00:34:43,710
>> Now we can avoid the problem slightly.

741
00:34:43,710 --> 00:34:45,710
If you don't use a float,
you could use a double

742
00:34:45,710 --> 00:34:50,230
in C, which gives you eight bytes, which
is way more possible patterns of zeros

743
00:34:50,230 --> 00:34:50,730
and ones.

744
00:34:50,730 --> 00:34:55,199
But it's still finite, which is going
to be problematic if you write software

745
00:34:55,199 --> 00:34:57,670
for graphics or for fancy
mathematical formulas.

746
00:34:57,670 --> 00:35:00,410
So you might actually want
to count up bigger than that.

747
00:35:00,410 --> 00:35:05,640
A long long-- stupidly named--
is also eight bytes, or 64 bits,

748
00:35:05,640 --> 00:35:10,260
and this is twice as long as an int,
and it's for a long integer value.

749
00:35:10,260 --> 00:35:15,655
>> Fun fact-- if an int is four bytes,
how long is a long in C typically?

750
00:35:15,655 --> 00:35:18,290

751
00:35:18,290 --> 00:35:21,560
Also four bytes, but a
long long is eight bytes,

752
00:35:21,560 --> 00:35:23,050
and this is for historical reasons.
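The sizes quoted here are typical rather than guaranteed, so one way to check a particular machine is to ask the compiler with sizeof. This is a sketch with a hypothetical function name. Only char is fixed at one byte by the language, and on many 64-bit Linux and macOS systems a long comes out as eight bytes rather than four.

```c
#include <stdio.h>

// Prints how many bytes each type occupies on the machine running the code.
// %zu is the printf placeholder for the size_t values that sizeof produces.
void print_sizes(void)
{
    printf("char:      %zu\n", sizeof(char));
    printf("int:       %zu\n", sizeof(int));
    printf("float:     %zu\n", sizeof(float));
    printf("double:    %zu\n", sizeof(double));
    printf("long:      %zu\n", sizeof(long));
    printf("long long: %zu\n", sizeof(long long));
}
```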

753
00:35:23,050 --> 00:35:26,450
>> But the takeaway now
is just that data has

754
00:35:26,450 --> 00:35:29,625
to be represented in a computer-- that's
a physical device with electricity,

755
00:35:29,625 --> 00:35:32,190
it's generally driving
those zeros and ones--

756
00:35:32,190 --> 00:35:34,320
with finite amounts of precision.

757
00:35:34,320 --> 00:35:35,620
So what's the problem then?

758
00:35:35,620 --> 00:35:37,480
>> Well there's a problem
of integer overflow.

759
00:35:37,480 --> 00:35:39,780
Not just in C, but in
computers in general.

760
00:35:39,780 --> 00:35:42,590
For instance, if this
is a byte's worth of bits--

761
00:35:42,590 --> 00:35:45,120
so if this is eight bits-- all
of which are the number one.

762
00:35:45,120 --> 00:35:47,300
What number is this
representing if we assume

763
00:35:47,300 --> 00:35:50,730
it's all positive values in binary?

764
00:35:50,730 --> 00:35:54,410
>> 255, and it's not 256, because
zero is the lowest number.

765
00:35:54,410 --> 00:35:56,760
So 255 is the highest
one, but the problem

766
00:35:56,760 --> 00:36:00,330
is suppose that I wanted to
increment this variable that

767
00:36:00,330 --> 00:36:04,030
is using eight bits total
if I want to increment it.

768
00:36:04,030 --> 00:36:07,160
>> Well as soon as I add a
one to all of these ones,

769
00:36:07,160 --> 00:36:10,500
you can perhaps imagine visually-- just
like carrying the one using decimals--

770
00:36:10,500 --> 00:36:12,300
something's going to flow to the left.

771
00:36:12,300 --> 00:36:15,590
And indeed, if I add the number
one to this, what happens in binary

772
00:36:15,590 --> 00:36:17,670
is that it overflows back to zero.

773
00:36:17,670 --> 00:36:21,730
>> So if you only use-- not an int,
but a single byte to count integers

774
00:36:21,730 --> 00:36:27,170
in a program, by default-- as soon as
you get to 250, 251, 252, 253, 254,

775
00:36:27,170 --> 00:36:32,710
255-- 0 comes after 255,
which is probably not what

776
00:36:32,710 --> 00:36:34,790
a user is going to expect.

777
00:36:34,790 --> 00:36:39,620
>> Now meanwhile in floating point world,
you also have a similar problem.

778
00:36:39,620 --> 00:36:42,670
Not so much with the largest number--
although that's still an issue.

779
00:36:42,670 --> 00:36:45,360
But with the amount of precision
that you can represent.

780
00:36:45,360 --> 00:36:49,490
So let's take a look at this example
here also from today's source code--

781
00:36:49,490 --> 00:36:52,070
float-0.c.

782
00:36:52,070 --> 00:36:54,280
>> And notice it's a super
simple program that

783
00:36:54,280 --> 00:36:56,580
should apparently print out what value?

784
00:36:56,580 --> 00:37:00,777

785
00:37:00,777 --> 00:37:04,110
What do you wager this is going to print
even though there's a bit of new syntax

786
00:37:04,110 --> 00:37:05,540
here?

787
00:37:05,540 --> 00:37:06,700
So hopefully 0.1.

788
00:37:06,700 --> 00:37:10,000
So the equivalent of one-tenth
because I'm doing 1 divided by 10.

789
00:37:10,000 --> 00:37:12,430
I'm storing the answer
in a variable called f.

790
00:37:12,430 --> 00:37:15,850
That variable is of type float, which
is a keyword I just proposed existed.

791
00:37:15,850 --> 00:37:18,910
>> We've not seen this before, but
this is kind of a neat way in printf

792
00:37:18,910 --> 00:37:22,110
to specify how many digits you
want to see after a decimal point.

793
00:37:22,110 --> 00:37:25,020
So this notation just means
that here's a placeholder.

794
00:37:25,020 --> 00:37:27,900
It's for a floating point
value, and oh, by the way,

795
00:37:27,900 --> 00:37:31,389
show it with the decimal point with
one number after the decimal point.

796
00:37:31,389 --> 00:37:33,180
So that's the number
of significant digits,

797
00:37:33,180 --> 00:37:34,650
so to speak, that you might want.

798
00:37:34,650 --> 00:37:40,450
>> So let me go ahead and do
make float-0, ./float-0,

799
00:37:40,450 --> 00:37:46,660
and apparently 1 divided by 10 is 0.0.

800
00:37:46,660 --> 00:37:47,760
Now why is this?

801
00:37:47,760 --> 00:37:51,380
>> Well again, the computer is taking
me literally, and I have written 1

802
00:37:51,380 --> 00:37:56,680
and I've written 10, and take a guess what
is the assumed data type for those two

803
00:37:56,680 --> 00:37:58,440
values?

804
00:37:58,440 --> 00:38:00,970
An int, it's technically
something a little different.

805
00:38:00,970 --> 00:38:04,150
It's typically a long, but it's
ultimately an integral value.

806
00:38:04,150 --> 00:38:06,030
Not a floating point value.

807
00:38:06,030 --> 00:38:09,456
>> Which is to say that if this
is an int and this is an int,

808
00:38:09,456 --> 00:38:11,830
the problem is that the computer
doesn't have the ability

809
00:38:11,830 --> 00:38:13,680
to even store that decimal point.

810
00:38:13,680 --> 00:38:16,430
So when you do 1 divided
by 10 using integers

811
00:38:16,430 --> 00:38:20,950
for both the numerator and the
denominator, the answer should be 0.1.

812
00:38:20,950 --> 00:38:24,930
But the computer-- because
those are integers--

813
00:38:24,930 --> 00:38:27,430
doesn't know what to do with the 0.1.

814
00:38:27,430 --> 00:38:30,010
>> So what is it clearly doing?

815
00:38:30,010 --> 00:38:33,120
It's just throwing it away,
and what I'm seeing ultimately

816
00:38:33,120 --> 00:38:38,830
is 0.0 only because I insisted that
printf show me one decimal point.

817
00:38:38,830 --> 00:38:41,740
But the problem is that if you
divide an integer by an integer,

818
00:38:41,740 --> 00:38:44,347
you will get-- by definition
of C-- an integer.

819
00:38:44,347 --> 00:38:46,680
And it's not going to do
something nice and conveniently

820
00:38:46,680 --> 00:38:49,040
like round it up to the
nearest one up or down.

821
00:38:49,040 --> 00:38:51,860
It's going to truncate
everything after the decimal.

822
00:38:51,860 --> 00:38:54,030
>> So just intuitively,
what's probably a fix?

823
00:38:54,030 --> 00:38:55,351
What's the simplest fix here?

824
00:38:55,351 --> 00:38:55,850
Yeah?

825
00:38:55,850 --> 00:39:00,570

826
00:39:00,570 --> 00:39:01,100
Exactly.

827
00:39:01,100 --> 00:39:04,200
Why don't we just treat these as
floating point values, effectively

828
00:39:04,200 --> 00:39:05,860
turning them into floats or doubles.

829
00:39:05,860 --> 00:39:10,500
And now if I do make floats-0,
or if I compile floats-1,

830
00:39:10,500 --> 00:39:12,570
which is identical to
what was just proposed.

831
00:39:12,570 --> 00:39:16,400
And now I do floats-0, now I get my 0.1.

832
00:39:16,400 --> 00:39:17,234
>> Now this is amazing.

833
00:39:17,234 --> 00:39:19,441
But now I'm going to do
something a little different.

834
00:39:19,441 --> 00:39:22,280
I'm curious to see what's really
going on underneath the hood,

835
00:39:22,280 --> 00:39:26,050
and I'm going to print this
out to 28 decimal places.

836
00:39:26,050 --> 00:39:29,730
I want to really see
0.1000-- an infinite--

837
00:39:29,730 --> 00:39:32,710
[INAUDIBLE] 27 zeros after that 0.1.

838
00:39:32,710 --> 00:39:34,740
>> Well let's see if that's
what I indeed get.

839
00:39:34,740 --> 00:39:39,430
Make floats-0 same file.

840
00:39:39,430 --> 00:39:41,150
./floats-0.

841
00:39:41,150 --> 00:39:44,380
Let's zoom in on the dramatic answer.

842
00:39:44,380 --> 00:39:49,980
All this time, you've been thinking
1 divided by 10 is 10%, or 0.1.

843
00:39:49,980 --> 00:39:50,810
It's not.

844
00:39:50,810 --> 00:39:53,210
At least so far as the
computer's concerned.

845
00:39:53,210 --> 00:39:57,060
>> Now why-- OK, that's a complete
lie. 1 divided by 10 is 0.1.

846
00:39:57,060 --> 00:39:59,710
But why-- that is not
the takeaway today.

847
00:39:59,710 --> 00:40:04,010
So why does the computer think,
unlike all of us in the room,

848
00:40:04,010 --> 00:40:06,870
that 1 divided by 10 is
actually that crazy value?

849
00:40:06,870 --> 00:40:10,620
What's the computer doing apparently?

850
00:40:10,620 --> 00:40:12,490
What's that?

851
00:40:12,490 --> 00:40:13,785
>> It's not overflow, per se.

852
00:40:13,785 --> 00:40:15,910
Overflow is typically when
you wrap around a value.

853
00:40:15,910 --> 00:40:18,970
It's this issue of imprecision
in a floating point value

854
00:40:18,970 --> 00:40:22,220
where you only have 32
or maybe even 64 bits.

855
00:40:22,220 --> 00:40:25,230
But if there's an infinite
number of real numbers--

856
00:40:25,230 --> 00:40:27,940
numbers with decimal points
and numbers thereafter-- surely

857
00:40:27,940 --> 00:40:29,380
you can't represent all of them.

858
00:40:29,380 --> 00:40:32,870
So the computer has given
us the closest match

859
00:40:32,870 --> 00:40:37,090
to the value it can represent using that
many bits to the value I actually want,

860
00:40:37,090 --> 00:40:38,690
which is 0.1.

861
00:40:38,690 --> 00:40:40,685
>> Unfortunately, if you
start doing math, or you

862
00:40:40,685 --> 00:40:44,360
start involving these kinds of floating
point values in important programs--

863
00:40:44,360 --> 00:40:46,770
financial software,
military software-- anything

864
00:40:46,770 --> 00:40:49,090
where precision is
probably pretty important.

865
00:40:49,090 --> 00:40:51,520
And you start adding
numbers like this, and start

866
00:40:51,520 --> 00:40:54,050
running that software
with really large inputs

867
00:40:54,050 --> 00:40:56,890
or for lots of hours or lots
of days or lots of years,

868
00:40:56,890 --> 00:41:01,060
these tiny little mistakes
surely can add up over time.

869
00:41:01,060 --> 00:41:04,252
>> Now as an aside, if you've ever
seen Superman 3 or Office Space

870
00:41:04,252 --> 00:41:05,960
and you might recall
how those guys stole

871
00:41:05,960 --> 00:41:08,668
a lot of money from their company
by using floating point values

872
00:41:08,668 --> 00:41:11,290
and adding up the little
remainders, hopefully that movie

873
00:41:11,290 --> 00:41:12,390
now makes more sense.

874
00:41:12,390 --> 00:41:14,930
This is what they were
alluding to in that movie.

875
00:41:14,930 --> 00:41:16,710
The fact that most
companies wouldn't look

876
00:41:16,710 --> 00:41:18,600
after a certain number
of decimal places,

877
00:41:18,600 --> 00:41:20,009
but those are fractions of cents.

878
00:41:20,009 --> 00:41:22,550
So you start adding them up,
you start to make a lot of money

879
00:41:22,550 --> 00:41:23,424
in your bank account.

880
00:41:23,424 --> 00:41:25,160
So that's Office Space explained.

881
00:41:25,160 --> 00:41:28,220
>> Now unfortunately beyond
Office Space, there

882
00:41:28,220 --> 00:41:31,794
are some legitimately troubling
and significant impacts

883
00:41:31,794 --> 00:41:33,710
of these kinds of
underlying design decisions,

884
00:41:33,710 --> 00:41:35,990
and indeed one of the reasons
we use C in the course

885
00:41:35,990 --> 00:41:39,640
is so that you really have this ground
up understanding of how computers work,

886
00:41:39,640 --> 00:41:42,440
how software works, and don't
take anything for granted.

887
00:41:42,440 --> 00:41:45,820
>> And indeed unfortunately, even with
that fundamental understanding,

888
00:41:45,820 --> 00:41:47,370
we humans make mistakes.

889
00:41:47,370 --> 00:41:51,310
And what I thought I'd share is
this eight minute video here taken

890
00:41:51,310 --> 00:41:56,980
from a Modern Marvels episode, which is
an educational show on how things work

891
00:41:56,980 --> 00:42:00,370
that paints two pictures
of when an improper use

892
00:42:00,370 --> 00:42:02,540
and understanding of
floating point values

893
00:42:02,540 --> 00:42:05,610
led to some significant
unfortunate results.

894
00:42:05,610 --> 00:42:06,363
Let's take a look.

895
00:42:06,363 --> 00:42:07,029
[VIDEO PLAYBACK]

896
00:42:07,029 --> 00:42:11,290
-We now return to "Engineering
Disasters" on Modern Marvels.

897
00:42:11,290 --> 00:42:12,940
Computers.

898
00:42:12,940 --> 00:42:15,580
We've all come to accept the
often frustrating problems that

899
00:42:15,580 --> 00:42:20,960
go with them-- bugs, viruses, and
software glitches-- are small prices

900
00:42:20,960 --> 00:42:23,100
to pay for the convenience.

901
00:42:23,100 --> 00:42:27,770
But in high tech and high speed
military and space program applications,

902
00:42:27,770 --> 00:42:32,780
the smallest problem can
be magnified into disaster.

903
00:42:32,780 --> 00:42:38,880
>> On June 4, 1996, scientists prepared
to launch an unmanned Ariane 5 rocket.

904
00:42:38,880 --> 00:42:41,190
It was carrying scientific
satellites designed

905
00:42:41,190 --> 00:42:44,570
to establish precisely how the
Earth's magnetic field interacts

906
00:42:44,570 --> 00:42:47,380
with solar winds.

907
00:42:47,380 --> 00:42:50,580
The rocket was built for
the European Space Agency,

908
00:42:50,580 --> 00:42:54,400
and lifted off from its facility
on the coast of French Guiana.

909
00:42:54,400 --> 00:42:57,520
>> -At about 37 seconds into
the flight, they first

910
00:42:57,520 --> 00:42:59,070
noticed something was going wrong.

911
00:42:59,070 --> 00:43:02,240
That the nozzles were swiveling
in a way they really shouldn't.

912
00:43:02,240 --> 00:43:06,550
Around 40 seconds into the flight,
clearly the vehicle was in trouble,

913
00:43:06,550 --> 00:43:08,820
and that's when they made
the decision to destroy it.

914
00:43:08,820 --> 00:43:12,370
The range safety officer, with
tremendous guts, pressed the button

915
00:43:12,370 --> 00:43:18,030
and blew up the rocket before it could
become a hazard to public safety.

916
00:43:18,030 --> 00:43:21,010
>> -This was the maiden
voyage of the Ariane 5,

917
00:43:21,010 --> 00:43:23,920
and its destruction took
place because of a flaw

918
00:43:23,920 --> 00:43:25,932
embedded in the rocket's software.

919
00:43:25,932 --> 00:43:27,640
-The problem on the
Ariane was that there

920
00:43:27,640 --> 00:43:30,500
was a number that required
64 bits to express,

921
00:43:30,500 --> 00:43:33,560
and they wanted to convert
it to a 16-bit number.

922
00:43:33,560 --> 00:43:36,820
They assumed that the number
was never going to be very big.

923
00:43:36,820 --> 00:43:40,940
That most of those digits in
the 64-bit number were zeros.

924
00:43:40,940 --> 00:43:42,450
They were wrong.

925
00:43:42,450 --> 00:43:45,000
>> -The inability of one
software program to accept

926
00:43:45,000 --> 00:43:49,460
the kind of number generated by
another was at the root of the failure.

927
00:43:49,460 --> 00:43:54,260
Software development had become a
very costly part of new technology.

928
00:43:54,260 --> 00:43:57,060
The Ariane 4 rocket had
been very successful.

929
00:43:57,060 --> 00:44:01,600
So much of the software created for
it was also used in the Ariane 5.

930
00:44:01,600 --> 00:44:04,790
>> -The basic problem
was that the Ariane 5

931
00:44:04,790 --> 00:44:11,200
was faster-- accelerated faster, and
the software hadn't accounted for that.

932
00:44:11,200 --> 00:44:14,910
>> -The destruction of the rocket
was a huge financial disaster.

933
00:44:14,910 --> 00:44:18,630
All due to a minute software error.

934
00:44:18,630 --> 00:44:21,160
But this wasn't the first
time data conversion problems

935
00:44:21,160 --> 00:44:24,770
had plagued modern rocket technology.

936
00:44:24,770 --> 00:44:28,020
>> -In 1991 with the start
of the first Gulf War,

937
00:44:28,020 --> 00:44:30,540
the Patriot missile
experienced a similar kind

938
00:44:30,540 --> 00:44:32,465
of a number conversion problem.

939
00:44:32,465 --> 00:44:36,760
And as a result 28 people-- 28
American soldiers-- were killed,

940
00:44:36,760 --> 00:44:39,010
and about a hundred others wounded.

941
00:44:39,010 --> 00:44:42,830
When the Patriot, which was supposed
to protect against incoming Scuds,

942
00:44:42,830 --> 00:44:45,780
failed to fire a missile.

943
00:44:45,780 --> 00:44:51,610
>> -When Iraq invaded Kuwait, and America
launched Desert Storm in early 1991,

944
00:44:51,610 --> 00:44:55,720
Patriot missile batteries were deployed
to protect Saudi Arabia and Israel

945
00:44:55,720 --> 00:44:59,180
from Iraqi Scud missile attacks.

946
00:44:59,180 --> 00:45:03,080
The Patriot is a US medium-range
surface-to-air system

947
00:45:03,080 --> 00:45:06,530
manufactured by the Raytheon company.

948
00:45:06,530 --> 00:45:09,500
>> -The size of the Patriot
interceptor itself--

949
00:45:09,500 --> 00:45:14,705
it's about roughly 20 feet long,
and it weighs about 2,000 pounds.

950
00:45:14,705 --> 00:45:19,090
And it carries a warhead of about,
I think it's roughly 150 pounds.

951
00:45:19,090 --> 00:45:23,880
And the warhead itself is
a high explosive, which

952
00:45:23,880 --> 00:45:26,700
has fragments around it.

953
00:45:26,700 --> 00:45:31,630
So the casing of the warhead is
designed to act like buckshot.

954
00:45:31,630 --> 00:45:34,040
>> -The missiles are carried
four per container,

955
00:45:34,040 --> 00:45:37,170
and are transported by a semi trailer.

956
00:45:37,170 --> 00:45:44,880
>> -The Patriot anti-missile system
goes back at least 20 years now.

957
00:45:44,880 --> 00:45:48,380
It was originally designed
as an air defense missile

958
00:45:48,380 --> 00:45:50,810
to shoot down enemy airplanes.

959
00:45:50,810 --> 00:45:54,410
In the first Gulf War
when that war came on,

960
00:45:54,410 --> 00:45:59,650
the Army wanted to use it to
shoot down Scuds, not airplanes.

961
00:45:59,650 --> 00:46:03,580
The Iraqi Air Force was
not so much of a problem,

962
00:46:03,580 --> 00:46:06,590
but the Army was worried about Scuds.

963
00:46:06,590 --> 00:46:10,120
And so they tried to
upgrade the Patriot.

964
00:46:10,120 --> 00:46:12,740
>> -Intercepting an enemy
missile traveling at Mach 5

965
00:46:12,740 --> 00:46:15,670
was going to be challenging enough.

966
00:46:15,670 --> 00:46:18,440
But when the Patriot
was rushed into service,

967
00:46:18,440 --> 00:46:22,580
the Army was not aware of
an Iraqi modification that

968
00:46:22,580 --> 00:46:25,880
made their Scuds nearly
impossible to hit.

969
00:46:25,880 --> 00:46:30,690
>> -What happened is the Scuds that
were coming in were unstable.

970
00:46:30,690 --> 00:46:32,000
They were wobbly.

971
00:46:32,000 --> 00:46:37,210
The reason for this was the Iraqis--
in order to get 600 kilometers out

972
00:46:37,210 --> 00:46:41,680
of a 300-kilometer range missile--
took weight out of the front warhead,

973
00:46:41,680 --> 00:46:43,340
and made the warhead lighter.

974
00:46:43,340 --> 00:46:48,490
So now the Patriot's trying to come
at the Scud, and most of the time--

975
00:46:48,490 --> 00:46:52,880
the overwhelming majority of the
time-- it would just fly by the Scud.

976
00:46:52,880 --> 00:46:57,120
>> -Once the Patriot system operators
realized the Patriot missed its target,

977
00:46:57,120 --> 00:47:01,630
they detonated the Patriot's warhead
to avoid possible casualties if it

978
00:47:01,630 --> 00:47:04,440
was allowed to fall to the ground.

979
00:47:04,440 --> 00:47:08,700
>> -That was what most people saw
as big fireballs in the sky,

980
00:47:08,700 --> 00:47:14,180
and misunderstood as
intercepts of Scud warheads.

981
00:47:14,180 --> 00:47:18,020
>> -Although in the night skies, Patriots
appeared to be successfully destroying

982
00:47:18,020 --> 00:47:23,280
Scuds, at Dhahran there could be
no mistake about its performance.

983
00:47:23,280 --> 00:47:27,930
There the Patriot's radar system
lost track of an incoming Scud

984
00:47:27,930 --> 00:47:30,260
and never launched due
to a software flaw.

985
00:47:30,260 --> 00:47:34,060

986
00:47:34,060 --> 00:47:38,880
>> It was the Israelis who first discovered
that the longer the system was on,

987
00:47:38,880 --> 00:47:41,130
the greater the time discrepancy became.

988
00:47:41,130 --> 00:47:44,770
Due to a clock embedded
in the system's computer.

989
00:47:44,770 --> 00:47:48,190
>> -About two weeks before
the tragedy in Dhahran,

990
00:47:48,190 --> 00:47:50,720
the Israelis reported to
the Defense Department

991
00:47:50,720 --> 00:47:52,410
that the system was losing time.

992
00:47:52,410 --> 00:47:54,410
After about eight hours
of running, they noticed

993
00:47:54,410 --> 00:47:57,690
that the system's becoming
noticeably less accurate.

994
00:47:57,690 --> 00:48:01,850
The Defense Department responded by
telling all of the Patriot batteries

995
00:48:01,850 --> 00:48:04,800
to not leave the systems
on for a long time.

996
00:48:04,800 --> 00:48:06,980
They never said what a long time was.

997
00:48:06,980 --> 00:48:09,140
8 hours, 10 hours, a thousand hours.

998
00:48:09,140 --> 00:48:11,300
Nobody knew.

999
00:48:11,300 --> 00:48:13,320
>> -The Patriot battery
stationed at the barracks

1000
00:48:13,320 --> 00:48:18,310
at Dhahran and its flawed internal
clock had been on for over 100 hours

1001
00:48:18,310 --> 00:48:21,520
on the night of February 25.

1002
00:48:21,520 --> 00:48:25,792
>> -It tracked time to an accuracy
of about a tenth of a second.

1003
00:48:25,792 --> 00:48:27,950
Now a tenth of a second
is an interesting number

1004
00:48:27,950 --> 00:48:31,850
because it can't be expressed
in binary exactly, which

1005
00:48:31,850 --> 00:48:36,500
means it can't be expressed exactly
in any modern digital computer.

1006
00:48:36,500 --> 00:48:41,070
It's hard to believe, but
use this as an example.

1007
00:48:41,070 --> 00:48:43,420
>> Let's take the number one third.

1008
00:48:43,420 --> 00:48:47,330
One third cannot be
expressed in decimal exactly.

1009
00:48:47,330 --> 00:48:52,060
One third is 0.333
going on for infinity.

1010
00:48:52,060 --> 00:48:56,420
There's no way to do that with
absolute accuracy in a decimal.

1011
00:48:56,420 --> 00:48:59,530
That's exactly the kind of problem
that happened in the Patriot.

1012
00:48:59,530 --> 00:49:04,040
The longer the system ran, the
worse the time error became.

1013
00:49:04,040 --> 00:49:08,840
>> -After 100 hours of operation, the
error in time was only about one third

1014
00:49:08,840 --> 00:49:10,440
of a second.

1015
00:49:10,440 --> 00:49:14,150
But in terms of targeting a
missile traveling at Mach 5,

1016
00:49:14,150 --> 00:49:18,560
it resulted in a tracking
error of over 600 meters.

1017
00:49:18,560 --> 00:49:21,870
It would be a fatal error
for the soldiers at Dhahran.

1018
00:49:21,870 --> 00:49:28,455
>> -What happened is a Scud launch was
detected by early warning satellites,

1019
00:49:28,455 --> 00:49:32,710
and they knew a Scud was coming
in their general direction.

1020
00:49:32,710 --> 00:49:35,150
They didn't know where it was coming.

1021
00:49:35,150 --> 00:49:38,210
It was now up to the radar
component of the Patriot system

1022
00:49:38,210 --> 00:49:43,150
defending Dhahran to locate and keep
track of the incoming enemy missile.

1023
00:49:43,150 --> 00:49:44,561
>> -The radar was very smart.

1024
00:49:44,561 --> 00:49:46,560
It would actually track
the position of the Scud

1025
00:49:46,560 --> 00:49:48,930
and then predict where
it probably would be

1026
00:49:48,930 --> 00:49:51,380
the next time the
radar sent a pulse out.

1027
00:49:51,380 --> 00:49:53,040
That was called the range gate.

1028
00:49:53,040 --> 00:49:57,620
>> -Then once the Patriot
decides enough time has

1029
00:49:57,620 --> 00:50:02,400
passed to go back and check the next
location for this detected object

1030
00:50:02,400 --> 00:50:03,550
it goes back.

1031
00:50:03,550 --> 00:50:07,820
So when it went back to the wrong
place, it then sees no object.

1032
00:50:07,820 --> 00:50:10,360
And it decides that there was no object.

1033
00:50:10,360 --> 00:50:13,630
That there was a false detection
and it drops the track.

1034
00:50:13,630 --> 00:50:16,970
>> -The incoming Scud disappeared
from the radar screen,

1035
00:50:16,970 --> 00:50:20,200
and seconds later, it
slammed into the barracks.

1036
00:50:20,200 --> 00:50:22,570
The Scud killed 28.

1037
00:50:22,570 --> 00:50:26,110
It was the last one fired
during the first Gulf War.

1038
00:50:26,110 --> 00:50:31,920
Tragically, the updated software
arrived at dawn on the following day.

1039
00:50:31,920 --> 00:50:34,870
The software flaw had
been fixed, closing

1040
00:50:34,870 --> 00:50:39,150
one chapter in the troubled
history of the Patriot missile.

1041
00:50:39,150 --> 00:50:40,030
>> [END VIDEO PLAYBACK]

1042
00:50:40,030 --> 00:50:41,488
>> DAVID J. MALAN: That's it for CS50.

1043
00:50:41,488 --> 00:50:42,820
We will see you on Wednesday.

1044
00:50:42,820 --> 00:50:46,420

1045
00:50:46,420 --> 00:50:50,370
>> [MUSIC PLAYING]

1046
00:50:50,370 --> 00:54:23,446