1
00:00:00,000 --> 00:00:10,920
>> [MUSIC PLAYING]

2
00:00:10,920 --> 00:00:14,680
>> DAVID J MALAN: All right,
welcome back to CS50.

3
00:00:14,680 --> 00:00:16,500
This is the start of week two.

4
00:00:16,500 --> 00:00:18,940
A word from one of our
friends on campus--

5
00:00:18,940 --> 00:00:22,620
if you are interested, possibly, either
now or in some future term

6
00:00:22,620 --> 00:00:25,670
even, once more comfortable, teaching
middle school students a little

7
00:00:25,670 --> 00:00:27,680
something about computer science,
do head to that URL.

8
00:00:27,680 --> 00:00:32,360
They are in particular need right now of
teachers, particularly if you have

9
00:00:32,360 --> 00:00:34,700
had some exposure to computer science.

10
00:00:34,700 --> 00:00:38,060
>> So recall that last time, we introduced
a few data types in C, and

11
00:00:38,060 --> 00:00:40,590
you may have started to get your
hands dirty with these thus far

12
00:00:40,590 --> 00:00:41,940
in problem set one.

13
00:00:41,940 --> 00:00:43,230
And we had a char.

14
00:00:43,230 --> 00:00:49,100
So in somewhat technical terms, what
is a char as you know it today?

15
00:00:49,100 --> 00:00:51,050
>> So it's a character, but let's
be more precise now.

16
00:00:51,050 --> 00:00:53,735
What do we mean by character
or individual char?

17
00:00:53,735 --> 00:00:56,700

18
00:00:56,700 --> 00:00:59,500
A non-numerical character--

19
00:00:59,500 --> 00:01:00,670
so not necessarily.

20
00:01:00,670 --> 00:01:04,580
It turns out that even numbers, even
punctuation and letters are

21
00:01:04,580 --> 00:01:06,980
represented with this data
type known as a char.

22
00:01:06,980 --> 00:01:09,440
So it's not necessarily alphabetical.

23
00:01:09,440 --> 00:01:11,100
Yeah?

24
00:01:11,100 --> 00:01:12,275
>> So it's an ASCII character.

25
00:01:12,275 --> 00:01:15,510
So if you think back to week zero, when
we had our byte of volunteers

26
00:01:15,510 --> 00:01:19,150
come up and either hold their hands up
or not at all, they represented bits.

27
00:01:19,150 --> 00:01:22,450
But collectively as a group of eight,
they represented a byte.

28
00:01:22,450 --> 00:01:26,030
And we introduced the notion of ASCII
at that lecture, which simply is a

29
00:01:26,030 --> 00:01:28,170
mapping between numbers and letters.

30
00:01:28,170 --> 00:01:32,010
And ASCII uses, as those humans
implied, eight bits

31
00:01:32,010 --> 00:01:33,660
to represent a character.

32
00:01:33,660 --> 00:01:36,890
>> So accordingly, if eight bits can
each take on one of two values--

33
00:01:36,890 --> 00:01:38,010
zero or one--

34
00:01:38,010 --> 00:01:40,280
that means there were two possibilities
for this person--

35
00:01:40,280 --> 00:01:41,230
zero or one--

36
00:01:41,230 --> 00:01:44,070
two for this person, two for this
person, two for this one.

37
00:01:44,070 --> 00:01:47,450
So a total of two times two times
two times two times two--

38
00:01:47,450 --> 00:01:49,700
so two to the eighth in total.

39
00:01:49,700 --> 00:01:54,320
So there's a total of 256 possible
characters that you can

40
00:01:54,320 --> 00:01:55,750
represent with eight bits.

41
00:01:55,750 --> 00:01:59,210
>> Now, those of you who speak Asian
languages might know that there's more

42
00:01:59,210 --> 00:02:02,620
characters in the world than just
As and Bs and Cs and Ds.

43
00:02:02,620 --> 00:02:06,130
And indeed, ASCII does not suffice for
a lot of languages of the world.

44
00:02:06,130 --> 00:02:07,760
But more on that another time.

45
00:02:07,760 --> 00:02:11,240
For now, know that in C if you want
to represent a letter, a piece of

46
00:02:11,240 --> 00:02:15,780
punctuation, or just something character
in nature, we use a char.

47
00:02:15,780 --> 00:02:18,240
And it's one byte or eight bits.

48
00:02:18,240 --> 00:02:19,690
>> How about an int?

49
00:02:19,690 --> 00:02:20,780
Well, an int is an integer.

50
00:02:20,780 --> 00:02:23,175
How many bits, if you recall,
was an integer typically?

51
00:02:23,175 --> 00:02:25,930

52
00:02:25,930 --> 00:02:27,512
Anyone recall?

53
00:02:27,512 --> 00:02:29,600
So it's typically 32.

54
00:02:29,600 --> 00:02:32,120
It actually depends on the computer
that you're using.

55
00:02:32,120 --> 00:02:35,770
But in the appliance, and in a lot of
computers, it's 32 bits or four

56
00:02:35,770 --> 00:02:37,140
bytes-- eight times four.

57
00:02:37,140 --> 00:02:39,790
And ints are just used for storing
numbers, either negative,

58
00:02:39,790 --> 00:02:41,610
positive, or zero.

59
00:02:41,610 --> 00:02:45,250
>> And if you've got 32 bits and you only
care about positive numbers, can

60
00:02:45,250 --> 00:02:48,960
anyone ballpark how many possible
integers a computer can represent from

61
00:02:48,960 --> 00:02:51,820
zero on up?

62
00:02:51,820 --> 00:02:56,130
So it would be two to the 32, which
is roughly four billion.

63
00:02:56,130 --> 00:02:59,720
So these powers of two are going to be
recurring themes in computer science.

64
00:02:59,720 --> 00:03:03,930
As we'll see, they're quite convenient
to work with even if it's not quite

65
00:03:03,930 --> 00:03:05,790
easy to do the math in one's head.

66
00:03:05,790 --> 00:03:07,000
>> So we'll say roughly four billion.

67
00:03:07,000 --> 00:03:08,620
Now, a long long--

68
00:03:08,620 --> 00:03:09,770
you can kind of guess.

69
00:03:09,770 --> 00:03:10,480
It's longer than an int.

70
00:03:10,480 --> 00:03:12,440
How many bits?

71
00:03:12,440 --> 00:03:14,250
So 64 bits or eight bytes.

72
00:03:14,250 --> 00:03:17,480
This just means you can represent even
bigger numbers, bigger positive or

73
00:03:17,480 --> 00:03:19,160
bigger negative numbers.

74
00:03:19,160 --> 00:03:20,060
>> And how about float?

75
00:03:20,060 --> 00:03:22,260
That's a floating point
value of 32 bits.

76
00:03:22,260 --> 00:03:25,180
This is just a real number, something
with a decimal point.

77
00:03:25,180 --> 00:03:30,100
But if you instead need more places
after the decimal point or you want to

78
00:03:30,100 --> 00:03:33,720
represent a bigger number with some
fraction after it, you can use a

79
00:03:33,720 --> 00:03:36,260
double, which is 64 bits.

80
00:03:36,260 --> 00:03:38,240
>> But there's an interesting
takeaway here.

81
00:03:38,240 --> 00:03:42,890
So if ints are limited by 32 bits and
even long longs are limited by 64

82
00:03:42,890 --> 00:03:46,180
bits, that sort of begs the question,
what if you actually want to count

83
00:03:46,180 --> 00:03:48,790
higher than 4 billion for an int?

84
00:03:48,790 --> 00:03:50,330
Well, you just use a long long.

85
00:03:50,330 --> 00:03:54,200
But what if you want to count higher
than two to the 64th, give or take?

86
00:03:54,200 --> 00:03:55,810
>> Now, that's a huge number.

87
00:03:55,810 --> 00:03:59,250
But eventually, you might actually
care about these kinds of values,

88
00:03:59,250 --> 00:04:03,070
especially if you are using a database
and starting to collect lots and lots

89
00:04:03,070 --> 00:04:06,190
and lots of data and assigning unique
numbers to each piece of that data.

90
00:04:06,190 --> 00:04:07,430
So we kind of have a problem.

91
00:04:07,430 --> 00:04:10,700
And similarly, with floating point
values-- floats or doubles--

92
00:04:10,700 --> 00:04:14,290
if you've only got a finite number of
bits, how many total numbers could you

93
00:04:14,290 --> 00:04:16,980
possibly represent?

94
00:04:16,980 --> 00:04:19,540
>> Well, it's less clear when you
involve a decimal point.

95
00:04:19,540 --> 00:04:20,899
But it's surely finite.

96
00:04:20,899 --> 00:04:24,390
If you have a finite number of bits,
a finite number of humans, a finite

97
00:04:24,390 --> 00:04:27,350
number of light bulbs, surely you can
only represent a finite number of

98
00:04:27,350 --> 00:04:28,510
floating point values.

99
00:04:28,510 --> 00:04:33,170
But how many real numbers
are there in the world?

100
00:04:33,170 --> 00:04:33,680
There's an infinite.

101
00:04:33,680 --> 00:04:37,280
So that's kind of a problem because we
don't have an infinite amount of

102
00:04:37,280 --> 00:04:39,970
memory or RAM inside of our computers.

103
00:04:39,970 --> 00:04:41,780
So some challenging things can happen.

104
00:04:41,780 --> 00:04:43,900
>> So let's go ahead and try
to express this here.

105
00:04:43,900 --> 00:04:46,240
Let me go ahead and open up gedit.

106
00:04:46,240 --> 00:04:50,360
I'm going to go ahead and save a file
called "floats0.c" just to be

107
00:04:50,360 --> 00:04:54,630
consistent with an example that is
available online, if you would like.

108
00:04:54,630 --> 00:04:58,080
And I'm going to go ahead and
define it as follows--

109
00:04:58,080 --> 00:05:01,540
I'm going to go ahead and say, int
main void, as we often do.

110
00:05:01,540 --> 00:05:07,190
>> And then in this program, I'm going to
declare myself a float, so a 32-bit

111
00:05:07,190 --> 00:05:09,700
variable called f, arbitrarily.

112
00:05:09,700 --> 00:05:13,910
And then I'm going to store in it
I don't know, one tenth, so 0.1.

113
00:05:13,910 --> 00:05:16,590
So I'm going to express that as one
divided by 10, which is perfectly

114
00:05:16,590 --> 00:05:17,790
legitimate in C.

115
00:05:17,790 --> 00:05:20,460
>> And then on the second line, I simply
want to print out that value.

116
00:05:20,460 --> 00:05:22,950
So recall that we can use
the familiar printf.

117
00:05:22,950 --> 00:05:25,420
We don't want to use %i for an int.

118
00:05:25,420 --> 00:05:28,360
We want to use %f for a float.

119
00:05:28,360 --> 00:05:33,080
And then I'm going to do backslash n,
close quote, comma, f, semicolon.

120
00:05:33,080 --> 00:05:34,400
>> So here's my program.

121
00:05:34,400 --> 00:05:35,820
There's already one bug.

122
00:05:35,820 --> 00:05:38,640
Does someone for whom this clicked
already want to point out at least

123
00:05:38,640 --> 00:05:40,220
one bug I've made?

124
00:05:40,220 --> 00:05:42,470
Yeah?

125
00:05:42,470 --> 00:05:42,800
Yeah.

126
00:05:42,800 --> 00:05:47,860
I forgot "#include <stdio.h>" at the
top, the symptom of which if I try to

127
00:05:47,860 --> 00:05:50,490
compile this is going to be that the
compiler is going to yell at me,

128
00:05:50,490 --> 00:05:52,770
saying undefined symbol or
something to that effect.

129
00:05:52,770 --> 00:05:55,360
It doesn't understand something
like printf.

130
00:05:55,360 --> 00:05:59,380
>> So I'm going to do "#include
<stdio.h>", save the file.

131
00:05:59,380 --> 00:06:00,400
And now it's in better shape.

132
00:06:00,400 --> 00:06:02,690
But I'm also going to point
out one new detail today.

133
00:06:02,690 --> 00:06:08,620
In addition to specifying place
holders like %f %i %s, you can

134
00:06:08,620 --> 00:06:12,320
sometimes influence the behavior
of that placeholder.

135
00:06:12,320 --> 00:06:15,540
For instance, in the case of a floating
point value, if I only want

136
00:06:15,540 --> 00:06:22,200
to display one decimal place after the
period, I can actually do 0.1f.

137
00:06:22,200 --> 00:06:26,830
So in other words, I separate the f and
the percent sign with 0.1, just

138
00:06:26,830 --> 00:06:30,200
telling printf, you might have a whole
bunch of numbers after the decimal

139
00:06:30,200 --> 00:06:30,930
point for me.

140
00:06:30,930 --> 00:06:32,870
But I only want to see one of them.

141
00:06:32,870 --> 00:06:36,280
>> So I'm going to go ahead now and save
this program, go into my terminal

142
00:06:36,280 --> 00:06:41,870
window, and I'm going to go ahead
and type make floats 0, enter.

143
00:06:41,870 --> 00:06:44,930
I see that somewhat cryptic line that
will begin to make more sense as we

144
00:06:44,930 --> 00:06:46,900
tease it apart this week and next.

145
00:06:46,900 --> 00:06:50,480
Now I'm going to go ahead
and run floats 0.

146
00:06:50,480 --> 00:06:52,020
And, damn.

147
00:06:52,020 --> 00:06:54,880
>> So there's another bug
here for some reason.

148
00:06:54,880 --> 00:07:02,490
I'm pretty sure that one tenth, or
one divided by 10, is not 0.0.

149
00:07:02,490 --> 00:07:04,590
Maybe I'm just not looking
at enough digits.

150
00:07:04,590 --> 00:07:08,580
So why don't I say .2 to see two
decimal places instead of just one.

151
00:07:08,580 --> 00:07:11,810
Let me go back to my terminal window
here and hit up a couple of times to

152
00:07:11,810 --> 00:07:12,840
see my history.

153
00:07:12,840 --> 00:07:15,910
Do make floats 0 again,
and then up again.

154
00:07:15,910 --> 00:07:17,730
And now enter.

155
00:07:17,730 --> 00:07:20,000
>> And now I'm pretty sure this is wrong.

156
00:07:20,000 --> 00:07:23,030
And I could do three and four, and I'm
probably going to keep seeing zeros.

157
00:07:23,030 --> 00:07:24,880
So where is the bug?

158
00:07:24,880 --> 00:07:27,910
One divided by 10 should be 0.1.

159
00:07:27,910 --> 00:07:30,310
Someone want to take a stab at what
the fundamental issue is?

160
00:07:30,310 --> 00:07:32,400
Yeah?

161
00:07:32,400 --> 00:07:33,420
They're both integers.

162
00:07:33,420 --> 00:07:33,920
So what?

163
00:07:33,920 --> 00:07:37,820
So with one divided by 10, that's
what I do in arithmetic.

164
00:07:37,820 --> 00:07:41,185
And I get 0.1.

165
00:07:41,185 --> 00:07:41,660
>> Yeah.

166
00:07:41,660 --> 00:07:43,240
And so it is indeed that issue.

167
00:07:43,240 --> 00:07:46,700
When you take an integer in a computer
and you divide it by another integer,

168
00:07:46,700 --> 00:07:50,430
the computer by default is going to
assume that you want an integer.

169
00:07:50,430 --> 00:07:54,620
The problem though, of course, is
that 0.1 is not an integer.

170
00:07:54,620 --> 00:07:55,680
It's a real number.

171
00:07:55,680 --> 00:07:59,610
And so what the computer does by
default is it just throws away

172
00:07:59,610 --> 00:08:01,070
everything after the decimal point.

173
00:08:01,070 --> 00:08:03,380
It doesn't round down or up per se.

174
00:08:03,380 --> 00:08:06,480
It just throws away everything
after the decimal point.

175
00:08:06,480 --> 00:08:07,430
And now that makes sense.

176
00:08:07,430 --> 00:08:09,740
Because now we're clearly
left with zero.

177
00:08:09,740 --> 00:08:10,250
>> But wait a minute.

178
00:08:10,250 --> 00:08:11,840
I'm not seeing an int zero.

179
00:08:11,840 --> 00:08:14,910
I'm actually seeing 0.00.

180
00:08:14,910 --> 00:08:16,340
So how do I reconcile this now?

181
00:08:16,340 --> 00:08:22,850
If one divided by 10 is zero, but I'm
seeing 0.00, where is it getting

182
00:08:22,850 --> 00:08:24,250
converted back to a real number?

183
00:08:24,250 --> 00:08:25,500
Yeah.

184
00:08:25,500 --> 00:08:29,850

185
00:08:29,850 --> 00:08:30,630
Exactly.

186
00:08:30,630 --> 00:08:35,600
>> So up here in line five, when I actually
store that 0.1, which is then

187
00:08:35,600 --> 00:08:39,549
truncated to zero, inside of a float,
that's effectively equivalent to

188
00:08:39,549 --> 00:08:42,100
storing it not as an int but,
indeed, as a float.

189
00:08:42,100 --> 00:08:46,540
Moreover, I'm then using printf to
explicitly print that number to two

190
00:08:46,540 --> 00:08:49,740
decimal places even though there
might not actually be any.

191
00:08:49,740 --> 00:08:51,020
>> So this kind of sucks, right?

192
00:08:51,020 --> 00:08:53,640
Apparently you can't do math,
at least at this level of

193
00:08:53,640 --> 00:08:55,600
precision, in a computer.

194
00:08:55,600 --> 00:08:56,930
But surely there's a solution.

195
00:08:56,930 --> 00:09:00,410
What's the simplest fix we could maybe
do, even just intuitively here to

196
00:09:00,410 --> 00:09:01,130
solve this?

197
00:09:01,130 --> 00:09:02,380
Yeah?

198
00:09:02,380 --> 00:09:04,700

199
00:09:04,700 --> 00:09:06,574
Turn the integers into--

200
00:09:06,574 --> 00:09:06,976
yeah.

201
00:09:06,976 --> 00:09:10,420
Even if I'm not quite sure what's
really going on here, if it

202
00:09:10,420 --> 00:09:13,440
fundamentally has to do with these both
being ints, well, why don't I

203
00:09:13,440 --> 00:09:18,230
make that 10.0, making this
1.0, resave the file.

204
00:09:18,230 --> 00:09:20,990
Let me go back down to the
bottom and recompile.

205
00:09:20,990 --> 00:09:23,030
Let me now rerun.

206
00:09:23,030 --> 00:09:23,420
And there--

207
00:09:23,420 --> 00:09:27,690
now, I've got my one tenth
represented as 0.10.

208
00:09:27,690 --> 00:09:28,420
>> All right.

209
00:09:28,420 --> 00:09:29,220
So that's not bad.

210
00:09:29,220 --> 00:09:31,730
And let me point out one other way
we could have solved this.

211
00:09:31,730 --> 00:09:35,580
Let me actually roll back in time
to when we had this as one

212
00:09:35,580 --> 00:09:36,680
tenth a moment ago.

213
00:09:36,680 --> 00:09:40,800
And let me go ahead and resave this file
as a different file name, just to

214
00:09:40,800 --> 00:09:41,750
have a little checkpoint.

215
00:09:41,750 --> 00:09:43,450
So that was version one.

216
00:09:43,450 --> 00:09:45,520
And now let me go ahead and
do one more version.

217
00:09:45,520 --> 00:09:48,540
We'll call this version
two zero indexed.

218
00:09:48,540 --> 00:09:51,280
>> And I'm going to instead do
this-- you know what?

219
00:09:51,280 --> 00:09:54,400
Adding dot zero works in this case.

220
00:09:54,400 --> 00:09:56,060
But suppose one were a variable.

221
00:09:56,060 --> 00:09:57,680
Supposed 10 were a variable.

222
00:09:57,680 --> 00:10:00,680
In other words, suppose that I couldn't
just hard-code .0 at the end

223
00:10:00,680 --> 00:10:02,340
of this arithmetic expression.

224
00:10:02,340 --> 00:10:05,820
Well, I can actually do something
in parentheses called casting.

225
00:10:05,820 --> 00:10:11,920
I can cast that integer 10 to a float,
and I can cast that integer one to a

226
00:10:11,920 --> 00:10:12,800
float, as well.

227
00:10:12,800 --> 00:10:17,190
Then the math that's going to be done
is effectively 1.0 divided by 10.0,

228
00:10:17,190 --> 00:10:19,250
the result of which goes
in f as before.

229
00:10:19,250 --> 00:10:26,130
So if I recompile this as make floats
2, and now floats 2, I get the same

230
00:10:26,130 --> 00:10:27,020
answer, as well.

231
00:10:27,020 --> 00:10:29,640
>> So this is a fairly contrived example,
to solve this problem

232
00:10:29,640 --> 00:10:31,400
by introducing casting.

233
00:10:31,400 --> 00:10:34,410
But in general, casting's going to be
a powerful thing, particularly for

234
00:10:34,410 --> 00:10:38,180
problem set two in a week's time, when
you want to convert one data type to

235
00:10:38,180 --> 00:10:41,800
another that at the end of the day
are represented in the same way.

236
00:10:41,800 --> 00:10:44,970
At the end of the day, every single
thing we've talked about thus far is

237
00:10:44,970 --> 00:10:46,710
just ints underneath the hood.

238
00:10:46,710 --> 00:10:48,950
Or if that's too low-level for
you, they're just numbers

239
00:10:48,950 --> 00:10:49,750
underneath the hood.

240
00:10:49,750 --> 00:10:52,850
Even characters, again, recall
from week zero, are numbers

241
00:10:52,850 --> 00:10:53,990
underneath the hood.

242
00:10:53,990 --> 00:10:57,240
>> Which is to say, we can convert between
different types of numbers if

243
00:10:57,240 --> 00:10:58,060
they're just bits.

244
00:10:58,060 --> 00:11:01,020
We can convert between numbers
and letters if they're just

245
00:11:01,020 --> 00:11:02,580
bits, and vice versa.

246
00:11:02,580 --> 00:11:07,170
And casting in this way is a mechanism
in programming that lets you forcibly

247
00:11:07,170 --> 00:11:10,970
change one data type to another.

248
00:11:10,970 --> 00:11:14,570
Unfortunately, this isn't as
straightforward as I might have liked.

249
00:11:14,570 --> 00:11:19,220
>> I'm going to go back into floats
1, which was the simpler, more

250
00:11:19,220 --> 00:11:22,830
straightforward one with
.0 added on to each.

251
00:11:22,830 --> 00:11:25,260
And just as a quick refresher,
let me go ahead and recompile

252
00:11:25,260 --> 00:11:27,670
this, make floats 2--

253
00:11:27,670 --> 00:11:30,300
sorry, this is make floats 1.

254
00:11:30,300 --> 00:11:32,050
And now let's run floats 1.

255
00:11:32,050 --> 00:11:34,810
And in the bottom, notice
that I indeed get 0.1.

256
00:11:34,810 --> 00:11:36,165
So, problem solved.

257
00:11:36,165 --> 00:11:37,280
>> But not yet.

258
00:11:37,280 --> 00:11:40,000
I'm now going to get a little curious,
and I'm going to go back into my

259
00:11:40,000 --> 00:11:41,620
printf statement and
say, you know what?

260
00:11:41,620 --> 00:11:44,090
I'd like to confirm that this
is really one tenth.

261
00:11:44,090 --> 00:11:47,890
And I'm going to want to see this
to, say, five decimal places.

262
00:11:47,890 --> 00:11:48,570
It's not a problem.

263
00:11:48,570 --> 00:11:52,020
I change the two to a five,
I recompile with make.

264
00:11:52,020 --> 00:11:53,770
I rerun it as floats 1.

265
00:11:53,770 --> 00:11:54,990
Looking pretty good.

266
00:11:54,990 --> 00:11:58,570
My sanity checks might end there, but
I'm getting a little more adventurous.

267
00:11:58,570 --> 00:12:00,330
I'm going to change 0.5 to 0.10.

268
00:12:00,330 --> 00:12:03,440
I want to see 10 digits after
the decimal place.

269
00:12:03,440 --> 00:12:09,060
>> And I'm going to go ahead and recompile
this and rerun floats 1.

270
00:12:09,060 --> 00:12:13,060
I kind of regret having tested this
further because my math is not so

271
00:12:13,060 --> 00:12:14,320
correct anymore, it seems.

272
00:12:14,320 --> 00:12:15,630
But wait a minute, maybe
that's just a fluke.

273
00:12:15,630 --> 00:12:17,810
Maybe the computer is acting
a little bit strange.

274
00:12:17,810 --> 00:12:21,810
Let me go ahead and do 20 decimal places
and reassure myself that I know

275
00:12:21,810 --> 00:12:22,540
how to do math.

276
00:12:22,540 --> 00:12:23,460
I know how to program.

277
00:12:23,460 --> 00:12:26,960
Make floats 1, recompile, and damn it.

278
00:12:26,960 --> 00:12:31,110
That is really, really getting
far from the mark.

279
00:12:31,110 --> 00:12:32,490
>> So what's going on here?

280
00:12:32,490 --> 00:12:36,050
Intuitively, based on our assumptions
earlier about the size of data types,

281
00:12:36,050 --> 00:12:38,040
what must be happening here
underneath the hood?

282
00:12:38,040 --> 00:12:39,290
Yeah?

283
00:12:39,290 --> 00:12:43,000

284
00:12:43,000 --> 00:12:43,590
Exactly.

285
00:12:43,590 --> 00:12:46,480
If you want this much precision, and
that's a heck of a lot of precision--

286
00:12:46,480 --> 00:12:48,770
20 numbers after the decimal point.

287
00:12:48,770 --> 00:12:51,990
You can't possibly represent an
arbitrary number unless you have an

288
00:12:51,990 --> 00:12:52,930
arbitrary number of bits.

289
00:12:52,930 --> 00:12:54,190
But we don't.

290
00:12:54,190 --> 00:12:57,200
For a float, we only have 32 bits.

291
00:12:57,200 --> 00:13:02,260
>> So if 32 bits can only be permuted in a
way-- just like our humans on stage,

292
00:13:02,260 --> 00:13:05,780
hands up or down-- in a finite number of
ways, there's only a finite number

293
00:13:05,780 --> 00:13:08,640
of real numbers you can represent
with those bits.

294
00:13:08,640 --> 00:13:10,500
And so the computer eventually
is going to have to

295
00:13:10,500 --> 00:13:11,730
start cutting corners.

296
00:13:11,730 --> 00:13:15,500
The computer can hide those details
from us for a little bit of time.

297
00:13:15,500 --> 00:13:18,880
But if we start poking at the numbers
and looking farther and farther at the

298
00:13:18,880 --> 00:13:23,220
trailing numbers in the whole number,
then we start to see that it's

299
00:13:23,220 --> 00:13:26,480
actually approximating the
idea of one tenth.

300
00:13:26,480 --> 00:13:29,860
>> And so it turns out, tragically, there's
an infinite number of numbers

301
00:13:29,860 --> 00:13:35,060
we cannot represent precisely in a
computer, at least with a finite

302
00:13:35,060 --> 00:13:38,030
number of bits, a finite
amount of RAM.

303
00:13:38,030 --> 00:13:41,210
Now unfortunately, this sometimes
has real-world consequences.

304
00:13:41,210 --> 00:13:45,980
If people don't quite appreciate this
or sort of take for granted the fact

305
00:13:45,980 --> 00:13:48,310
that their computer will just do what
they tell it to do and don't

306
00:13:48,310 --> 00:13:51,430
understand these underlying
representation details--

307
00:13:51,430 --> 00:13:55,290
which, frankly, in some languages are
hidden from the user, unlike in C--

308
00:13:55,290 --> 00:13:56,500
some bad things can happen.

309
00:13:56,500 --> 00:13:58,650
>> And what I thought we'd do
is take a step back.

310
00:13:58,650 --> 00:14:00,420
And this is about an
eight-minute video.

311
00:14:00,420 --> 00:14:04,200
It aired a few years ago, and it gives
insights into actually what can go

312
00:14:04,200 --> 00:14:09,290
wrong when you under-appreciate these
kinds of details in the very all-too

313
00:14:09,290 --> 00:14:10,080
real world.

314
00:14:10,080 --> 00:14:12,965
If we could dim the lights
for a few minutes.

315
00:14:12,965 --> 00:14:14,360
>> SPEAKER 1: We now return to engineering

316
00:14:14,360 --> 00:14:17,160
disasters on Modern Marvels.

317
00:14:17,160 --> 00:14:18,680
>> Computers--

318
00:14:18,680 --> 00:14:21,340
we've all come to accept the
often frustrating problems

319
00:14:21,340 --> 00:14:23,170
that go with them.

320
00:14:23,170 --> 00:14:27,570
Bugs, viruses, and software glitches
are small prices to pay for the

321
00:14:27,570 --> 00:14:28,960
convenience.

322
00:14:28,960 --> 00:14:32,040
But in high-tech and high-speed
military and space program

323
00:14:32,040 --> 00:14:38,650
applications, the smallest problem
can be magnified into disaster.

324
00:14:38,650 --> 00:14:44,480
>> On June 4, 1996, scientists prepared to
launch an unmanned Ariane 5 rocket.

325
00:14:44,480 --> 00:14:48,700
It was carrying scientific satellites
designed to establish precisely how

326
00:14:48,700 --> 00:14:53,250
the Earth's magnetic field interacts
with solar winds.

327
00:14:53,250 --> 00:14:57,540
The rocket was built for the European
Space Agency and lifted off from its

328
00:14:57,540 --> 00:14:59,906
facility on the coast
of French Guiana.

329
00:14:59,906 --> 00:15:03,660
>> JACK GANSSLE: At about 37 seconds into
the flight, they first noticed

330
00:15:03,660 --> 00:15:04,910
something was going wrong.

331
00:15:04,910 --> 00:15:08,130
The nozzles were swiveling in
a way they really shouldn't.

332
00:15:08,130 --> 00:15:12,380
Around 40 seconds into the flight,
clearly the vehicle was in trouble.

333
00:15:12,380 --> 00:15:14,400
And that's when they made a
decision to destroy it.

334
00:15:14,400 --> 00:15:18,520
The range safety officer, with
tremendous guts, pressed the button,

335
00:15:18,520 --> 00:15:23,900
blew up the rocket before it could
become a hazard to public safety.

336
00:15:23,900 --> 00:15:27,810
>> SPEAKER 1: This was the maiden voyage
of the Ariane 5, and its destruction

337
00:15:27,810 --> 00:15:32,020
took place because of a flaw embedded
in the rocket's software.

338
00:15:32,020 --> 00:15:33,980
>> JACK GANSSLE: The problem on the Ariane
was that there was a number

339
00:15:33,980 --> 00:15:36,390
that required 64 bits to express.

340
00:15:36,390 --> 00:15:39,420
And they wanted to convert
to a 16-bit number.

341
00:15:39,420 --> 00:15:43,130
They assumed that the number was never
going to be very big, that most of

342
00:15:43,130 --> 00:15:46,810
those digits in the 64-bit
number were zeros.

343
00:15:46,810 --> 00:15:48,270
They were wrong.

344
00:15:48,270 --> 00:15:51,380
>> SPEAKER 1: The inability of one software
program to accept the kind of

345
00:15:51,380 --> 00:15:55,350
number generated by another was
at the root of the failure.

346
00:15:55,350 --> 00:15:59,970
Software development had become a very
costly part of new technology.

347
00:15:59,970 --> 00:16:03,980
The Ariane 4 rocket had been very
successful, so much of the software

348
00:16:03,980 --> 00:16:07,480
created for it was also
used in the Ariane 5.

349
00:16:07,480 --> 00:16:11,980
>> PHILIP COYLE: The basic problem was
that the Ariane 5 was faster,

350
00:16:11,980 --> 00:16:13,720
accelerated faster.

351
00:16:13,720 --> 00:16:17,250
And the software hadn't
accounted for that.

352
00:16:17,250 --> 00:16:20,770
>> SPEAKER 1: The destruction of the rocket
was a huge financial disaster,

353
00:16:20,770 --> 00:16:24,200
all due to a minute software error.

354
00:16:24,200 --> 00:16:27,820
But this wasn't the first time data
conversion problems had plagued modern

355
00:16:27,820 --> 00:16:30,620
rocket technology.

356
00:16:30,620 --> 00:16:34,480
>> JACK GANSSLE: In 1991, with the start
of the first Gulf War, the Patriot

357
00:16:34,480 --> 00:16:38,610
missile experienced a similar kind
of a number conversion problem.

358
00:16:38,610 --> 00:16:44,910
As a result, 28 American soldiers were
killed and about 100 others wounded

359
00:16:44,910 --> 00:16:48,600
when the Patriot, which was supposed
to protect against incoming Scuds,

360
00:16:48,600 --> 00:16:51,630
failed to fire a missile.

361
00:16:51,630 --> 00:16:55,110
>> SPEAKER 1: When Iraq invaded Kuwait and
America launched Desert Storm in

362
00:16:55,110 --> 00:17:00,570
early 1991, Patriot missile batteries
were deployed to protect Saudi Arabia

363
00:17:00,570 --> 00:17:04,760
and Israel from Iraqi Scud
missile attacks.

364
00:17:04,760 --> 00:17:09,720
The Patriot is a US medium-range
surface-to-air system manufactured by

365
00:17:09,720 --> 00:17:11,569
the Raytheon company.

366
00:17:11,569 --> 00:17:16,410
>> THEODORE POSTOL: The size of the Patriot
interceptor itself is roughly

367
00:17:16,410 --> 00:17:17,710
20 feet long.

368
00:17:17,710 --> 00:17:20,800
And it weighs about 2000 pounds.

369
00:17:20,800 --> 00:17:22,940
And it carries a warhead of about--

370
00:17:22,940 --> 00:17:24,905
I think it's roughly 150 pounds.

371
00:17:24,905 --> 00:17:31,030
And the warhead itself is a
high explosive which has

372
00:17:31,030 --> 00:17:33,270
fragments around it.

373
00:17:33,270 --> 00:17:37,490
The casing of the warhead is designed
to act like buckshot.

374
00:17:37,490 --> 00:17:40,720
>> SPEAKER 1: The missiles are carried four
per container and are transported

375
00:17:40,720 --> 00:17:43,050
by a semi trailer.

376
00:17:43,050 --> 00:17:47,490
>> PHILIP COYLE: The Patriot anti-missile
system goes back at

377
00:17:47,490 --> 00:17:50,710
least 20 years now.

378
00:17:50,710 --> 00:17:54,350
It was originally designed as
an air defense missile to

379
00:17:54,350 --> 00:17:56,190
shoot down enemy airplanes.

380
00:17:56,190 --> 00:18:02,490
In the first Gulf War, when that war
came along, the Army wanted to use it

381
00:18:02,490 --> 00:18:05,535
to shoot down Scuds, not airplanes.

382
00:18:05,535 --> 00:18:09,310
The Iraqi air force was not
so much of a problem.

383
00:18:09,310 --> 00:18:12,450
But the Army was worried about Scuds.

384
00:18:12,450 --> 00:18:15,950
And so they tried to upgrade
the Patriot.

385
00:18:15,950 --> 00:18:18,750
>> SPEAKER 1: Intercepting an enemy missile
traveling at mach five was

386
00:18:18,750 --> 00:18:20,890
going to be challenging enough.

387
00:18:20,890 --> 00:18:25,590
But when the Patriot was rushed into
service, the Army was not aware of an

388
00:18:25,590 --> 00:18:31,710
Iraqi modification that made their
Scuds nearly impossible to hit.

389
00:18:31,710 --> 00:18:35,240
>> THEODORE POSTOL: What happened
is the Scuds that were

390
00:18:35,240 --> 00:18:36,570
coming in were unstable.

391
00:18:36,570 --> 00:18:37,532
They were wobbling.

392
00:18:37,532 --> 00:18:43,220
The reason for this was the Iraqis, in
order to get 600 kilometers out of a

393
00:18:43,220 --> 00:18:47,530
300-kilometer-range missile, took
weight out of the front warhead.

394
00:18:47,530 --> 00:18:49,290
They made the warhead lighter.

395
00:18:49,290 --> 00:18:53,110
So now the Patriot's trying
to come at the Scud.

396
00:18:53,110 --> 00:18:56,470
And most of the time, the overwhelming
majority of the time, it would just

397
00:18:56,470 --> 00:18:58,730
fly by the Scud.

398
00:18:58,730 --> 00:19:01,760
>> SPEAKER 1: Once the Patriot system
operators realized the Patriot missed

399
00:19:01,760 --> 00:19:06,690
its target, they detonated the Patriot's
warhead to avoid possible

400
00:19:06,690 --> 00:19:10,300
casualties if it was allowed
to fall to the ground.

401
00:19:10,300 --> 00:19:14,540
>> THEODORE POSTOL: That was what most
people saw as big fireballs in the sky

402
00:19:14,540 --> 00:19:20,350
and misunderstood as intercepts
of Scud warheads.

403
00:19:20,350 --> 00:19:23,320
>> SPEAKER 1: Although in the night skies
Patriots appeared to be successfully

404
00:19:23,320 --> 00:19:27,530
destroying Scuds, at Dhahran there
could be no mistake about its

405
00:19:27,530 --> 00:19:29,140
performance.

406
00:19:29,140 --> 00:19:34,180
There, the Patriot's radar system lost
track of an incoming Scud and never

407
00:19:34,180 --> 00:19:36,380
launched due to a software flaw.

408
00:19:36,380 --> 00:19:39,890

409
00:19:39,890 --> 00:19:42,700
>> It was the Israelis who first discovered
that the longer the system

410
00:19:42,700 --> 00:19:48,020
was on, the greater the time discrepancy
became due to a clock

411
00:19:48,020 --> 00:19:50,470
embedded in the system's computer.

412
00:19:50,470 --> 00:19:54,640
>> JACK GANSSLE: About two weeks before the
tragedy in Dhahran, the Israelis

413
00:19:54,640 --> 00:19:58,440
reported to the Defense Department
that the system was losing time.

414
00:19:58,440 --> 00:20:01,280
After about eight hours of running,
they noticed that the system is

415
00:20:01,280 --> 00:20:03,530
becoming noticeably less accurate.

416
00:20:03,530 --> 00:20:07,710
The Defense Department responded by
telling all of the Patriot batteries

417
00:20:07,710 --> 00:20:10,500
to not leave the systems
on for a long time.

418
00:20:10,500 --> 00:20:12,430
They never said what a long time was.

419
00:20:12,430 --> 00:20:13,330
Eight hours?

420
00:20:13,330 --> 00:20:13,810
10 hours?

421
00:20:13,810 --> 00:20:14,990
1,000 hours?

422
00:20:14,990 --> 00:20:17,150
Nobody knew.

423
00:20:17,150 --> 00:20:20,220
>> SPEAKER 1: The Patriot battery stationed
at the barracks at Dhahran

424
00:20:20,220 --> 00:20:24,660
and its flawed internal clock had been
on over 100 hours on the night of

425
00:20:24,660 --> 00:20:27,470
February 25th.

426
00:20:27,470 --> 00:20:31,770
>> JACK GANSSLE: It tracked time to an
accuracy of about a tenth of a second.

427
00:20:31,770 --> 00:20:34,480
Now, a tenth of a second is an
interesting number because it can't be

428
00:20:34,480 --> 00:20:39,940
expressed in binary exactly, which means
it can't be expressed exactly in

429
00:20:39,940 --> 00:20:42,500
any modern digital computer.

430
00:20:42,500 --> 00:20:46,920
It's hard to believe, but
use this as an example.

431
00:20:46,920 --> 00:20:49,000
Let's take the number one third.

432
00:20:49,000 --> 00:20:53,150
One third cannot be expressed
in decimal exactly.

433
00:20:53,150 --> 00:20:57,500
One third is 0.333 going
on for infinity.

434
00:20:57,500 --> 00:21:02,270
There's no way to do that with
absolute accuracy in decimal.

435
00:21:02,270 --> 00:21:05,370
That's exactly the same kind of problem
that happened in the Patriot.

436
00:21:05,370 --> 00:21:09,880
The longer the system ran, the
worse the time error became.

437
00:21:09,880 --> 00:21:13,840
>> SPEAKER 1: After 100 hours of operation,
the error in time was only

438
00:21:13,840 --> 00:21:16,140
about one third of a second.

439
00:21:16,140 --> 00:21:20,800
But in terms of targeting a missile
traveling at mach five, it resulted in

440
00:21:20,800 --> 00:21:24,410
a tracking error of over 600 meters.

441
00:21:24,410 --> 00:21:27,670
It would be a fatal error for
the soldiers at Dhahran.

442
00:21:27,670 --> 00:21:33,450
>> THEODORE POSTOL: What happened is a
Scud launch was detected by early

443
00:21:33,450 --> 00:21:34,280
warning satellites.

444
00:21:34,280 --> 00:21:38,550
And they knew that the Scud was coming
in their general direction.

445
00:21:38,550 --> 00:21:41,000
They didn't know where it was coming.

446
00:21:41,000 --> 00:21:43,900
>> SPEAKER 1: It was now up to the radar
component of the Patriot system

447
00:21:43,900 --> 00:21:48,910
defending Dhahran to locate and keep
track of the incoming enemy missile.

448
00:21:48,910 --> 00:21:50,580
>> JACK GANSSLE: The radar
was very smart.

449
00:21:50,580 --> 00:21:53,770
It would actually track the position of
the Scud and then predict where it

450
00:21:53,770 --> 00:21:57,160
probably would be the next time
the radar sent a pulse out.

451
00:21:57,160 --> 00:21:58,870
That was called the range gate.

452
00:21:58,870 --> 00:22:04,020
>> THEODORE POSTOL: Then once the Patriot
decides enough time has passed to go

453
00:22:04,020 --> 00:22:09,420
back and check the next location for
this detected object, it goes back.

454
00:22:09,420 --> 00:22:14,450
So when it went back to the wrong
place, it then sees no object.

455
00:22:14,450 --> 00:22:18,200
And it decides that there was no object,
it was a false detection, and

456
00:22:18,200 --> 00:22:19,680
drops the track.

457
00:22:19,680 --> 00:22:22,970
>> SPEAKER 1: The incoming Scud disappeared
from the radar screen, and

458
00:22:22,970 --> 00:22:26,050
seconds later it slammed
into the barracks.

459
00:22:26,050 --> 00:22:31,950
The Scud killed 28 and was the last one
fired during the first Gulf War.

460
00:22:31,950 --> 00:22:37,700
Tragically, the updated software arrived
at Dhahran the following day.

461
00:22:37,700 --> 00:22:41,800
The software flaw had been fixed,
closing one chapter in the troubled

462
00:22:41,800 --> 00:22:43,690
history of the Patriot missile.

463
00:22:43,690 --> 00:22:46,780

464
00:22:46,780 --> 00:22:50,710
>> Patriot is actually an acronym
for Phased Array TRacking

465
00:22:50,710 --> 00:22:51,960
Intercept Of Target.

466
00:22:51,960 --> 00:22:54,650

467
00:22:54,650 --> 00:23:00,840
>> DAVID J MALAN: All right, so a
sobering example, to be sure.

468
00:23:00,840 --> 00:23:03,430
And fortunately, these lower level
bugs are not something that we'll

469
00:23:03,430 --> 00:23:06,220
typically have to appreciate, certainly
not with some of our

470
00:23:06,220 --> 00:23:07,360
earliest of programs.

471
00:23:07,360 --> 00:23:10,450
Rather, most of the bugs you'll
encounter will be logical in nature,

472
00:23:10,450 --> 00:23:12,900
syntactic in nature whereby the
code just doesn't work right.

473
00:23:12,900 --> 00:23:14,140
And you know it pretty fast.

474
00:23:14,140 --> 00:23:16,850
>> But particularly when we get to the
end of the semester, it's going to

475
00:23:16,850 --> 00:23:20,620
become more and more of a possibility to
really think hard about the design

476
00:23:20,620 --> 00:23:22,960
of your programs and the underlying
representation

477
00:23:22,960 --> 00:23:24,520
there, too, of the data.

478
00:23:24,520 --> 00:23:28,010
For instance, we'll introduce MySQL,
which is a popular database engine

479
00:23:28,010 --> 00:23:30,850
that you can use with websites to
store data on the back end.

480
00:23:30,850 --> 00:23:34,630
And you'll have to start to decide at
the end of the semester not only what

481
00:23:34,630 --> 00:23:38,790
types of data along these lines to use
but exactly how many bits to use,

482
00:23:38,790 --> 00:23:42,740
whether or not you want to store dates
as dates and times as times, and also

483
00:23:42,740 --> 00:23:46,890
things like how big do you want the
unique IDs to be for, say, the users

484
00:23:46,890 --> 00:23:47,680
in your database.

485
00:23:47,680 --> 00:23:51,210
>> In fact, if some of you have had
Facebook accounts for quite some time,

486
00:23:51,210 --> 00:23:53,680
and you know how to get access
to your User ID--

487
00:23:53,680 --> 00:23:57,930
which sometimes shows up in your
profile's URL unless you've chosen a

488
00:23:57,930 --> 00:24:02,070
nickname for the URL, or if you've
used Facebook's Graph API, the

489
00:24:02,070 --> 00:24:05,510
publicly available API by which you
can ask Facebook for raw data--

490
00:24:05,510 --> 00:24:07,580
you can see what your numeric ID is.

491
00:24:07,580 --> 00:24:10,880
And some years ago, Facebook essentially
had to change from using

492
00:24:10,880 --> 00:24:15,980
the equivalent of ints to using long
long because over time as users come

493
00:24:15,980 --> 00:24:19,780
and go and create lots of accounts and
fake accounts, even they very easily

494
00:24:19,780 --> 00:24:24,630
were able to exhaust something like a 4
billion possible value like an int.

495
00:24:24,630 --> 00:24:28,340
>> So more on those kinds of issues
down the road, as well.

496
00:24:28,340 --> 00:24:30,750
All right, so that was casting.

497
00:24:30,750 --> 00:24:31,670
That was imprecision.

498
00:24:31,670 --> 00:24:32,730
A couple of quick announcements.

499
00:24:32,730 --> 00:24:35,710
So sections formally begin this coming
Sunday, Monday, Tuesday.

500
00:24:35,710 --> 00:24:39,080
You'll hear via email later this week
as to your section assignment.

501
00:24:39,080 --> 00:24:42,570
And you'll also hear at that point how
to change your section if your

502
00:24:42,570 --> 00:24:45,660
schedule has now changed or your
comfort level has now changed.

503
00:24:45,660 --> 00:24:49,380
Meanwhile P-set one and hacker one are
due this Thursday with the option to

504
00:24:49,380 --> 00:24:52,450
extend that deadline per the
specifications to Friday

505
00:24:52,450 --> 00:24:53,830
in a typical way.

506
00:24:53,830 --> 00:24:57,500
>> Realize that included with the problem
set specifications are instructions on

507
00:24:57,500 --> 00:25:02,770
how to use the CS50 appliance, make,
as well as some CS50 specific tools

508
00:25:02,770 --> 00:25:06,540
like style50, which can provide you
with feedback dynamically on the

509
00:25:06,540 --> 00:25:10,230
quality of your code style and also
check50, which can provide you with

510
00:25:10,230 --> 00:25:13,160
dynamic feedback as to your
code's correctness.

511
00:25:13,160 --> 00:25:16,850
Forgive that we're still ironing
out a few kinks with check50.

512
00:25:16,850 --> 00:25:21,490
A few of your classmates who did start
around four AM on Friday night when

513
00:25:21,490 --> 00:25:25,130
the spec went up have noticed since then
a few bugs that we are working

514
00:25:25,130 --> 00:25:29,010
through, and apologies for anyone who
has experienced undue frustrations.

515
00:25:29,010 --> 00:25:30,340
The fault is mine.

516
00:25:30,340 --> 00:25:34,080
But we'll follow up on the CS50
discuss when that is resolved.

517
00:25:34,080 --> 00:25:35,700
>> So a word on scores themselves.

518
00:25:35,700 --> 00:25:38,990
So it'll be a week or two before you
start to get feedback on problem sets

519
00:25:38,990 --> 00:25:40,640
because you don't yet have
a teaching fellow.

520
00:25:40,640 --> 00:25:44,510
And even then, we will start to evaluate
the C problem sets before we

521
00:25:44,510 --> 00:25:46,970
go back and evaluate Scratch so
that you get more relevant

522
00:25:46,970 --> 00:25:48,150
feedback more quickly.

523
00:25:48,150 --> 00:25:51,870
But in general per the syllabus, CS50
problem sets are evaluated along the

524
00:25:51,870 --> 00:25:53,580
following four axes--

525
00:25:53,580 --> 00:25:55,760
scope, correctness, design, and style.

526
00:25:55,760 --> 00:25:59,210
>> Scope is going to be a number typically
between zero and five that

527
00:25:59,210 --> 00:26:01,830
captures how much of the
piece that you bit off.

528
00:26:01,830 --> 00:26:03,750
Typically, you want this to be five.

529
00:26:03,750 --> 00:26:05,300
You at least tried everything.

530
00:26:05,300 --> 00:26:09,330
And notice it's a multiplicative factor
so that doing only part of the

531
00:26:09,330 --> 00:26:12,520
problem set is not the best strategy.

532
00:26:12,520 --> 00:26:15,610
>> Meanwhile, more obvious is the
importance of correctness--

533
00:26:15,610 --> 00:26:18,620
just is your program correct with
respect to the specification?

534
00:26:18,620 --> 00:26:21,510
This is weighted deliberately more
heavily than the other two axes by a

535
00:26:21,510 --> 00:26:24,450
factor of three because we recognize
that typically you're going to spend a

536
00:26:24,450 --> 00:26:28,600
lot more time chasing down some bugs,
getting your code to work, than you

537
00:26:28,600 --> 00:26:31,540
are indenting it and choosing
appropriate variable names and the

538
00:26:31,540 --> 00:26:33,800
like, which is on the other end
of the spectrum of style.

539
00:26:33,800 --> 00:26:36,160
>> That's not to say style is not
important, and we'll preach it over

540
00:26:36,160 --> 00:26:37,920
time both in lectures and in sections.

541
00:26:37,920 --> 00:26:40,520
Style refers to the aesthetics
of your code.

542
00:26:40,520 --> 00:26:43,980
Have you chosen well-named variables
that are short but somewhat

543
00:26:43,980 --> 00:26:44,680
descriptive?

544
00:26:44,680 --> 00:26:47,840
Is your code indented as you've seen in
lecture and in a manner consistent

545
00:26:47,840 --> 00:26:49,070
with style50?

546
00:26:49,070 --> 00:26:51,220
>> Lastly is design right
there in the middle.

547
00:26:51,220 --> 00:26:54,090
Design is the harder one to put a
finger on because it's much more

548
00:26:54,090 --> 00:26:55,000
subjective.

549
00:26:55,000 --> 00:26:58,610
But it's perhaps the most important of
the three axes in terms of pedagogical

550
00:26:58,610 --> 00:27:02,050
value over time and that this will be
the teaching fellow's opportunity to

551
00:27:02,050 --> 00:27:04,110
provide you with qualitative feedback.

552
00:27:04,110 --> 00:27:08,100
Indeed, in CS50 even though we do have
these formulas and scores, at the end

553
00:27:08,100 --> 00:27:11,350
of the day these are very deliberately
very small buckets-- point values

554
00:27:11,350 --> 00:27:13,460
between zero and three
and zero and five.

555
00:27:13,460 --> 00:27:17,800
We don't try to draw very coarse lines
between problem sets or between

556
00:27:17,800 --> 00:27:21,490
students but rather focus as much as
we can on qualitative, longhand

557
00:27:21,490 --> 00:27:25,490
feedback, either typed or verbal from
your particular teaching fellow,

558
00:27:25,490 --> 00:27:27,050
whom you'll get to know quite well.

559
00:27:27,050 --> 00:27:32,340
>> But in general, those are the weights
that the various axes will have.

560
00:27:32,340 --> 00:27:35,480
Meanwhile, too, it's worth keeping in
mind that you should not assume that a

561
00:27:35,480 --> 00:27:38,870
three out of five is a 60% and
therefore roughly failing.

562
00:27:38,870 --> 00:27:41,410
Three is deliberately meant to be
sort of middle of the road good.

563
00:27:41,410 --> 00:27:43,480
If you're getting threes at the
beginning of the semester, that's

564
00:27:43,480 --> 00:27:46,340
indeed meant to be a good
place to begin.

565
00:27:46,340 --> 00:27:50,510
If you're getting twos, fairs, there's
definitely some work to pay a little

566
00:27:50,510 --> 00:27:53,250
more attention, to take advantage
of sections and office hours.

567
00:27:53,250 --> 00:27:54,590
>> If you're getting fours
and fives, great.

568
00:27:54,590 --> 00:27:57,430
But really, we hope to see trajectories
among students-- very

569
00:27:57,430 --> 00:28:00,575
individualized per student, but starting
the semester here in sort of

570
00:28:00,575 --> 00:28:04,100
the two to the three range but ending
up here in the four to five range.

571
00:28:04,100 --> 00:28:05,440
That's what we're really looking for.

572
00:28:05,440 --> 00:28:09,590
And we do keep in mind the delta that
you exhibit between week zero and week

573
00:28:09,590 --> 00:28:12,170
12 when I'm doing grades.

574
00:28:12,170 --> 00:28:16,380
It doesn't matter to us absolutely how
you fare at the beginning if your

575
00:28:16,380 --> 00:28:19,330
trajectory is indeed
upward and strong.

576
00:28:19,330 --> 00:28:24,000
>> Academic honesty-- so let me put on my
more serious voice for just a moment.

577
00:28:24,000 --> 00:28:28,510
So this course has the distinction of
sending more students than any other

578
00:28:28,510 --> 00:28:30,950
in history to the ad board, I believe.

579
00:28:30,950 --> 00:28:34,220
We have sort of lost count at this
point of how often this happens.

580
00:28:34,220 --> 00:28:37,090
And that's not because students in 50
are any more dishonest than their

581
00:28:37,090 --> 00:28:38,690
classmates elsewhere.

582
00:28:38,690 --> 00:28:42,800
But realize, too, that we are very good
at detecting this sort of thing.

583
00:28:42,800 --> 00:28:45,920
>> And that is the advantage that a
computer science class has in that we

584
00:28:45,920 --> 00:28:49,110
can and we do compare all students'
problem sets pair-wise against every

585
00:28:49,110 --> 00:28:51,470
other, not only this year
but all prior years.

586
00:28:51,470 --> 00:28:55,080
We have the ability, like students in
the class, to Google and to find code

587
00:28:55,080 --> 00:28:57,440
on sites like GitHub and
discussion forums.

588
00:28:57,440 --> 00:29:00,840
There are absolutely solutions to CS50's
p-sets floating around there.

589
00:29:00,840 --> 00:29:02,710
But if you can find them,
we can find them.

590
00:29:02,710 --> 00:29:07,130
And all of this is very much automated
and easy and sad for us to find.

591
00:29:07,130 --> 00:29:10,990
>> But I want to emphasize, too, that the
course's academic honesty policy is

592
00:29:10,990 --> 00:29:13,960
very much meant to be very much
the opposite of that spirit.

593
00:29:13,960 --> 00:29:17,506
Indeed, this year we've rephrased things
in the syllabus to be this, dot

594
00:29:17,506 --> 00:29:19,790
dot dot, with more detail
in the syllabus.

595
00:29:19,790 --> 00:29:22,860
But the overarching theme in the course
really is to be reasonable.

596
00:29:22,860 --> 00:29:26,160
We recognize that there is a significant
amount of pedagogical

597
00:29:26,160 --> 00:29:30,550
value in collaborating, to some extent,
with classmates, whereby you

598
00:29:30,550 --> 00:29:33,700
two or you three or you more are
standing at a white board

599
00:29:33,700 --> 00:29:35,670
whiteboarding, so to
speak, your ideas--

600
00:29:35,670 --> 00:29:39,480
writing out pseudocode in pictures,
diagramming what should Mario be if

601
00:29:39,480 --> 00:29:41,350
you were to write it first
in pseudocode.

602
00:29:41,350 --> 00:29:43,240
What should the greedy algorithm--

603
00:29:43,240 --> 00:29:46,100
how should it behave per
problem set one?

604
00:29:46,100 --> 00:29:50,440
>> And so realize that behavior
that we encourage is very

605
00:29:50,440 --> 00:29:51,470
much along those lines.

606
00:29:51,470 --> 00:29:53,890
And in the syllabus, you'll see a
whole bunch of bullets under a

607
00:29:53,890 --> 00:29:57,740
reasonable category and a not reasonable
category that helps us help

608
00:29:57,740 --> 00:30:00,740
you wrap your mind around where
we do draw that line.

609
00:30:00,740 --> 00:30:04,340
And in general, a decent rule of thumb
is that if you are struggling to solve

610
00:30:04,340 --> 00:30:07,990
some bug and your friend or classmate
is sitting next to you, it is

611
00:30:07,990 --> 00:30:11,530
reasonable for you to show him or her
your code and say, hey, can you help

612
00:30:11,530 --> 00:30:13,700
me figure out what's going wrong here?

613
00:30:13,700 --> 00:30:17,110
>> We don't typically embrace
the opposite side.

614
00:30:17,110 --> 00:30:20,730
It is not a correct response for your
friend or classmate here to say, oh,

615
00:30:20,730 --> 00:30:22,510
just look at mine and figure
it out from that.

616
00:30:22,510 --> 00:30:24,400
That is sort of unreasonable.

617
00:30:24,400 --> 00:30:27,750
But having someone else, another brain,
another pair of eyes look at

618
00:30:27,750 --> 00:30:31,620
your screen or look at your code
and say, are you sure you want

619
00:30:31,620 --> 00:30:32,760
to have a loop here?

620
00:30:32,760 --> 00:30:34,800
Or are you sure you want
that semicolon here?

621
00:30:34,800 --> 00:30:37,090
Or oh, that error message means this.

622
00:30:37,090 --> 00:30:39,580
Those are very reasonable and
encouraged behaviors.

623
00:30:39,580 --> 00:30:44,010
>> The cases to which I was alluding to
earlier boil down to when students are

624
00:30:44,010 --> 00:30:47,350
late at night making poor judgment
decisions and emailing their code to

625
00:30:47,350 --> 00:30:50,130
someone else or just saying,
here, it's in Dropbox or

626
00:30:50,130 --> 00:30:51,610
Googling late at night.

627
00:30:51,610 --> 00:30:54,880
And so I would encourage and beg of you,
if you do have those inevitable

628
00:30:54,880 --> 00:30:58,450
moments of stress, you're bumping up
against the deadline, you have no late

629
00:30:58,450 --> 00:31:01,490
day since it's already Friday at that
point, email the course's heads or

630
00:31:01,490 --> 00:31:02,330
myself directly.

631
00:31:02,330 --> 00:31:04,790
Say, listen, I'm at my
breaking point here.

632
00:31:04,790 --> 00:31:06,660
Let's have a conversation
and figure it out.

633
00:31:06,660 --> 00:31:10,400
Resorting to the web or some other not
reasonable behavior is never the

634
00:31:10,400 --> 00:31:13,070
solution, and too many of your
classmates are no longer here on

635
00:31:13,070 --> 00:31:15,150
campus because of that poor judgment.

636
00:31:15,150 --> 00:31:17,840
But it's very easy to skirt that line.

637
00:31:17,840 --> 00:31:22,950
>> And here is a little picture to cheer
you up from Reddit so that now

638
00:31:22,950 --> 00:31:25,720
everything will be OK.

639
00:31:25,720 --> 00:31:30,210
>> So a quick recap, then,
of where we left off.

640
00:31:30,210 --> 00:31:33,690
So last week, recall that we introduced
conditions, not in Scratch

641
00:31:33,690 --> 00:31:34,880
but in C this time.

642
00:31:34,880 --> 00:31:38,300
And there was some new syntax but
really no new ideas per se.

643
00:31:38,300 --> 00:31:42,630
We had Boolean expressions that we could
or together with two vertical

644
00:31:42,630 --> 00:31:46,490
bars or and together with two
ampersands, saying that both the left

645
00:31:46,490 --> 00:31:49,990
and the right must be true
for this to execute.

646
00:31:49,990 --> 00:31:53,150
Then we had switches, which we looked
at briefly, but I propose are really

647
00:31:53,150 --> 00:31:56,830
just different syntax for achieving the
same kind of goal if you know in

648
00:31:56,830 --> 00:31:59,270
advance what your cases
are going to be.

649
00:31:59,270 --> 00:32:00,160
>> We looked at loops.

650
00:32:00,160 --> 00:32:03,340
A for loop is maybe the most common,
or at least the one that people

651
00:32:03,340 --> 00:32:05,330
typically reach for instinctively.

652
00:32:05,330 --> 00:32:08,240
Even though it looks a little cryptic,
you'll see many, many examples of this

653
00:32:08,240 --> 00:32:11,590
before long, as you have
already late last week.

654
00:32:11,590 --> 00:32:14,280
While loops can similarly
achieve the same thing.

655
00:32:14,280 --> 00:32:17,550
But if you want to do any incrementation
or updating of

656
00:32:17,550 --> 00:32:20,230
variables' values, you have to
do it more manually than the

657
00:32:20,230 --> 00:32:22,440
for loop above allows.

658
00:32:22,440 --> 00:32:25,310
And then there's the do-while loop,
which allows us to do something at

659
00:32:25,310 --> 00:32:28,460
least once while something
else is true.

660
00:32:28,460 --> 00:32:31,550
And this is particularly good for
programs or for games where you want

661
00:32:31,550 --> 00:32:33,810
to prompt the user for something
at least once.

662
00:32:33,810 --> 00:32:37,110
And then if he or she doesn't cooperate,
you might want to do it

663
00:32:37,110 --> 00:32:38,420
again and again.

664
00:32:38,420 --> 00:32:41,270
>> With variables, meanwhile, we had lines
of code like this, which could

665
00:32:41,270 --> 00:32:41,950
be two lines.

666
00:32:41,950 --> 00:32:44,830
You could declare an int called
counter, semicolon.

667
00:32:44,830 --> 00:32:47,660
Or you can just declare and
define it, so to speak.

668
00:32:47,660 --> 00:32:49,950
Give it a value at the same time.

669
00:32:49,950 --> 00:32:51,890
>> And then lastly, we talked
about functions.

670
00:32:51,890 --> 00:32:54,270
And this was a nice example in
the sense that it illustrates

671
00:32:54,270 --> 00:32:55,840
two types of functions.

672
00:32:55,840 --> 00:32:59,030
One is GetString(), which, again,
gets a string from the user.

673
00:32:59,030 --> 00:33:02,040
But GetString() is kind of interesting,
so far as we've used it,

674
00:33:02,040 --> 00:33:05,620
because we've always used it with
something on the left-hand side of an

675
00:33:05,620 --> 00:33:06,600
equal sign.

676
00:33:06,600 --> 00:33:09,830
That is to say that GetString()
returns a value.

677
00:33:09,830 --> 00:33:11,970
It returns, of course, a string.

678
00:33:11,970 --> 00:33:15,130
And then on the left-hand side, we're
simply saving that string inside of a

679
00:33:15,130 --> 00:33:16,580
variable called name.

680
00:33:16,580 --> 00:33:21,100
>> This is different, in a sense, from
printf because printf, at least in our

681
00:33:21,100 --> 00:33:23,540
usage here, does not return anything.

682
00:33:23,540 --> 00:33:24,960
As an aside, it does return something.

683
00:33:24,960 --> 00:33:26,380
We just don't care what it is.

684
00:33:26,380 --> 00:33:29,090
But it does have what's
called a side effect.

685
00:33:29,090 --> 00:33:31,840
And what is that side effect in every
case we've seen thus far?

686
00:33:31,840 --> 00:33:34,720
What does printf do?

687
00:33:34,720 --> 00:33:37,780
It prints something to the screen,
displays text or numbers or something

688
00:33:37,780 --> 00:33:38,380
on the screen.

689
00:33:38,380 --> 00:33:41,170
And that's just considered a side effect
because it's not really handing

690
00:33:41,170 --> 00:33:41,900
it back to me.

691
00:33:41,900 --> 00:33:44,770
It's not an answer inside of
a black box that I can then

692
00:33:44,770 --> 00:33:46,130
reach into and grab.

693
00:33:46,130 --> 00:33:49,160
It's just doing it on its own, much
like Colton was plugged into this

694
00:33:49,160 --> 00:33:52,560
black box last week, and he somehow
magically was drawing on the board

695
00:33:52,560 --> 00:33:54,500
without me actually involved.

696
00:33:54,500 --> 00:33:55,560
That would be a side effect.

697
00:33:55,560 --> 00:33:59,100
But if I actually had to reach back in
here and say, oh, here is the string

698
00:33:59,100 --> 00:34:02,040
from the user, that would
be a return value.

699
00:34:02,040 --> 00:34:05,650
>> And thus far we've only used functions
that other people have written.

700
00:34:05,650 --> 00:34:09,219
But we can actually do these
kinds of things ourselves.

701
00:34:09,219 --> 00:34:12,730
So I'm going to go into the
CS50 appliance again.

702
00:34:12,730 --> 00:34:16,020
Let me close the tab that we
had open a moment ago.

703
00:34:16,020 --> 00:34:18,530
And let me go ahead and
create a new file.

704
00:34:18,530 --> 00:34:22,400
And I'm going to go ahead and
call this one positive.c.

705
00:34:22,400 --> 00:34:24,770
So I want to do something with
positive numbers here.

706
00:34:24,770 --> 00:34:27,219
So I'm going to go ahead and do int--

707
00:34:27,219 --> 00:34:28,000
sorry--

708
00:34:28,000 --> 00:34:31,840
#include <stdio.h>.

709
00:34:31,840 --> 00:34:34,280
Let's not make that same
mistake as before.

710
00:34:34,280 --> 00:34:40,020
Int main (void), open curly
brace, closed curly brace.

711
00:34:40,020 --> 00:34:41,639
>> And now I want to do the following.

712
00:34:41,639 --> 00:34:44,600
I want to write a program that
insists that the user gives

713
00:34:44,600 --> 00:34:46,770
me a positive integer.

714
00:34:46,770 --> 00:34:50,969
So there is no GetPositiveInt function
in the CS50 library.

715
00:34:50,969 --> 00:34:52,610
There's only GetInt().

716
00:34:52,610 --> 00:34:55,790
But that's OK because I have the
constructs with which I can impose a

717
00:34:55,790 --> 00:34:59,360
little more constraint on that value.

718
00:34:59,360 --> 00:35:00,990
I could do something like this.

719
00:35:00,990 --> 00:35:02,780
>> So int n--

720
00:35:02,780 --> 00:35:04,920
and if you're typing along, just realize
I'm going to go back and

721
00:35:04,920 --> 00:35:06,430
change some things in a moment--

722
00:35:06,430 --> 00:35:09,960
so int n equals GetInt().

723
00:35:09,960 --> 00:35:11,780
And that's going to put
an int inside of n.

724
00:35:11,780 --> 00:35:13,830
And let me be a bit more descriptive.

725
00:35:13,830 --> 00:35:23,270
Let me say something like I demand that
you give me a positive integer.

726
00:35:23,270 --> 00:35:23,550
>> All right.

727
00:35:23,550 --> 00:35:25,250
So just a little bit of instructions.

728
00:35:25,250 --> 00:35:26,270
And now what can I do?

729
00:35:26,270 --> 00:35:29,840
Well, I already know from my simple
conditions or branches, just like I

730
00:35:29,840 --> 00:35:36,100
had in Scratch, I could say something
like if n is less than or equal to

731
00:35:36,100 --> 00:35:44,460
zero, then I want to do something
like, that is not positive.

732
00:35:44,460 --> 00:35:45,560
And then I could do--

733
00:35:45,560 --> 00:35:47,310
OK, but I really want to get that int.

734
00:35:47,310 --> 00:35:52,020
So I could go up here and I could kind
of copy this and indent this.

735
00:35:52,020 --> 00:35:52,570
And then, OK.

736
00:35:52,570 --> 00:35:56,990
So if n is less than or
equal to zero do this.

737
00:35:56,990 --> 00:35:58,900
>> Now, what if the user
doesn't cooperate?

738
00:35:58,900 --> 00:36:01,560
Well, then I'm going to
borrow this here.

739
00:36:01,560 --> 00:36:03,130
And then I go in here
and here and here.

740
00:36:03,130 --> 00:36:06,420
So this is clearly not
the solution, right?

741
00:36:06,420 --> 00:36:07,810
Because there's no end in sight.

742
00:36:07,810 --> 00:36:13,100
If I want to demand that the user gives
me a positive integer, I can

743
00:36:13,100 --> 00:36:14,150
actually get the int.

744
00:36:14,150 --> 00:36:15,620
I can then check for that int.

745
00:36:15,620 --> 00:36:18,570
But then I want to check it again and
check it again and check it again.

746
00:36:18,570 --> 00:36:21,680
So obviously, what's the better
construct to be using here?

747
00:36:21,680 --> 00:36:22,840
All right, so some kind of loop.

748
00:36:22,840 --> 00:36:25,430
>> So I'm going to get rid
of almost all of this.

749
00:36:25,430 --> 00:36:27,320
And I want to get this
int at least once.

750
00:36:27,320 --> 00:36:28,890
So I'm going to say do--

751
00:36:28,890 --> 00:36:32,110
and I'll come back to the
while in just a moment--

752
00:36:32,110 --> 00:36:33,050
now, do what?

753
00:36:33,050 --> 00:36:35,860
I'm going to do int n gets GetInt().

754
00:36:35,860 --> 00:36:36,080
OK.

755
00:36:36,080 --> 00:36:37,250
So that's pretty good.

756
00:36:37,250 --> 00:36:39,750
And now how often do
I want to do this?

757
00:36:39,750 --> 00:36:45,770
>> Let me put the printf inside of the loop
so I can demand again and again,

758
00:36:45,770 --> 00:36:46,740
if need be.

759
00:36:46,740 --> 00:36:49,720
And what do I want this
while condition to do?

760
00:36:49,720 --> 00:36:53,870
I want to keep doing this
while what is the case?

761
00:36:53,870 --> 00:36:54,125
Yeah.

762
00:36:54,125 --> 00:36:55,390
N is less than or equal to zero.

763
00:36:55,390 --> 00:36:58,180
So already, we've significantly
cleaned this code up.

764
00:36:58,180 --> 00:37:00,700
We've borrowed a very simple construct--
the do-while loop.

765
00:37:00,700 --> 00:37:04,690
I've stolen just the important lines
of code that I started copying and

766
00:37:04,690 --> 00:37:05,960
pasting, which was not wise.

767
00:37:05,960 --> 00:37:09,790
And so now I'm going to actually paste
it in here and just do it once.

768
00:37:09,790 --> 00:37:12,990
>> And now what do I want to do at
the very end of this program?

769
00:37:12,990 --> 00:37:16,810
I'll just say something simple
like, thanks for the-- and

770
00:37:16,810 --> 00:37:18,980
I'll do %i for int--

771
00:37:18,980 --> 00:37:23,270
backslash n, comma, and then
plug in n, semicolon.

772
00:37:23,270 --> 00:37:23,910
>> All right.

773
00:37:23,910 --> 00:37:27,290
So let's see what happens now
when I run this program.

774
00:37:27,290 --> 00:37:30,600
I'm going to go ahead and
do make positive.

775
00:37:30,600 --> 00:37:30,880
Damn.

776
00:37:30,880 --> 00:37:31,600
A few errors.

777
00:37:31,600 --> 00:37:32,960
So let me scroll back up to the first.

778
00:37:32,960 --> 00:37:34,020
Don't work through them backwards.

779
00:37:34,020 --> 00:37:37,000
Work through them from top down
lest they cascade and only

780
00:37:37,000 --> 00:37:38,630
one thing be wrong.

781
00:37:38,630 --> 00:37:42,532
Implicit declaration of
function GetInt().

782
00:37:42,532 --> 00:37:43,020
Yeah.

783
00:37:43,020 --> 00:37:44,420
So it wasn't enough.

784
00:37:44,420 --> 00:37:46,760
I kind of made the same mistake but
a little different this time.

785
00:37:46,760 --> 00:37:51,940
I need to not only include stdio.h but
also cs50.h, which includes the

786
00:37:51,940 --> 00:37:56,770
so-called declarations of GetInt(), which
teach the appliance, or teaches

787
00:37:56,770 --> 00:37:58,760
C what GetInt() is.

788
00:37:58,760 --> 00:37:59,550
>> So let me resave.

789
00:37:59,550 --> 00:38:02,040
I'm going to ignore the other errors
because I'm going to hope that they're

790
00:38:02,040 --> 00:38:05,210
somehow related to the error
I already fixed.

791
00:38:05,210 --> 00:38:08,710
So let me go ahead and recompile
with make positive, Enter.

792
00:38:08,710 --> 00:38:09,020
Damn.

793
00:38:09,020 --> 00:38:09,985
Three errors, still.

794
00:38:09,985 --> 00:38:12,650
Let me scroll up to the first.

795
00:38:12,650 --> 00:38:14,320
Unused variable n.

796
00:38:14,320 --> 00:38:15,850
We've not seen this before.

797
00:38:15,850 --> 00:38:17,200
And this, too, is a little cryptic.

798
00:38:17,200 --> 00:38:18,850
This is the output of the compiler.

799
00:38:18,850 --> 00:38:23,610
And what that highlighted line
there-- positive.c:9:13--

800
00:38:23,610 --> 00:38:28,960
is saying, it's saying on line nine of
positive.c, at the 13th character,

801
00:38:28,960 --> 00:38:31,510
13th column, you made this mistake.

802
00:38:31,510 --> 00:38:34,230
>> And in particular, it's telling
me unused variable n.

803
00:38:34,230 --> 00:38:35,790
So let's see--

804
00:38:35,790 --> 00:38:37,150
line nine.

805
00:38:37,150 --> 00:38:40,430
I'm using n in the sense that
I'm giving it a value.

806
00:38:40,430 --> 00:38:44,200
But what the compiler doesn't like is
that I'm not seemingly using it.

807
00:38:44,200 --> 00:38:45,560
But wait a minute, I am using it.

808
00:38:45,560 --> 00:38:48,170
In line 11, I'm using it here.

809
00:38:48,170 --> 00:38:52,430
But if I scroll down further
at positive.c:11--

810
00:38:52,430 --> 00:38:56,230
so at line 11, character 12, the
compiler's telling me, use of

811
00:38:56,230 --> 00:38:58,670
undeclared identifier n.

812
00:38:58,670 --> 00:39:02,760
>> So undeclared means I have
not specified it as a

813
00:39:02,760 --> 00:39:04,970
variable with a data type.

814
00:39:04,970 --> 00:39:05,500
But wait a minute.

815
00:39:05,500 --> 00:39:09,150
I did exactly that in line nine.

816
00:39:09,150 --> 00:39:11,100
So someone is really confused here.

817
00:39:11,100 --> 00:39:14,900
It's either me or the compiler because
in line nine, again, I'm declaring an

818
00:39:14,900 --> 00:39:18,650
int n, and I'm assigning it the
return value of GetInt().

819
00:39:18,650 --> 00:39:22,930
Then I'm using that variable n in line
11 and checking if its value is less

820
00:39:22,930 --> 00:39:24,050
than or equal to zero.

821
00:39:24,050 --> 00:39:27,430
But this apparently is
bad and broken why?

822
00:39:27,430 --> 00:39:30,630

823
00:39:30,630 --> 00:39:32,490
Say it again?

824
00:39:32,490 --> 00:39:35,690
>> Ah, I have to declare n before
entering the loop.

825
00:39:35,690 --> 00:39:36,370
But why?

826
00:39:36,370 --> 00:39:39,830
I mean, we just proposed a bit ago that
it's fine to declare variables

827
00:39:39,830 --> 00:39:43,600
all on one line and then
assign them some value.

828
00:39:43,600 --> 00:39:46,790
A global variable-- let's come back
to that idea in just a moment.

829
00:39:46,790 --> 00:39:48,690
Why do you want me to put
it outside of the loop?

830
00:39:48,690 --> 00:40:03,170

831
00:40:03,170 --> 00:40:03,830
It is.

832
00:40:03,830 --> 00:40:06,780
Exactly.

833
00:40:06,780 --> 00:40:09,610
>> So, albeit, somewhat counterintuitive,
let me summarize.

834
00:40:09,610 --> 00:40:13,510
When you declare n inside
of the do block there--

835
00:40:13,510 --> 00:40:16,320
specifically inside of
those curly braces--

836
00:40:16,320 --> 00:40:19,210
that variable n has what's
called a scope--

837
00:40:19,210 --> 00:40:23,210
unrelated to our scoring system in the
course-- but has a scope that's

838
00:40:23,210 --> 00:40:25,190
limited to those curly braces.

839
00:40:25,190 --> 00:40:28,460
In other words, typically if you declare
a variable inside a set of

840
00:40:28,460 --> 00:40:33,370
curly braces, that variable only exists
inside of those curly braces.

841
00:40:33,370 --> 00:40:37,320
So by that logic alone, even though
I've declared n in line nine, it

842
00:40:37,320 --> 00:40:41,910
essentially disappears from scope,
disappears from memory, so to speak,

843
00:40:41,910 --> 00:40:43,370
by the time I hit line 11.

844
00:40:43,370 --> 00:40:47,370
Because line 11, unfortunately, is
outside of those curly braces.

845
00:40:47,370 --> 00:40:51,540
>> So I unfortunately can't fix this by
going back to what I did before.

846
00:40:51,540 --> 00:40:53,370
You might at first do this.

847
00:40:53,370 --> 00:40:56,370
But what are you now not
doing cyclically?

848
00:40:56,370 --> 00:40:58,260
You're obviously not getting
the int cyclically.

849
00:40:58,260 --> 00:41:01,320
So we can leave the GetInt(), and we
should leave the GetInt() inside the

850
00:41:01,320 --> 00:41:04,420
loop because that's what we want to
pester the user for again and again.

851
00:41:04,420 --> 00:41:08,660
But it does suffice to go
up to line, say, six.

852
00:41:08,660 --> 00:41:10,150
Int n, semicolon.

853
00:41:10,150 --> 00:41:12,990
Don't give it a value yet because
you don't need to just yet.

854
00:41:12,990 --> 00:41:16,220
>> But now down here, notice-- this
would be a very easy mistake.

855
00:41:16,220 --> 00:41:19,440
I don't want to shadow my previous
declaration of n.

856
00:41:19,440 --> 00:41:22,830
I want to use the n that
actually exists.

857
00:41:22,830 --> 00:41:25,780
And so now in line 10,
I assign n a value.

858
00:41:25,780 --> 00:41:28,580
But in line six, I declare n.

859
00:41:28,580 --> 00:41:32,940
And so can I or can I not
use it in line 12 now?

860
00:41:32,940 --> 00:41:37,120
I can because between which curly
braces is n declared now?

861
00:41:37,120 --> 00:41:38,770
The one up here on line five.

862
00:41:38,770 --> 00:41:40,330
To the one here on line 14.

863
00:41:40,330 --> 00:41:49,770
So if I now zoom out, save this file, go
back into and run make positive, it

864
00:41:49,770 --> 00:41:50,820
compiled this time.

865
00:41:50,820 --> 00:41:51,940
So that's already progress.

866
00:41:51,940 --> 00:41:53,640
./positive, Enter.

867
00:41:53,640 --> 00:41:56,060
>> I demand that you give me
a positive integer.

868
00:41:56,060 --> 00:41:57,750
Negative 1.

869
00:41:57,750 --> 00:41:59,020
Negative 2.

870
00:41:59,020 --> 00:42:00,680
Negative 3.

871
00:42:00,680 --> 00:42:01,760
Zero.

872
00:42:01,760 --> 00:42:03,000
One.

873
00:42:03,000 --> 00:42:05,130
And thanks for the one is
what's now printed.

874
00:42:05,130 --> 00:42:07,400
>> Let me try something else,
out of curiosity.

875
00:42:07,400 --> 00:42:09,600
I'm being told to input an integer.

876
00:42:09,600 --> 00:42:12,870
But what if I instead type in lamb?

877
00:42:12,870 --> 00:42:14,460
So you now see a different prompt--

878
00:42:14,460 --> 00:42:15,350
retry.

879
00:42:15,350 --> 00:42:17,670
But nowhere in my code
did I write retry.

880
00:42:17,670 --> 00:42:22,320
So where, presumably, is this retry
prompt coming from, would you say?

881
00:42:22,320 --> 00:42:23,540
Yeah, from GetInt() itself.

882
00:42:23,540 --> 00:42:26,650
So one of the things CS50's staff does
for you, at least in these first few

883
00:42:26,650 --> 00:42:30,400
weeks, is we have written some amount
of error checking to ensure that if

884
00:42:30,400 --> 00:42:34,260
you call GetInt(), you will at least
get back an int from the user.

885
00:42:34,260 --> 00:42:35,460
You won't get a string.

886
00:42:35,460 --> 00:42:36,440
You won't get a char.

887
00:42:36,440 --> 00:42:39,660
You won't get something
else altogether.

888
00:42:39,660 --> 00:42:40,510
You'll get an int.

889
00:42:40,510 --> 00:42:41,890
>> Now, it might not be positive.

890
00:42:41,890 --> 00:42:42,770
It might not be negative.

891
00:42:42,770 --> 00:42:44,550
We make no guarantees around that.

892
00:42:44,550 --> 00:42:48,960
But we will pester the user to retry,
retry, retry until he or she actually

893
00:42:48,960 --> 00:42:49,810
cooperates.

894
00:42:49,810 --> 00:42:53,085
Similarly, if I do 1.23,
that is not an int.

895
00:42:53,085 --> 00:42:58,400
But if I do type in, say, 50, that
gives me a value that I wanted.

896
00:42:58,400 --> 00:42:59,050
>> All right.

897
00:42:59,050 --> 00:43:01,380
So not bad.

898
00:43:01,380 --> 00:43:04,780
Any questions on what we've just done?

899
00:43:04,780 --> 00:43:07,930
The key takeaway being, to be clear, not
so much the loop, which we've seen

900
00:43:07,930 --> 00:43:10,880
before even though we haven't really
used it, but the issue of scope, where

901
00:43:10,880 --> 00:43:17,045
variables can only be used
within some specified scope.

902
00:43:17,045 --> 00:43:19,830
>> All right, let me address the suggestion
you made earlier, that of a

903
00:43:19,830 --> 00:43:20,860
global variable.

904
00:43:20,860 --> 00:43:24,880
As an aside, it turns out that another
solution to this problem, but

905
00:43:24,880 --> 00:43:28,880
typically an incorrect solution or
a poorly designed solution, is to

906
00:43:28,880 --> 00:43:31,670
declare your variable as what's
called a global variable.

907
00:43:31,670 --> 00:43:34,610
Now I'm kind of violating my definition
of scope because there are

908
00:43:34,610 --> 00:43:37,680
no curly braces at the very top
and the very bottom of a file.

909
00:43:37,680 --> 00:43:40,190
But the implication of that
is that now in line four,

910
00:43:40,190 --> 00:43:41,710
n is a global variable.

911
00:43:41,710 --> 00:43:44,460
And as the name implies, it's
just accessible everywhere.

912
00:43:44,460 --> 00:43:45,790
>> Scratch actually has these.

913
00:43:45,790 --> 00:43:48,650
If you used a variable, you might recall
you had to choose if it's for

914
00:43:48,650 --> 00:43:50,780
this sprite or for all sprites.

915
00:43:50,780 --> 00:43:54,270
Well, all sprites is just the clearer
way of saying global.

916
00:43:54,270 --> 00:43:55,520
Yeah?

917
00:43:55,520 --> 00:44:09,690

918
00:44:09,690 --> 00:44:10,990
Ah, really good question.

919
00:44:10,990 --> 00:44:14,310
>> So recall that in the very first version
of my code, when I incorrectly

920
00:44:14,310 --> 00:44:17,700
declared and defined n in line nine--

921
00:44:17,700 --> 00:44:19,980
I declared it as a variable
and I gave it a value with

922
00:44:19,980 --> 00:44:21,160
the assignment operator--

923
00:44:21,160 --> 00:44:22,520
this gave me two errors.

924
00:44:22,520 --> 00:44:26,560
One, the fact that n wasn't used,
and two, that in line 11

925
00:44:26,560 --> 00:44:27,770
it just wasn't declared.

926
00:44:27,770 --> 00:44:31,120
So the first one I didn't
address at the time.

927
00:44:31,120 --> 00:44:35,130
It is not strictly an error to declare
a variable but not use it.

928
00:44:35,130 --> 00:44:38,540
But one of the things we've done in
the CS50 appliance, deliberately,

929
00:44:38,540 --> 00:44:43,340
pedagogically, is we've cranked up the
expectations of the compiler to make

930
00:44:43,340 --> 00:44:46,970
sure that you're doing things not just
correctly but really correctly.

931
00:44:46,970 --> 00:44:51,520
>> Because if you're declaring a variable
like n and never using it, or using it

932
00:44:51,520 --> 00:44:53,700
incorrectly, then what
is it doing there?

933
00:44:53,700 --> 00:44:55,650
It truly serves no purpose.

934
00:44:55,650 --> 00:44:58,980
And it's very easy over time, if you
don't configure your own computer in

935
00:44:58,980 --> 00:45:01,960
this way, to just have code that has
little remnants here, remnants there.

936
00:45:01,960 --> 00:45:04,390
And then months later you look back and
you're like, why is this line of

937
00:45:04,390 --> 00:45:05,060
code there?

938
00:45:05,060 --> 00:45:07,940
And if there's no good reason, it
doesn't benefit you or your colleagues

939
00:45:07,940 --> 00:45:10,650
down the road to have to
stumble over it then.

940
00:45:10,650 --> 00:45:12,540
>> As an aside, where is
that coming from?

941
00:45:12,540 --> 00:45:16,410
Well, recall that every time we compile
a program, all of this stuff is

942
00:45:16,410 --> 00:45:17,380
being printed.

943
00:45:17,380 --> 00:45:18,350
So we'll come back to this.

944
00:45:18,350 --> 00:45:22,230
But again, make is a utility that
automates the process of compiling by

945
00:45:22,230 --> 00:45:24,830
running the actual compiler
called clang.

946
00:45:24,830 --> 00:45:27,650
This thing, we'll eventually see, has
to do with debugging with a special

947
00:45:27,650 --> 00:45:29,060
program called the debugger.

948
00:45:29,060 --> 00:45:32,150
This has to do with optimizing the
code-- more on that in future.

949
00:45:32,150 --> 00:45:33,620
-std=c99--

950
00:45:33,620 --> 00:45:37,870
this just means use the 1999 version of
C. C's been around even longer than

951
00:45:37,870 --> 00:45:40,830
that, but they made some nice
changes 10 plus years ago.

952
00:45:40,830 --> 00:45:42,690
>> And here are the relevant ones.

953
00:45:42,690 --> 00:45:45,880
We are saying make anything that
previously would have been a warning

954
00:45:45,880 --> 00:45:48,560
an error preventing the student
from compiling.

955
00:45:48,560 --> 00:45:51,400
And -Wall means do that for a
whole bunch of things, not

956
00:45:51,400 --> 00:45:53,060
just related to variables.

957
00:45:53,060 --> 00:45:54,700
And then let me scroll to
the end of this line.

958
00:45:54,700 --> 00:45:56,430
And this, too, we'll eventually
come back to.

959
00:45:56,430 --> 00:45:59,040
This is obviously the name of
the file I'm compiling.

960
00:45:59,040 --> 00:46:02,160
This, recall, is the name of the file
I'm outputting as the name

961
00:46:02,160 --> 00:46:04,070
of my runnable program.

962
00:46:04,070 --> 00:46:08,970
This -lcs50 just means use the CS50
library, and any zeros and ones that

963
00:46:08,970 --> 00:46:12,390
the staff wrote and compiled earlier
this year, integrate

964
00:46:12,390 --> 00:46:13,490
them into my program.

965
00:46:13,490 --> 00:46:16,130
>> And anyone know what -lm is?

966
00:46:16,130 --> 00:46:18,150
It's the math library, which is
just there even if you're

967
00:46:18,150 --> 00:46:19,320
not doing any math.

968
00:46:19,320 --> 00:46:22,620
It's just automatically provided
to us by make.

969
00:46:22,620 --> 00:46:26,540
>> Well, let me do one other example
here by opening up a new file.

970
00:46:26,540 --> 00:46:30,560
And let me save this one as string.c.

971
00:46:30,560 --> 00:46:37,980
It turns out that as we talk about data
types today, there's even more

972
00:46:37,980 --> 00:46:40,630
going on underneath the hood
than we've seen thus far.

973
00:46:40,630 --> 00:46:42,290
So let me quickly do a quick program.

974
00:46:42,290 --> 00:46:44,510
Include stdio.h.

975
00:46:44,510 --> 00:46:45,730
And I'll save that.

976
00:46:45,730 --> 00:46:48,110
And you know, let me not make the
same mistake again and again.

977
00:46:48,110 --> 00:46:50,540
Include cs50.h.

978
00:46:50,540 --> 00:46:54,870
And let me go ahead now
and do int main(void).

979
00:46:54,870 --> 00:46:58,790
>> And now I simply want to do a program
that does this-- declare a string

980
00:46:58,790 --> 00:47:03,610
called s and get a string
from the user.

981
00:47:03,610 --> 00:47:05,820
And let me do a little
instructions here--

982
00:47:05,820 --> 00:47:09,960
please give me a string-- so
the user knows what to do.

983
00:47:09,960 --> 00:47:13,190
And then down here below this,
I want to do the following--

984
00:47:13,190 --> 00:47:16,060
for int i gets zero.

985
00:47:16,060 --> 00:47:18,580
Again, computer scientists typically
start counting at zero, but we could

986
00:47:18,580 --> 00:47:20,340
make that one if we really wanted.

987
00:47:20,340 --> 00:47:27,240
Now I'm going to do i is less
than the string length of s.

988
00:47:27,240 --> 00:47:28,430
So strlen--

989
00:47:28,430 --> 00:47:29,510
S-T-R-L-E-N--

990
00:47:29,510 --> 00:47:31,650
again, it's concise because it's easier
to type, even though it's a

991
00:47:31,650 --> 00:47:32,590
little cryptic.

992
00:47:32,590 --> 00:47:35,290
>> That is a function we've not used
before but literally does that--

993
00:47:35,290 --> 00:47:37,810
return to me a number that represents
the length of the string

994
00:47:37,810 --> 00:47:38,690
that the user typed.

995
00:47:38,690 --> 00:47:41,740
If they typed in hello, it would return
five because there's five

996
00:47:41,740 --> 00:47:42,890
letters in hello.

997
00:47:42,890 --> 00:47:45,390
Then, on each iteration of
this loop, i plus plus.

998
00:47:45,390 --> 00:47:49,170
So again, a standard construct even if
you're not quite too comfortable or

999
00:47:49,170 --> 00:47:50,420
familiar with it yet.

1000
00:47:50,420 --> 00:47:53,220
>> But now on each iteration of this loop,
notice what I'm going to do.

1001
00:47:53,220 --> 00:47:56,690
I want to go ahead and print
out a single character--

1002
00:47:56,690 --> 00:47:59,940
so %c backslash n on a new line.

1003
00:47:59,940 --> 00:48:00,990
And then, you know what I want to do?

1004
00:48:00,990 --> 00:48:05,090
Whatever the word is that the user types
in, like hello, I want to print

1005
00:48:05,090 --> 00:48:09,530
H-E-L-L-O, one character per line.

1006
00:48:09,530 --> 00:48:13,080
In other words, I want to get at the
individual characters in a string,

1007
00:48:13,080 --> 00:48:16,770
whereby up until now a string has just
been a sequence of characters.

1008
00:48:16,770 --> 00:48:21,690
>> And it turns out I can do s, bracket,
i, close bracket, close

1009
00:48:21,690 --> 00:48:23,580
parenthesis, semicolon.

1010
00:48:23,580 --> 00:48:25,640
And I do have to do one more thing.

1011
00:48:25,640 --> 00:48:30,570
It's in a file called string.h
that strlen is declared.

1012
00:48:30,570 --> 00:48:33,190
So if I want to use that function,
I need to tell the compiler,

1013
00:48:33,190 --> 00:48:34,450
expect to use it.

1014
00:48:34,450 --> 00:48:37,040
Now let me go ahead and make
the program called string.

1015
00:48:37,040 --> 00:48:39,150
Dot, slash, string.

1016
00:48:39,150 --> 00:48:40,130
>> Please give me a string.

1017
00:48:40,130 --> 00:48:40,900
I'll go ahead and type it.

1018
00:48:40,900 --> 00:48:43,040
Hello, in all caps, Enter.

1019
00:48:43,040 --> 00:48:47,390
And now notice I've printed this
one character after the other.

1020
00:48:47,390 --> 00:48:51,450
So the new detail here is that a string,
at the end of the day, can be

1021
00:48:51,450 --> 00:48:54,810
accessed by way of its individual
characters by introducing the square

1022
00:48:54,810 --> 00:48:55,840
bracket notation.

1023
00:48:55,840 --> 00:48:59,090
And that's because a string underneath
the hood is indeed a sequence of

1024
00:48:59,090 --> 00:48:59,810
characters.

1025
00:48:59,810 --> 00:49:02,010
But what's neat about them is
in your computer's RAM--

1026
00:49:02,010 --> 00:49:05,300
Mac, PC, whatever it is-- they're
literally back to back to back--

1027
00:49:05,300 --> 00:49:06,225
H-E-L-L-O--

1028
00:49:06,225 --> 00:49:09,920
at individual, adjacent
bytes in memory.

1029
00:49:09,920 --> 00:49:13,210
>> So if you want to get at the i-th such
byte, which in this loop would be

1030
00:49:13,210 --> 00:49:16,900
bracket zero, bracket one, bracket two,
bracket three, bracket four--

1031
00:49:16,900 --> 00:49:18,890
that's zero indexed up until five--

1032
00:49:18,890 --> 00:49:23,330
that will print out H-E-L-L-O
each on its own line.

1033
00:49:23,330 --> 00:49:26,320
>> Now, as a teaser, let me show you the
sorts of things you'll eventually be

1034
00:49:26,320 --> 00:49:31,950
able to understand, at least
with some close looking.

1035
00:49:31,950 --> 00:49:35,610
For one, what we included in today's
examples, if you'd like, is actually

1036
00:49:35,610 --> 00:49:38,300
one of the very first jailbreaks
for the iPhone.

1037
00:49:38,300 --> 00:49:40,800
Jailbreaking means cracking the phone
so you can actually use it on a

1038
00:49:40,800 --> 00:49:43,380
different carrier or install
your own software.

1039
00:49:43,380 --> 00:49:45,660
And you'll notice this looks completely
cryptic, most likely.

1040
00:49:45,660 --> 00:49:46,520
But look at this.

1041
00:49:46,520 --> 00:49:50,420
The iPhone was apparently cracked with
a for loop, an if condition, an else

1042
00:49:50,420 --> 00:49:52,580
condition, a bunch of functions
we've not seen.

1043
00:49:52,580 --> 00:49:54,230
>> And again, you won't at
first glance probably

1044
00:49:54,230 --> 00:49:55,620
understand how this is working.

1045
00:49:55,620 --> 00:49:58,940
But everything that we sort of take
for granted in our modern lives

1046
00:49:58,940 --> 00:50:02,040
actually tends to reduce even to some
of these fundamentals we've been

1047
00:50:02,040 --> 00:50:02,820
looking at.

1048
00:50:02,820 --> 00:50:06,680
Let me go ahead and open one
other program, holloway.c.

1049
00:50:06,680 --> 00:50:08,970
So this, too, is something you
shouldn't really know.

1050
00:50:08,970 --> 00:50:12,440
Even none of the staff or I could
probably figure this out by looking at

1051
00:50:12,440 --> 00:50:15,450
it because this was someone's code
that was submitted to what's

1052
00:50:15,450 --> 00:50:19,630
historically known as an obfuscated C
contest, where you write a program

1053
00:50:19,630 --> 00:50:24,670
that compiles and runs but is so damn
cryptic no human can understand what

1054
00:50:24,670 --> 00:50:27,530
it's going to do until
they actually run it.

1055
00:50:27,530 --> 00:50:29,940
>> So indeed, if you look at this
code, I see a switch.

1056
00:50:29,940 --> 00:50:30,870
I see main.

1057
00:50:30,870 --> 00:50:33,800
I see these square brackets implying
some kind of an array.

1058
00:50:33,800 --> 00:50:35,970
Does anyone want to guess what
this program actually

1059
00:50:35,970 --> 00:50:37,220
does if I run Holloway?

1060
00:50:37,220 --> 00:50:39,940

1061
00:50:39,940 --> 00:50:40,750
Yes.

1062
00:50:40,750 --> 00:50:43,050
OK.

1063
00:50:43,050 --> 00:50:44,690
Well done.

1064
00:50:44,690 --> 00:50:48,090
So only the staff and I cannot figure
out what these things do.

1065
00:50:48,090 --> 00:50:51,670
>> And now lastly, let me go ahead
and open up one other program.

1066
00:50:51,670 --> 00:50:53,440
This one--

1067
00:50:53,440 --> 00:50:55,550
again, we'll make the source code
available online-- this one's just

1068
00:50:55,550 --> 00:50:57,480
kind of pretty to look at.

1069
00:50:57,480 --> 00:50:59,750
All they did is hit the
space bar quite a bit.

1070
00:50:59,750 --> 00:51:01,320
But this is real code.

1071
00:51:01,320 --> 00:51:04,790
So if you think that's pretty, if we
actually run this at the prompt,

1072
00:51:04,790 --> 00:51:08,970
eventually you'll see how we
might do things like this.

1073
00:51:08,970 --> 00:51:14,008
>> So we'll leave you on that note
and see you on Wednesday.

1074
00:51:14,008 --> 00:51:18,440
>> [MUSIC PLAYING]

1075
00:51:18,440 --> 00:51:23,380
>> SPEAKER 2: At the next CS50,
the TFs stage a mutiny.

1076
00:51:23,380 --> 00:51:24,112
>> SPEAKER 3: There he is.

1077
00:51:24,112 --> 00:51:25,362
Get him!

1078
00:51:25,362 --> 00:51:29,912

1079
00:51:29,912 --> 00:51:32,074
>> [MUSIC PLAYING]