1
00:00:00,000 --> 00:00:03,000
[Review] [Quiz 0]

2
00:00:03,000 --> 00:00:05,000
>> [Lexi Ross, Tommy MacWilliam, Lucas Freitas, Joseph Ong] [Harvard University]

3
00:00:05,000 --> 00:00:08,000
>> [This is CS50.] [CS50.TV]

4
00:00:08,000 --> 00:00:10,000
>> Hey, everyone. 

5
00:00:10,000 --> 00:00:15,000
Welcome to the review session for Quiz 0, which is taking place this Wednesday.

6
00:00:15,000 --> 00:00:19,000
What we're going to do tonight, I'm with 3 other TFs,

7
00:00:19,000 --> 00:00:24,000
and together we're going to go through a review of what we've done in the course so far.

8
00:00:24,000 --> 00:00:27,000
It's not going to be 100% comprehensive, but it should give you a better idea

9
00:00:27,000 --> 00:00:31,000
of what you already have down and what  you still need to study before Wednesday.

10
00:00:31,000 --> 00:00:34,000
And feel free to raise your hand with questions as we're going along,

11
00:00:34,000 --> 00:00:38,000
but keep in mind that we'll also have a little bit of time at the end—

12
00:00:38,000 --> 00:00:41,000
if we get through with a few minutes to spare—to do general questions,

13
00:00:41,000 --> 00:00:47,000
so keep that in mind, and so we're going to start at the beginning with Week 0.

14
00:00:47,000 --> 00:00:50,000
>> [Quiz 0 Review!] [Part 0] [Lexi Ross] But before we do that let's talk about

15
00:00:50,000 --> 00:00:53,000
the logistics of the quiz.

16
00:00:53,000 --> 00:00:55,000
>> [Logistics] [Quiz takes place on Wednesday 10/10 in lieu of lecture] 

17
00:00:55,000 --> 00:00:57,000
>> [(See http://cdn.cs50.net/2012/fall/quizzes/0/about0.pdf for details)] It is on Wednesday, October 10th. 

18
00:00:57,000 --> 00:01:00,000
>> That's this Wednesday, and if you go to this URL here,

19
00:01:00,000 --> 00:01:03,000
which is also accessible from CS50.net—there's a link to it—

20
00:01:03,000 --> 00:01:06,000
you can see information about where to go based on 

21
00:01:06,000 --> 00:01:10,000
your last name or school affiliation as well as

22
00:01:10,000 --> 00:01:14,000
it tells about exactly what the quiz will cover and the types of questions that you're going to get.

23
00:01:14,000 --> 00:01:19,000
Keep in mind that you'll also have a chance to review for the quiz in section,

24
00:01:19,000 --> 00:01:21,000
so your TFs should be going over some practice problems,

25
00:01:21,000 --> 00:01:29,000
and that's another good chance to see where you still need to study up for the quiz.

26
00:01:29,000 --> 00:01:32,000
Let's start at the beginning with Bits 'n' Bytes.

27
00:01:32,000 --> 00:01:35,000
Remember a bit is just a 0 or a 1, 

28
00:01:35,000 --> 00:01:38,000
and a byte is a collection of 8 of those bits.

29
00:01:38,000 --> 00:01:42,000
Let's look at this collection of bits right here.

30
00:01:42,000 --> 00:01:44,000
We should be able to figure out how many bits there are.

31
00:01:44,000 --> 00:01:48,000
Where we count there's just 8 of them, eight 0 or 1 units.

32
00:01:48,000 --> 00:01:51,000
And since there's 8 bits, that's 1 byte,

33
00:01:51,000 --> 00:01:53,000
and let's convert it to hexadecimal. 

34
00:01:53,000 --> 00:01:58,000
Hexadecimal is base 16, and it's pretty easy to convert

35
00:01:58,000 --> 00:02:01,000
a number in binary, which is what that is, to a number in hexadecimal.

36
00:02:01,000 --> 00:02:04,000
All we do is we look at groups of 4, 

37
00:02:04,000 --> 00:02:07,000
and we convert them to the appropriate hexadecimal digit.

38
00:02:07,000 --> 00:02:11,000
We start with the right-most group of 4, so 0011.

39
00:02:11,000 --> 00:02:16,000
That's going to be one 1 and one 2, so together that makes 3.

40
00:02:16,000 --> 00:02:19,000
And then let's look at the other block of 4.

41
00:02:19,000 --> 00:02:24,000
1101. That's going to be one 1, one 4, and one 8.

42
00:02:24,000 --> 00:02:28,000
Together that's going to be 13, which makes D.

43
00:02:28,000 --> 00:02:32,000
And we'll remember that in hexadecimal we don't just go 0 through 9.

44
00:02:32,000 --> 00:02:36,000
We go 0 through F, so after 9, 10 corresponds to A,

45
00:02:36,000 --> 00:02:40,000
11 to B, et cetera where F is 15.

46
00:02:40,000 --> 00:02:44,000
Here 13 is a D,

47
00:02:44,000 --> 00:02:49,000
so to convert it to decimal all we do is we actually 

48
00:02:49,000 --> 00:02:52,000
treat each position as a power of 2.

49
00:02:52,000 --> 00:02:58,000
That's one 1, one 2, zero 4s, zero 8s, one 16, et cetera,

50
00:02:58,000 --> 00:03:03,000
and it's a little hard to compute in your head, but if we go to the next slide

51
00:03:03,000 --> 00:03:05,000
we can see the answer to that. 

52
00:03:05,000 --> 00:03:09,000
>> Essentially we're going across from right back to left,

53
00:03:09,000 --> 00:03:14,000
and we're multiplying each digit by the corresponding power of 2.

54
00:03:14,000 --> 00:03:19,000
And remember, for hexadecimal we denote these numbers with 0x at the beginning

55
00:03:19,000 --> 00:03:23,000
so we don't confuse it with a decimal number.

56
00:03:23,000 --> 00:03:29,000
Continuing on, this is an ASCII Table,

57
00:03:29,000 --> 00:03:35,000
and what we use ASCII for is to map from characters to numerical values.

58
00:03:35,000 --> 00:03:39,000
Remember in the cryptography pset we made extensive use of the ASCII Table

59
00:03:39,000 --> 00:03:43,000
in order to use various methods of cryptography, 

60
00:03:43,000 --> 00:03:47,000
the Caesar and the Vigenère cipher, to convert different letters

61
00:03:47,000 --> 00:03:52,000
in a string according to the key given by the user.

62
00:03:52,000 --> 00:03:56,000
Let's look at a little bit of ASCII math.

63
00:03:56,000 --> 00:04:02,000
Looking at 'P' + 1, in character form that would be Q,

64
00:04:02,000 --> 00:04:07,000
and remember that '5' ≠ 5.

65
00:04:07,000 --> 00:04:10,000
And how exactly would we convert between those 2 forms?

66
00:04:10,000 --> 00:04:13,000
It's not actually too hard.

67
00:04:13,000 --> 00:04:16,000
In order to get 5 we subtract  '0'

68
00:04:16,000 --> 00:04:20,000
because there are 5 places between the '0' and the '5.'

69
00:04:20,000 --> 00:04:23,000
In order to go the other way we just add the 0,

70
00:04:23,000 --> 00:04:25,000
so it's sort of like regular arithmetic. 

71
00:04:25,000 --> 00:04:29,000
Just remember that when something has quotes around it it's a character

72
00:04:29,000 --> 00:04:37,000
and thus corresponds to a value in the ASCII table.

73
00:04:37,000 --> 00:04:40,000
Moving into more general computer science topics.

74
00:04:40,000 --> 00:04:43,000
We learned what an algorithm is and how we use programming

75
00:04:43,000 --> 00:04:45,000
to implement algorithms.

76
00:04:45,000 --> 00:04:48,000
Some examples of algorithms are something really simple like 

77
00:04:48,000 --> 00:04:51,000
checking whether a number is even or odd.

78
00:04:51,000 --> 00:04:54,000
For that remember we mod the number by 2 and check if the result is 0.

79
00:04:54,000 --> 00:04:57,000
If so, it's even. If not, it's odd.

80
00:04:57,000 --> 00:04:59,000
And that's an example of a really basic algorithm.

81
00:04:59,000 --> 00:05:02,000
>> A little bit of a more involved one is binary search,

82
00:05:02,000 --> 00:05:05,000
which we'll go over later in the review session.

83
00:05:05,000 --> 00:05:09,000
And programming is the term we use for taking an algorithm

84
00:05:09,000 --> 00:05:15,000
and converting it to code the computer can read.

85
00:05:15,000 --> 00:05:20,000
2 examples of programming is Scratch,

86
00:05:20,000 --> 00:05:22,000
which is what we did in Week 0.

87
00:05:22,000 --> 00:05:25,000
Even though we don't actually type out the code it's a way of implementing

88
00:05:25,000 --> 00:05:29,000
this algorithm, which is printing the numbers 1-10,

89
00:05:29,000 --> 00:05:32,000
and here we do the same in the C programming language.

90
00:05:32,000 --> 00:05:41,000
These are functionally equivalent, just written in different languages or syntax.

91
00:05:41,000 --> 00:05:44,000
We then learned about boolean expressions,

92
00:05:44,000 --> 00:05:48,000
and a boolean is a value that's either true or false,

93
00:05:48,000 --> 00:05:51,000
and here oftentimes boolean expressions

94
00:05:51,000 --> 00:05:55,000
go inside of conditions, so if (x ≤ 5),

95
00:05:55,000 --> 00:06:00,000
well, we already set x = 5, so that condition is going to evaluate to true.

96
00:06:00,000 --> 00:06:03,000
And if it's true, whatever code is beneath the condition

97
00:06:03,000 --> 00:06:08,000
is going to be evaluated by the computer, so that string is going to be printed

98
00:06:08,000 --> 00:06:12,000
to standard output, and the term condition

99
00:06:12,000 --> 00:06:16,000
refers to whatever is inside the parentheses of the if statement.

100
00:06:16,000 --> 00:06:20,000
Remember all the operators.

101
00:06:20,000 --> 00:06:26,000
Remember it's && and | | when we're trying to combine 2 or more conditions,

102
00:06:26,000 --> 00:06:30,000
== not = to check whether 2 things are equal.

103
00:06:30,000 --> 00:06:36,000
Remember that = is for assignment whereas == is a boolean operator.

104
00:06:36,000 --> 00:06:41,000
≤, ≥ and then the final 2 are self-explanatory.

105
00:06:41,000 --> 00:06:45,000
A general review of boolean logic here. 

106
00:06:45,000 --> 00:06:48,000
And boolean expressions are also important in loops, 

107
00:06:48,000 --> 00:06:50,000
which we'll go over now.

108
00:06:50,000 --> 00:06:56,000
We learned about 3 types of loops so far in CS50, for, while, and do while.

109
00:06:56,000 --> 00:06:59,000
And it's important to know that while for most purposes

110
00:06:59,000 --> 00:07:02,000
we can actually use any type of loop generally

111
00:07:02,000 --> 00:07:06,000
there are certain types of purposes or common patterns

112
00:07:06,000 --> 00:07:09,000
in programming that specifically call for one of these loops

113
00:07:09,000 --> 00:07:13,000
that make it the most efficient or elegant to code it in that way.

114
00:07:13,000 --> 00:07:18,000
Let's go over what each of these loops tends to be used for most often.

115
00:07:18,000 --> 00:07:21,000
>> In a for loop we generally already know how many times we want to iterate.

116
00:07:21,000 --> 00:07:24,000
That's what we put in the condition.

117
00:07:24,000 --> 00:07:28,000
For, i = 0, i < 10, for  example.

118
00:07:28,000 --> 00:07:31,000
We already know that we want to do something 10 times.

119
00:07:31,000 --> 00:07:34,000
Now, for a while loop, generally we don't necessarily 

120
00:07:34,000 --> 00:07:36,000
know how many times we want the loop to run.

121
00:07:36,000 --> 00:07:39,000
But we do know some sort of condition that we want it to

122
00:07:39,000 --> 00:07:41,000
always be true or always be false.

123
00:07:41,000 --> 00:07:44,000
For example, while is set.

124
00:07:44,000 --> 00:07:46,000
Let's say that's a boolean variable.

125
00:07:46,000 --> 00:07:48,000
While that's true we want the code to evaluate, 

126
00:07:48,000 --> 00:07:52,000
so a little bit more extensible, a little bit more general than a for loop,

127
00:07:52,000 --> 00:07:55,000
but any for loop can also be converted to a while loop.

128
00:07:55,000 --> 00:08:00,000
Finally, do while loops, which may be the trickiest to comprehend right away,

129
00:08:00,000 --> 00:08:04,000
are often used when we want to evaluate the code first

130
00:08:04,000 --> 00:08:06,000
before the first time we check the condition.

131
00:08:06,000 --> 00:08:09,000
A common use case for a do while loop

132
00:08:09,000 --> 00:08:12,000
is when you want to get user input, and you know you want to ask the user

133
00:08:12,000 --> 00:08:15,000
for input at least once, but if they don't give you good input right away

134
00:08:15,000 --> 00:08:18,000
you want to keep asking them until they give you the good input.

135
00:08:18,000 --> 00:08:21,000
That's the most common use of a do while loop,

136
00:08:21,000 --> 00:08:23,000
and let's look at the actual structure of these loops.

137
00:08:23,000 --> 00:08:27,000
They typically always tend to follow these patterns.

138
00:08:27,000 --> 00:08:30,000
>> On the for loop inside you have 3 components:

139
00:08:30,000 --> 00:08:35,000
initialization, typically something like int i = 0 where i is the counter, 

140
00:08:35,000 --> 00:08:40,000
condition, where we want to say run this for loop as long as this condition still holds,

141
00:08:40,000 --> 00:08:44,000
like i < 10, and then finally, update, which is how we increment 

142
00:08:44,000 --> 00:08:47,000
the counter variable at each point in the loop.

143
00:08:47,000 --> 00:08:50,000
A common thing to see there is just i++,

144
00:08:50,000 --> 00:08:52,000
which means increment i by 1 every time.

145
00:08:52,000 --> 00:08:55,000
You could also do something like i+ = 2,

146
00:08:55,000 --> 00:08:58,000
which means add 2 to i each time you go through the loop.

147
00:08:58,000 --> 00:09:03,000
And then the do this just refers to any code that actually runs as part of the loop.

148
00:09:03,000 --> 00:09:09,000
And for a while loop, this time we actually have the initialization outside of the loop,

149
00:09:09,000 --> 00:09:12,000
so for example, let's say we're trying to do the same type of loop as I just described.

150
00:09:12,000 --> 00:09:16,000
We would say int i = 0 before the loop begins.

151
00:09:16,000 --> 00:09:20,000
Then we could say while i < 10 do this,

152
00:09:20,000 --> 00:09:22,000
so the same block of code as before,

153
00:09:22,000 --> 00:09:26,000
and this time the update part of the code, for example, i++,

154
00:09:26,000 --> 00:09:29,000
actually goes inside of the loop.

155
00:09:29,000 --> 00:09:33,000
And finally, for a do while, it's similar to the while loop,

156
00:09:33,000 --> 00:09:36,000
but we have to remember that the code will evaluate once

157
00:09:36,000 --> 00:09:40,000
before the condition is checked, so it makes a lot more sense

158
00:09:40,000 --> 00:09:44,000
if you look at it in order of top to bottom.

159
00:09:44,000 --> 00:09:49,000
In a do while loop the code evaluates before you even look at the while condition,

160
00:09:49,000 --> 00:09:55,000
whereas a while loop, it checks first.

161
00:09:55,000 --> 00:09:59,000
Statements and variables.

162
00:09:59,000 --> 00:10:04,000
When we want to create a new variable we first want to initialize it.

163
00:10:04,000 --> 00:10:07,000
>> For example, int bar initializes the variable bar,

164
00:10:07,000 --> 00:10:10,000
but it doesn't give it a value, so what is bar's value now?

165
00:10:10,000 --> 00:10:12,000
We don't know. 

166
00:10:12,000 --> 00:10:14,000
It could be some garbage value that was previously stored in memory there,

167
00:10:14,000 --> 00:10:16,000
and we don't want to use that variable

168
00:10:16,000 --> 00:10:19,000
until we actually give it a value, 

169
00:10:19,000 --> 00:10:21,000
so we declare it here. 

170
00:10:21,000 --> 00:10:24,000
Then we initialize it to be 42 below.

171
00:10:24,000 --> 00:10:28,000
Now, of course, we know this can be done on one line, int bar = 42.

172
00:10:28,000 --> 00:10:30,000
But just to be clear the multiple steps that are going on,

173
00:10:30,000 --> 00:10:34,000
the declaration and the initialization are happening separately here.

174
00:10:34,000 --> 00:10:38,000
It happens on one step, and the next one, int baz = bar + 1,

175
00:10:38,000 --> 00:10:44,000
this statement below, that increments baz, so at the end of this code block

176
00:10:44,000 --> 00:10:48,000
if we were to print the value of baz it would be 44

177
00:10:48,000 --> 00:10:52,000
because we declare and initialize it to be 1 > bar, 

178
00:10:52,000 --> 00:10:58,000
and then we increment it once more with the ++.

179
00:10:58,000 --> 00:11:02,000
We went over this pretty briefly, but it's good to have a general 

180
00:11:02,000 --> 00:11:04,000
understanding of what threads and events are.

181
00:11:04,000 --> 00:11:06,000
We mainly did this in Scratch,

182
00:11:06,000 --> 00:11:09,000
so you can think of threads as multiple sequences of code

183
00:11:09,000 --> 00:11:11,000
running at the same time.

184
00:11:11,000 --> 00:11:14,000
In actuality, it probably isn't running at the same time,

185
00:11:14,000 --> 00:11:17,000
but sort of abstractly we can think of it in that way.

186
00:11:17,000 --> 00:11:20,000
>> In Scratch, for example, we had the multiple sprites.

187
00:11:20,000 --> 00:11:22,000
It could be executing different code at the same time.

188
00:11:22,000 --> 00:11:26,000
One could be walking while the other is saying something 

189
00:11:26,000 --> 00:11:29,000
in a different part of the screen.

190
00:11:29,000 --> 00:11:34,000
Events are another way of separating out the logic

191
00:11:34,000 --> 00:11:37,000
between different elements of your code,

192
00:11:37,000 --> 00:11:40,000
and in Scratch we were able to simulate events using the Broadcast,

193
00:11:40,000 --> 00:11:43,000
and that's actually When I Receive, not When I Hear,

194
00:11:43,000 --> 00:11:47,000
but essentially it's a way to transmit information

195
00:11:47,000 --> 00:11:49,000
from one sprite to another.

196
00:11:49,000 --> 00:11:52,000
For example, you may want to transmit game over,

197
00:11:52,000 --> 00:11:56,000
and when another sprite receives game over,

198
00:11:56,000 --> 00:11:58,000
it responds in a certain way.

199
00:11:58,000 --> 00:12:03,000
It's an important model to understand for programming.

200
00:12:03,000 --> 00:12:07,000
Just to go over the basic Week 0, what we've gone over so far,

201
00:12:07,000 --> 00:12:10,000
let's look at this simple C program.

202
00:12:10,000 --> 00:12:14,000
The text may be a little bit small from here, but I'll go over it really quick.

203
00:12:14,000 --> 00:12:20,000
We're including 2 header files at the top, cs50.h and stdio.h.

204
00:12:20,000 --> 00:12:23,000
We're then defining a constant called limit to be 100.

205
00:12:23,000 --> 00:12:26,000
We're then implementing our main function.

206
00:12:26,000 --> 00:12:29,000
Since we don't use command line arguments here we need to put void

207
00:12:29,000 --> 00:12:32,000
as the arguments for main.

208
00:12:32,000 --> 00:12:38,000
We see int above main. That's the return type, hence return 0 at the bottom.

209
00:12:38,000 --> 00:12:41,000
And we're using CS50 library function get int

210
00:12:41,000 --> 00:12:45,000
to ask the user for input, and we store it in this variable x,

211
00:12:45,000 --> 00:12:51,000
so we declare x above, and we initialize it with x = GetInt.

212
00:12:51,000 --> 00:12:53,000
>> We then check to see if the user gave us good input.

213
00:12:53,000 --> 00:12:59,000
If it's ≥ LIMIT we want to return an error code of 1 and print an error message.

214
00:12:59,000 --> 00:13:02,000
And finally, if the user has given us good input

215
00:13:02,000 --> 00:13:08,000
we're going to square the number and print out that result.

216
00:13:08,000 --> 00:13:11,000
Just to make sure that those all hit home

217
00:13:11,000 --> 00:13:17,000
you can see the labels of different parts of the code here.

218
00:13:17,000 --> 00:13:19,000
I mentioned constant, header files.

219
00:13:19,000 --> 00:13:21,000
Oh, int x. Make sure to remember that's a local variable.

220
00:13:21,000 --> 00:13:24,000
That contrasts it from a global variable, which we'll talk about 

221
00:13:24,000 --> 00:13:27,000
a little bit later in the review session,

222
00:13:27,000 --> 00:13:30,000
and we are calling the library function printf, 

223
00:13:30,000 --> 00:13:34,000
so if we hadn't included the stdio.h header file

224
00:13:34,000 --> 00:13:37,000
we would not be able to call printf.

225
00:13:37,000 --> 00:13:42,000
And I believe the arrow that got cut off here is pointing to the %d,

226
00:13:42,000 --> 00:13:45,000
which is a formatting string in printf.

227
00:13:45,000 --> 00:13:52,000
It says print out this variable as a number, %d.

228
00:13:52,000 --> 00:13:58,000
And that is it for Week 0.

229
00:13:58,000 --> 00:14:06,000
Now Lucas is going to continue.

230
00:14:06,000 --> 00:14:08,000
Hey, guys. My name is Lucas.

231
00:14:08,000 --> 00:14:10,000
I'm a sophomore in the best house on campus, Mather, 

232
00:14:10,000 --> 00:14:14,000
and I'm going to talk a little bit about Week 1 and 2.1.

233
00:14:14,000 --> 00:14:16,000
[Week 1 and 2.1!] [Lucas Freitas]

234
00:14:16,000 --> 00:14:19,000
As Lexi was saying, when we started translating your code from Scratch to C

235
00:14:19,000 --> 00:14:23,000
one of the things that we noticed is that you cannot just

236
00:14:23,000 --> 00:14:26,000
write your code and run it using a green flag anymore.

237
00:14:26,000 --> 00:14:30,000
Actually, you have to use some steps to make your C program

238
00:14:30,000 --> 00:14:33,000
become an executable file.

239
00:14:33,000 --> 00:14:36,000
Basically what you do when you're writing a program is that

240
00:14:36,000 --> 00:14:40,000
you translate your idea into a language that a compiler can understand,

241
00:14:40,000 --> 00:14:44,000
so when you're writing a program in C

242
00:14:44,000 --> 00:14:47,000
what you're doing is actually writing something that your compiler is going to understand,

243
00:14:47,000 --> 00:14:50,000
and then the compiler is going to translate that code

244
00:14:50,000 --> 00:14:53,000
into something that your computer will understand.

245
00:14:53,000 --> 00:14:55,000
>> And the thing is, your computer is actually very dumb.

246
00:14:55,000 --> 00:14:57,000
Your computer can only understand 0s and 1s,

247
00:14:57,000 --> 00:15:01,000
so actually in the first computers people usually programmed 

248
00:15:01,000 --> 00:15:04,000
using 0s and 1s, but not anymore, thank God.

249
00:15:04,000 --> 00:15:07,000
We don't have to memorize the sequences for 0s and 1s 

250
00:15:07,000 --> 00:15:10,000
for a for loop or for a while loop and so on.

251
00:15:10,000 --> 00:15:13,000
That's why we have a compiler.

252
00:15:13,000 --> 00:15:17,000
What a compiler does is it basically translates the C code, 

253
00:15:17,000 --> 00:15:21,000
in our case, to a language that your computer will understand,

254
00:15:21,000 --> 00:15:25,000
which is the object code, and the compiler that we're using

255
00:15:25,000 --> 00:15:30,000
is called clang, so this is actually the symbol for clang.

256
00:15:30,000 --> 00:15:33,000
When you have your program, you have to do 2 things.

257
00:15:33,000 --> 00:15:37,000
First, you have to compile your program, and then you're going to run your program.

258
00:15:37,000 --> 00:15:41,000
To compile your program you have a lot of options to do so.

259
00:15:41,000 --> 00:15:44,000
The first one is to do clang program.c

260
00:15:44,000 --> 00:15:47,000
in which program is the name of your program.

261
00:15:47,000 --> 00:15:51,000
In this case you can see they're just saying "Hey, compile my program."

262
00:15:51,000 --> 00:15:56,000
You're not saying "I want this name for my program" or anything.

263
00:15:56,000 --> 00:15:58,000
>> The second option is giving a name to your program.

264
00:15:58,000 --> 00:16:02,000
You can say clang -o and then the name that you want 

265
00:16:02,000 --> 00:16:06,000
the executable file to be named as and then program.c.

266
00:16:06,000 --> 00:16:11,000
And you can also do make program, and see how in the first 2 cases

267
00:16:11,000 --> 00:16:15,000
I put .c, and in the third one I only have programs?

268
00:16:15,000 --> 00:16:18,000
Yeah, you actually should not put .c when you use make.

269
00:16:18,000 --> 00:16:22,000
Otherwise the compiler is actually going to yell at you.

270
00:16:22,000 --> 00:16:24,000
And also, I don't know if you guys remember,

271
00:16:24,000 --> 00:16:29,000
but a lot of times we also used -lcs50 or -lm.

272
00:16:29,000 --> 00:16:31,000
That is called linking.

273
00:16:31,000 --> 00:16:35,000
It just tells the compiler that you will use those libraries right there,

274
00:16:35,000 --> 00:16:39,000
so if you want to use cs50.h you actually have to type

275
00:16:39,000 --> 00:16:43,000
clang program.c -lcs50.

276
00:16:43,000 --> 00:16:45,000
If you don't do that, the compiler is not going to know 

277
00:16:45,000 --> 00:16:50,000
that you're using those functions in cs50.h.

278
00:16:50,000 --> 00:16:52,000
And when you want to run your program you have 2 options.

279
00:16:52,000 --> 00:16:57,000
If you did clang program.c you didn't give a name to your program.

280
00:16:57,000 --> 00:17:01,000
You have to run it using ./a.out.

281
00:17:01,000 --> 00:17:06,000
A.out is a standard name that clang gives your program if you don't give it a name.

282
00:17:06,000 --> 00:17:11,000
Otherwise you're going to do ./program if you gave a name to your program,

283
00:17:11,000 --> 00:17:15,000
and also if you did make program the name that a program is going to get

284
00:17:15,000 --> 00:17:23,000
is already going to be programmed the same name as the c file.

285
00:17:23,000 --> 00:17:26,000
Then we talked about data types and data. 

286
00:17:26,000 --> 00:17:31,000
>> Basically data types are the same thing as little boxes they use

287
00:17:31,000 --> 00:17:35,000
to store values, so data types are actually just like Pokémons.

288
00:17:35,000 --> 00:17:39,000
They come in all sizes and types.

289
00:17:39,000 --> 00:17:43,000
I don't know if that analogy makes sense.

290
00:17:43,000 --> 00:17:46,000
The data size actually depends on the machine architecture.

291
00:17:46,000 --> 00:17:49,000
All the data sizes that I'm going to show here 

292
00:17:49,000 --> 00:17:53,000
are actually for a 32-bit machine, which is the case of our appliance,

293
00:17:53,000 --> 00:17:56,000
but if you are actually coding your Mac or in a Windows also

294
00:17:56,000 --> 00:17:59,000
probably you're going to have a 64-bit machine,

295
00:17:59,000 --> 00:18:03,000
so remember that the data sizes that I'm going to show here

296
00:18:03,000 --> 00:18:06,000
are for the 32-bit machine.

297
00:18:06,000 --> 00:18:08,000
The first one that we saw was an int, 

298
00:18:08,000 --> 00:18:10,000
which is pretty straightforward.

299
00:18:10,000 --> 00:18:13,000
You use int to store an integer.

300
00:18:13,000 --> 00:18:16,000
We also saw the character, the char.

301
00:18:16,000 --> 00:18:20,000
If you want to use a letter or a little symbol you're probably going to use a char.

302
00:18:20,000 --> 00:18:26,000
A char has 1 byte, which means 8 bits, like Lexi said.

303
00:18:26,000 --> 00:18:31,000
Basically we have an ASCII Table that has 256 

304
00:18:31,000 --> 00:18:34,000
possible combinations of 0s and 1s,

305
00:18:34,000 --> 00:18:37,000
and then when you type a char it's going to translate

306
00:18:37,000 --> 00:18:44,000
the character that inputs you a number that you have in the ASCII table, like Lexi said.

307
00:18:44,000 --> 00:18:48,000
We also have the float, which we use to store decimal numbers.

308
00:18:48,000 --> 00:18:53,000
If you want to choose 3.14, for example, you're going to use a float

309
00:18:53,000 --> 00:18:55,000
or a double that has more precision.

310
00:18:55,000 --> 00:18:57,000
A float has 4 bytes.

311
00:18:57,000 --> 00:19:01,000
A double has 8 bytes, so the only difference is the precision.

312
00:19:01,000 --> 00:19:04,000
We also have a long that is used for integers,

313
00:19:04,000 --> 00:19:09,000
and you can see for a 32-bit machine an int and a long have the same size,

314
00:19:09,000 --> 00:19:13,000
so it doesn't really make sense to use a long in a 32-bit machine.

315
00:19:13,000 --> 00:19:17,000
>> But if you're using a Mac and 64-bit machine, actually a long has size 8,

316
00:19:17,000 --> 00:19:19,000
so it really depends on the architecture.

317
00:19:19,000 --> 00:19:22,000
For the 32-bit machine it doesn't make sense to use a long really.

318
00:19:22,000 --> 00:19:25,000
And then a long long, on the other hand, has 8 bytes,

319
00:19:25,000 --> 00:19:30,000
so it is very good if you want to have a longer integer.

320
00:19:30,000 --> 00:19:34,000
And finally, we have string, which is actually a char *, 

321
00:19:34,000 --> 00:19:37,000
which is a pointer to a char.

322
00:19:37,000 --> 00:19:40,000
It's very easy to think that the size of the string is going to be like 

323
00:19:40,000 --> 00:19:42,000
the number of characters that you have there,

324
00:19:42,000 --> 00:19:45,000
but actually the char * itself

325
00:19:45,000 --> 00:19:49,000
has the size of a pointer to a char, which is 4 bytes.

326
00:19:49,000 --> 00:19:52,000
The size of a char * is 4 bytes.

327
00:19:52,000 --> 00:19:56,000
It doesn't matter if you have a small word or a letter or anything.

328
00:19:56,000 --> 00:19:58,000
It's going to be 4 bytes.

329
00:19:58,000 --> 00:20:01,000
We also learned a little bit about casting,

330
00:20:01,000 --> 00:20:04,000
so as you can see, if you have, for example, a program that says

331
00:20:04,000 --> 00:20:08,000
int x = 3 and then printf("%d", x/2)

332
00:20:08,000 --> 00:20:12,000
do you guys know what it's going to print on screen?

333
00:20:12,000 --> 00:20:14,000
>> Someone?>>[Students] 2.

334
00:20:14,000 --> 00:20:16,000
1.>>1, yeah.

335
00:20:16,000 --> 00:20:20,000
When you do 3/2 it's going to get 1.5,

336
00:20:20,000 --> 00:20:24,000
but since we're using an integer it's going to ignore the decimal part,

337
00:20:24,000 --> 00:20:26,000
and you're going to have 1.

338
00:20:26,000 --> 00:20:29,000
If you don't want that to happen what you can do, for example,

339
00:20:29,000 --> 00:20:33,000
is declare a float y = x.

340
00:20:33,000 --> 00:20:40,000
Then x that used to be 3 is now going to be 3.000 in y.

341
00:20:40,000 --> 00:20:44,000
And then you can print the y/2. 

342
00:20:44,000 --> 00:20:50,000
Actually, I should have a 2. over there.

343
00:20:50,000 --> 00:20:55,000
It's going to do 3.00/2.00,

344
00:20:55,000 --> 00:20:58,000
and you're going to get 1.5.

345
00:20:58,000 --> 00:21:06,000
And we have this .2f just to ask for 2 decimal units in the decimal part.

346
00:21:06,000 --> 00:21:12,000
If you have .3f it's going to have actually 1.500.

347
00:21:12,000 --> 00:21:16,000
If it's 2 it's going to be 1.50.

348
00:21:16,000 --> 00:21:18,000
We also have this case here.

349
00:21:18,000 --> 00:21:22,000
If you do float x = 3.14 and then you printf x

350
00:21:22,000 --> 00:21:24,000
you're going to get 3.14.

351
00:21:24,000 --> 00:21:29,000
And if you do x = the int of x,

352
00:21:29,000 --> 00:21:34,000
which means treat x as an int and you print x now

353
00:21:34,000 --> 00:21:36,000
you're going to have 3.00.

354
00:21:36,000 --> 00:21:38,000
Does that make sense?

355
00:21:38,000 --> 00:21:41,000
Because you're first treating x as an integer, so you're ignoring the decimal part,

356
00:21:41,000 --> 00:21:45,000
and then you're printing x.

357
00:21:45,000 --> 00:21:47,000
And finally, you can also do this,

358
00:21:47,000 --> 00:21:52,000
int x = 65, and then you declare a char c = x,

359
00:21:52,000 --> 00:21:56,000
and then if you print the c you're actually going to get 

360
00:21:56,000 --> 00:21:59,000
A, so basically what you're doing here 

361
00:21:59,000 --> 00:22:02,000
is translating the integer into the character,

362
00:22:02,000 --> 00:22:05,000
just like the ASCII Table does.

363
00:22:05,000 --> 00:22:08,000
We also talked about math operators.

364
00:22:08,000 --> 00:22:14,000
Most of them are pretty straightforward, so +, -, *, /, 

365
00:22:14,000 --> 00:22:20,000
and also we talked about mod, which is the remainder of a division of 2 numbers.

366
00:22:20,000 --> 00:22:23,000
If you have 10 % 3, for example,

367
00:22:23,000 --> 00:22:27,000
it means divide 10 by 3, and what is the remainder?

368
00:22:27,000 --> 00:22:30,000
It's going to be 1, so it's actually very useful for a lot of the programs.

369
00:22:30,000 --> 00:22:38,000
For Vigenère and Caesar I'm pretty sure that all of you guys used mod.

370
00:22:38,000 --> 00:22:43,000
About math operators, be very careful when combining * and /.

371
00:22:43,000 --> 00:22:48,000
>> For example, if you do (3/2) * 2 what are you going to get?

372
00:22:48,000 --> 00:22:50,000
[Students] 2.

373
00:22:50,000 --> 00:22:54,000
Yeah, 2, because 3/2 is going to be 1.5,

374
00:22:54,000 --> 00:22:57,000
but since you're doing operations between 2 integers

375
00:22:57,000 --> 00:22:59,000
you're actually just going to consider 1, 

376
00:22:59,000 --> 00:23:03,000
and then 1 * 2 is going to be 2, so be very, very careful

377
00:23:03,000 --> 00:23:07,000
when doing arithmetic with integers because 

378
00:23:07,000 --> 00:23:12,000
you might get that 2 = 3, in that case.

379
00:23:12,000 --> 00:23:14,000
And also be very careful about precedence.

380
00:23:14,000 --> 00:23:21,000
You should usually use parentheses to be sure that you know what you're doing.

381
00:23:21,000 --> 00:23:27,000
Some useful shortcuts, of course, one is i++ or i += 1

382
00:23:27,000 --> 00:23:30,000
or using +=.

383
00:23:30,000 --> 00:23:34,000
That is the same thing as doing i = i + 1.

384
00:23:34,000 --> 00:23:39,000
You can also do i-- or i -= 1,

385
00:23:39,000 --> 00:23:42,000
which is the same thing as i = i -1,

386
00:23:42,000 --> 00:23:46,000
something you guys use a lot in for loops, at least.

387
00:23:46,000 --> 00:23:52,000
Also, for *, if you use *= and if you do, for example, 

388
00:23:52,000 --> 00:23:57,000
i *= 2 is the same thing as saying i = i * 2,

389
00:23:57,000 --> 00:23:59,000
and the same thing for division.

390
00:23:59,000 --> 00:24:08,000
If you do i /= 2 it's the same thing as i = i/2.

391
00:24:08,000 --> 00:24:10,000
>> Now about functions.

392
00:24:10,000 --> 00:24:13,000
You guys learned that functions are a very good strategy to save code

393
00:24:13,000 --> 00:24:16,000
while you're programming, so if you want to perform the same task

394
00:24:16,000 --> 00:24:20,000
in code again and again, probably you want to use a function

395
00:24:20,000 --> 00:24:25,000
just so you don't have to copy and paste the code over and over again.

396
00:24:25,000 --> 00:24:28,000
Actually, main is a function, and when I show you the format of a function

397
00:24:28,000 --> 00:24:32,000
you're going to see that that is pretty obvious.

398
00:24:32,000 --> 00:24:35,000
We also use functions from some libraries,

399
00:24:35,000 --> 00:24:39,000
for example, printf, GetIn, which is from the CS50 library,

400
00:24:39,000 --> 00:24:43,000
and other functions like toupper.

401
00:24:43,000 --> 00:24:46,000
All of those functions are actually implemented in other libraries,

402
00:24:46,000 --> 00:24:49,000
and when you put those tether files in the beginning of your program

403
00:24:49,000 --> 00:24:53,000
you're saying can you please give me the code for those functions

404
00:24:53,000 --> 00:24:57,000
so I don't have to implement them by myself?

405
00:24:57,000 --> 00:25:00,000
And you can also write your own functions, so when you start programming

406
00:25:00,000 --> 00:25:04,000
you realize that libraries don't have all the functions that you need.

407
00:25:04,000 --> 00:25:10,000
For the last pset, for example, we wrote draw, scramble, and lookup,

408
00:25:10,000 --> 00:25:13,000
and it's very, very important to be able to write functions 

409
00:25:13,000 --> 00:25:17,000
because they are useful, and we use them all the time in programming,

410
00:25:17,000 --> 00:25:19,000
and it saves a lot of code.

411
00:25:19,000 --> 00:25:21,000
The format of a function is this one.

412
00:25:21,000 --> 00:25:24,000
We have return type in the beginning. What is the return type?

413
00:25:24,000 --> 00:25:27,000
It's just when your function is going to return.

414
00:25:27,000 --> 00:25:29,000
If you have a function, for example, factorial, 

415
00:25:29,000 --> 00:25:31,000
that is going to calculate a factorial of an integer,

416
00:25:31,000 --> 00:25:34,000
probably it's going to return an integer also.

417
00:25:34,000 --> 00:25:37,000
Then the return type is going to be int.

418
00:25:37,000 --> 00:25:41,000
Printf actually has a return type void

419
00:25:41,000 --> 00:25:43,000
because you're not returning anything.

420
00:25:43,000 --> 00:25:45,000
You're just printing things to the screen

421
00:25:45,000 --> 00:25:48,000
and quitting the function afterwards.

422
00:25:48,000 --> 00:25:51,000
Then you have the name of the function that you can choose.

423
00:25:51,000 --> 00:25:55,000
You should be a little reasonable, like don't choose a name like xyz

424
00:25:55,000 --> 00:25:58,000
or like x2f.

425
00:25:58,000 --> 00:26:02,000
Try to make up a name that makes sense.

426
00:26:02,000 --> 00:26:04,000
>> For example, if it's factorial, say factorial.

427
00:26:04,000 --> 00:26:08,000
If it's a function that is going to draw something, name it draw.

428
00:26:08,000 --> 00:26:11,000
And then we have the parameters, which are also called arguments,

429
00:26:11,000 --> 00:26:14,000
which are like the resources that your function needs

430
00:26:14,000 --> 00:26:17,000
from your code to perform its task.

431
00:26:17,000 --> 00:26:20,000
If you want to calculate the factorial of a number

432
00:26:20,000 --> 00:26:23,000
probably you need to have a number to calculate a factorial.

433
00:26:23,000 --> 00:26:27,000
One of the arguments that you're going to have is the number itself.

434
00:26:27,000 --> 00:26:31,000
And then it's going to do something and return the value at the end

435
00:26:31,000 --> 00:26:35,000
unless it's a void function.

436
00:26:35,000 --> 00:26:37,000
Let's see an example. 

437
00:26:37,000 --> 00:26:40,000
If I want to write a function that sums all the numbers in an array of integers,

438
00:26:40,000 --> 00:26:43,000
first of all, the return type is going to be int

439
00:26:43,000 --> 00:26:46,000
because I have an array of integers.

440
00:26:46,000 --> 00:26:51,000
And then I'm going to have the function name like sumArray,

441
00:26:51,000 --> 00:26:54,000
and then it's going to take the array itself, to int nums,

442
00:26:54,000 --> 00:26:58,000
and then the length of the array so I know how many numbers I have to sum.

443
00:26:58,000 --> 00:27:02,000
Then I have to initialize a variable called sum, for example, to 0,

444
00:27:02,000 --> 00:27:08,000
and every time I see an element in the array I should add it to sum, so I did a for loop.

445
00:27:08,000 --> 00:27:15,000
Just like Lexi said, you do int i = 0, i < length and i++.

446
00:27:15,000 --> 00:27:20,000
And for every element in the array I did sum += nums[i], 

447
00:27:20,000 --> 00:27:24,000
and then I returned the sum, so it's very simple, and it saves a lot of code

448
00:27:24,000 --> 00:27:28,000
if you're using this function a lot of times.

449
00:27:28,000 --> 00:27:32,000
Then we took a look at conditions.

450
00:27:32,000 --> 00:27:38,000
We have if, else, and else if.

451
00:27:38,000 --> 00:27:42,000
Let's see what is the difference between those.

452
00:27:42,000 --> 00:27:45,000
Take a look at these 2 codes. What is the difference between them?

453
00:27:45,000 --> 00:27:49,000
The first one has—basically the codes want you to tell

454
00:27:49,000 --> 00:27:51,000
if a number is +, -, or 0.

455
00:27:51,000 --> 00:27:55,000
The first one says if it's > 0 then it's positive.

456
00:27:55,000 --> 00:28:00,000
If it's = to 0 then it's 0, and if it's < 0 then it's negative.

457
00:28:00,000 --> 00:28:04,000
>> And the other one is doing if, else if, else.

458
00:28:04,000 --> 00:28:07,000
The difference between the two is that this one is actually going to

459
00:28:07,000 --> 00:28:13,000
check if > 0, < 0 or = 0 three times,

460
00:28:13,000 --> 00:28:17,000
so if you have the number 2, for example, it's going to come here and say

461
00:28:17,000 --> 00:28:21,000
if (x > 0), and it's going to say yes, so I print positive.

462
00:28:21,000 --> 00:28:25,000
But even though I know that it's > 0 and it's not going to be 0 or < 0

463
00:28:25,000 --> 00:28:29,000
I'm still going to do is it 0, is it < 0, 

464
00:28:29,000 --> 00:28:33,000
so I'm actually going inside of ifs that I didn't have to

465
00:28:33,000 --> 00:28:38,000
because I already know that it's not going to satisfy any of these conditions.

466
00:28:38,000 --> 00:28:41,000
I can use the if, else if, else statement.

467
00:28:41,000 --> 00:28:45,000
It basically says if x = 0 I print the positive.

468
00:28:45,000 --> 00:28:48,000
If it's not, I'm going to also test this.

469
00:28:48,000 --> 00:28:51,000
If it's 2 not I'm going to do this.

470
00:28:51,000 --> 00:28:54,000
Basically if I had x = 2 you would say 

471
00:28:54,000 --> 00:28:57,000
if (x > 0), yes, so print this.

472
00:28:57,000 --> 00:29:00,000
Now that I know that it's > 0 and that it satisfied the first if

473
00:29:00,000 --> 00:29:02,000
I'm not even going to run this code.

474
00:29:02,000 --> 00:29:09,000
The code runs faster, actually, 3 times faster if you use this.

475
00:29:09,000 --> 00:29:11,000
We also learned about and and or.

476
00:29:11,000 --> 00:29:15,000
I'm not going to go through this because Lexi already talked about them.

477
00:29:15,000 --> 00:29:17,000
It's just the && and | | operator.

478
00:29:17,000 --> 00:29:21,000
>> The only thing I'll say is be careful when you have 3 conditions.

479
00:29:21,000 --> 00:29:24,000
Use parentheses because it's very confusing when you have a condition

480
00:29:24,000 --> 00:29:27,000
and another one or another one.

481
00:29:27,000 --> 00:29:30,000
Use parentheses just to be sure that your conditions make sense

482
00:29:30,000 --> 00:29:34,000
because in that case, for example, you can imagine that

483
00:29:34,000 --> 00:29:38,000
it could be the first condition and one or the other

484
00:29:38,000 --> 00:29:41,000
or the 2 conditions combined in an and

485
00:29:41,000 --> 00:29:45,000
or the third one, so just be careful.

486
00:29:45,000 --> 00:29:48,000
And finally, we talked about switches.

487
00:29:48,000 --> 00:29:53,000
A switch is very useful when you have a variable.

488
00:29:53,000 --> 00:29:55,000
Let's say that you have a variable like n

489
00:29:55,000 --> 00:29:59,000
that can be 0, 1, or 2, and for each of those cases

490
00:29:59,000 --> 00:30:01,000
you're going to perform a task.

491
00:30:01,000 --> 00:30:04,000
You can say switch the variable, and it indicates that

492
00:30:04,000 --> 00:30:08,000
the value then is like value1 I'm going to do this,

493
00:30:08,000 --> 00:30:12,000
and then I break, which means I'm not going to look at any of the other cases

494
00:30:12,000 --> 00:30:15,000
because we already satisfied that case

495
00:30:15,000 --> 00:30:20,000
and then value2 and so on, and I also can have a default switch.

496
00:30:20,000 --> 00:30:24,000
That means if it doesn't satisfy any of the cases that I had

497
00:30:24,000 --> 00:30:29,000
that I'm going to do something else, but that's optional.

498
00:30:29,000 --> 00:30:36,000
That's all for me. Now let's have Tommy.

499
00:30:36,000 --> 00:30:41,000
All right, this is going to be Week 3-ish.

500
00:30:41,000 --> 00:30:45,000
These are some of the topics we'll be covering, crypto, scope, arrays, et cetera.

501
00:30:45,000 --> 00:30:49,000
Just a quick word on crypto. We're not going to hammer this home.

502
00:30:49,000 --> 00:30:52,000
>> We did this in pset 2, but for the quiz make sure you know the difference

503
00:30:52,000 --> 00:30:54,000
between the Caesar cipher and the Vigenère cipher,

504
00:30:54,000 --> 00:30:57,000
how both of those ciphers work and what it's like to encrypt

505
00:30:57,000 --> 00:30:59,000
and decrypt text using those 2 ciphers.

506
00:30:59,000 --> 00:31:03,000
Remember, the Caesar cipher simply rotates each character by the same amount,

507
00:31:03,000 --> 00:31:06,000
making sure you mod by the number of letters in the alphabet.

508
00:31:06,000 --> 00:31:09,000
And the Vigenère cipher, on the other hand, rotates each character

509
00:31:09,000 --> 00:31:12,000
by a different amount, so rather than saying

510
00:31:12,000 --> 00:31:15,000
every character rotated by 3 Vigenère will rotate each character

511
00:31:15,000 --> 00:31:17,000
by a different amount depending on some keyword 

512
00:31:17,000 --> 00:31:20,000
where each letter in the keyword represents some different amount

513
00:31:20,000 --> 00:31:26,000
to rotate the clear text by.

514
00:31:26,000 --> 00:31:28,000
Let's first talk about variable scope.

515
00:31:28,000 --> 00:31:30,000
There are 2 different types of variables.

516
00:31:30,000 --> 00:31:33,000
We have local variables, and these are going to be defined

517
00:31:33,000 --> 00:31:36,000
outside of main or outside any function or block,

518
00:31:36,000 --> 00:31:39,000
and these will be accessible anywhere in your program.

519
00:31:39,000 --> 00:31:41,000
If you have a function and in that function is a while loop

520
00:31:41,000 --> 00:31:44,000
the big global variable is accessible everywhere.

521
00:31:44,000 --> 00:31:48,000
A local variable, on the other hand, is scoped to the place where it is defined. 

522
00:31:48,000 --> 00:31:53,000
>> If you have a function here, for example, we have this function g, 

523
00:31:53,000 --> 00:31:56,000
and inside of g there is a variable here called y,

524
00:31:56,000 --> 00:31:58,000
and that means that this is a local variable.

525
00:31:58,000 --> 00:32:00,000
Even though this variable is called y

526
00:32:00,000 --> 00:32:03,000
and this variable is called y these 2 functions

527
00:32:03,000 --> 00:32:06,000
have no idea what each other's local variables are.

528
00:32:06,000 --> 00:32:10,000
On the other hand, up here we say int x = 5,

529
00:32:10,000 --> 00:32:12,000
and this is outside the scope of any function.

530
00:32:12,000 --> 00:32:16,000
It's outside the scope of main, so this is a global variable.

531
00:32:16,000 --> 00:32:20,000
That means that inside of these 2 functions when I say x-- or x++

532
00:32:20,000 --> 00:32:26,000
I'm accessing the same x whereby this y and this y are different variables.

533
00:32:26,000 --> 00:32:30,000
That's the difference between a global variable and a local variable.

534
00:32:30,000 --> 00:32:33,000
As far as design is concerned, sometimes it's probably a better idea

535
00:32:33,000 --> 00:32:37,000
to keep variables local whenever you possibly can

536
00:32:37,000 --> 00:32:39,000
since having a bunch of global variables can get really confusing.

537
00:32:39,000 --> 00:32:42,000
If you have a bunch of functions all modifying the same thing

538
00:32:42,000 --> 00:32:45,000
you might forget what if this function accidentally modifies this global,

539
00:32:45,000 --> 00:32:47,000
and this other function doesn't know about it, 

540
00:32:47,000 --> 00:32:50,000
and it does get pretty confusing as  you get more code.

541
00:32:50,000 --> 00:32:53,000
Keeping variables local whenever you possibly can

542
00:32:53,000 --> 00:32:56,000
is just good design.

543
00:32:56,000 --> 00:33:00,000
Arrays, remember, are simply lists of elements of the same type.

544
00:33:00,000 --> 00:33:04,000
Inside of C I can't have a list like 1, 2.0, hello.

545
00:33:04,000 --> 00:33:06,000
We just can't do that.

546
00:33:06,000 --> 00:33:11,000
>> When we declare an array in C all of the elements have to be of the same type.

547
00:33:11,000 --> 00:33:14,000
Here I have an array of 3 integers.

548
00:33:14,000 --> 00:33:18,000
Here I have the length of the array, but if I'm just declaring it in this syntax

549
00:33:18,000 --> 00:33:21,000
where I specify what all of the elements are I don't technically need this 3.

550
00:33:21,000 --> 00:33:25,000
The compiler is smart enough to figure out how big the array should be.

551
00:33:25,000 --> 00:33:28,000
Now when I want to get or set the value of an array

552
00:33:28,000 --> 00:33:30,000
this is the syntax to do that.

553
00:33:30,000 --> 00:33:33,000
This will actually modify the second element of the array because, remember,

554
00:33:33,000 --> 00:33:36,000
numbering starts at 0, not at 1.

555
00:33:36,000 --> 00:33:42,000
If I want to read that value I can say something like int x = array[1].

556
00:33:42,000 --> 00:33:44,000
Or if I want to set that value, like I'm doing here,

557
00:33:44,000 --> 00:33:47,000
I can say array[1] = 4.

558
00:33:47,000 --> 00:33:50,000
That time accessing elements by their index

559
00:33:50,000 --> 00:33:52,000
or their position or where they are in the array,

560
00:33:52,000 --> 00:33:57,000
and that listing starts at 0.

561
00:33:57,000 --> 00:34:00,000
We can also have arrays of arrays,

562
00:34:00,000 --> 00:34:03,000
and this is called a multi-dimensional array.

563
00:34:03,000 --> 00:34:05,000
When we have a multi-dimensional array

564
00:34:05,000 --> 00:34:07,000
that means we can have something like rows and columns,

565
00:34:07,000 --> 00:34:11,000
and this is just one way of visualizing this or thinking about it.

566
00:34:11,000 --> 00:34:14,000
When I have a multi-dimensional array that means I'm going to start needing

567
00:34:14,000 --> 00:34:17,000
more than 1 index because if I have a grid

568
00:34:17,000 --> 00:34:19,000
just saying what row you're in doesn't give us a number.

569
00:34:19,000 --> 00:34:22,000
That's really just going to give us a list of numbers.

570
00:34:22,000 --> 00:34:25,000
Let's say I have this array here.

571
00:34:25,000 --> 00:34:30,000
I have an array called grid, and I'm saying it's 2 rows and 3 columns,

572
00:34:30,000 --> 00:34:32,000
and so this is one way of visualizing it.

573
00:34:32,000 --> 00:34:37,000
When I say I want to get the element at [1] [2]

574
00:34:37,000 --> 00:34:41,000
that means that because these are rows first and then columns

575
00:34:41,000 --> 00:34:44,000
I'm going to jump to row 1 since I said 1.

576
00:34:44,000 --> 00:34:49,000
>> Then I'm going to come over here to column 2, and I'm going to get the value 6.

577
00:34:49,000 --> 00:34:51,000
Make sense?

578
00:34:51,000 --> 00:34:55,000
Multi-dimensional arrays, remember, are technically just an array of arrays.

579
00:34:55,000 --> 00:34:57,000
We can have arrays of arrays of arrays.

580
00:34:57,000 --> 00:35:00,000
We can keep going, but really one way to think about

581
00:35:00,000 --> 00:35:03,000
how this is being laid out and what's going on is to visualize it 

582
00:35:03,000 --> 00:35:09,000
in a grid like this.

583
00:35:09,000 --> 00:35:12,000
When we pass arrays to functions, they're going to behave 

584
00:35:12,000 --> 00:35:16,000
a little bit differently than when we pass regular variables to functions

585
00:35:16,000 --> 00:35:18,000
like passing an int or a float.

586
00:35:18,000 --> 00:35:21,000
When we pass in an int or char or any of these other data types

587
00:35:21,000 --> 00:35:24,000
we just took a look at if the function modifies

588
00:35:24,000 --> 00:35:28,000
the value of that variable that change is not going to propagate up

589
00:35:28,000 --> 00:35:32,000
to the calling function.

590
00:35:32,000 --> 00:35:35,000
With an array, on the other hand, that will happen.

591
00:35:35,000 --> 00:35:39,000
If I pass in an array to some function and that function changes some of the elements,

592
00:35:39,000 --> 00:35:43,000
when I come back up to the function that called it

593
00:35:43,000 --> 00:35:47,000
my array is now going to be different, and the vocabulary for that

594
00:35:47,000 --> 00:35:50,000
is arrays are passed by reference, as we'll see later.

595
00:35:50,000 --> 00:35:53,000
This is related to how pointers work, where these basic data types, 

596
00:35:53,000 --> 00:35:55,000
on the other hand, are passed by value.

597
00:35:55,000 --> 00:35:59,000
>> We can think of that as making a copy of some variable and then passing in the copy. 

598
00:35:59,000 --> 00:36:01,000
It doesn't matter what we do with that variable.

599
00:36:01,000 --> 00:36:06,000
The calling function will not be aware that it was changed.

600
00:36:06,000 --> 00:36:10,000
Arrays are just a little bit different in that regard.

601
00:36:10,000 --> 00:36:13,000
For example, as we just saw, main is simply a function

602
00:36:13,000 --> 00:36:15,000
that can take in 2 arguments.

603
00:36:15,000 --> 00:36:20,000
The first argument to the main function is argc, or the number of arguments,

604
00:36:20,000 --> 00:36:23,000
and the second argument is called argv, 

605
00:36:23,000 --> 00:36:27,000
and those are the actual values of those arguments. 

606
00:36:27,000 --> 00:36:30,000
Let's say I have a program called this.c, 

607
00:36:30,000 --> 00:36:34,000
and I say make this, and I'm going to run this at the command line.

608
00:36:34,000 --> 00:36:38,000
Now to pass in some arguments to my program called this,

609
00:36:38,000 --> 00:36:42,000
I could say something like ./this is cs 50.

610
00:36:42,000 --> 00:36:45,000
This is what we imagine David to do every day at the terminal.

611
00:36:45,000 --> 00:36:48,000
But now the main function inside of that program

612
00:36:48,000 --> 00:36:52,000
has these values, so argc is 4.

613
00:36:52,000 --> 00:36:56,000
It might be a little confusing because really we're only passing in is cs 50.

614
00:36:56,000 --> 00:36:58,000
That's only 3.

615
00:36:58,000 --> 00:37:02,000
But remember that the first element of argv or the first argument

616
00:37:02,000 --> 00:37:05,000
is the name of the function itself.

617
00:37:05,000 --> 00:37:07,190
So that means that we have 4 things here, 

618
00:37:07,190 --> 00:37:10,530
and the first element is going to be ./this. 

619
00:37:10,530 --> 00:37:12,970
And this will be represented as a string. 

620
00:37:12,970 --> 00:37:18,590
Then the remaining elements are what we typed in after the name of the program. 

621
00:37:18,590 --> 00:37:22,720
So just as an aside, as we probably saw in pset 2, 

622
00:37:22,720 --> 00:37:28,780
remember that the string 50 is ≠ the integer 50. 

623
00:37:28,780 --> 00:37:32,520
So we can't say something like, 'int x = argv 3.' 

624
00:37:32,520 --> 00:37:36,470
>> That's just not going to make sense, because this is a string, and this is an integer. 

625
00:37:36,470 --> 00:37:38,510
So if you want to convert between the 2, remember, we're going to 

626
00:37:38,510 --> 00:37:40,810
have this magic function called atoi. 

627
00:37:40,810 --> 00:37:46,270
That takes a string and returns the integer represented inside of that string. 

628
00:37:46,270 --> 00:37:48,360
So that's an easy mistake to make on the quiz,

629
00:37:48,360 --> 00:37:51,590
just thinking that this will automatically be the correct type. 

630
00:37:51,590 --> 00:37:53,860
But just know that these will always be strings 

631
00:37:53,860 --> 00:38:00,920
even if the string only contains an integer or a character or a float. 

632
00:38:00,920 --> 00:38:03,380
So now let's talk about running time. 

633
00:38:03,380 --> 00:38:06,700
When we have all these algorithms that do all these crazy things, 

634
00:38:06,700 --> 00:38:11,580
it becomes really useful to ask the question, "How long do they take?"

635
00:38:11,580 --> 00:38:15,500
We represent that with something called asymptotic notation. 

636
00:38:15,500 --> 00:38:18,430
So this means that--well, let's say we give our algorithm 

637
00:38:18,430 --> 00:38:20,840
some really, really, really big input. 

638
00:38:20,840 --> 00:38:23,840
We want to ask the question, "How long is it going to take?

639
00:38:23,840 --> 00:38:26,370
How many steps will it take our algorithm to run 

640
00:38:26,370 --> 00:38:29,980
as a function of the size of the input?"

641
00:38:29,980 --> 00:38:33,080
So the first way we can describe run time is with big O. 

642
00:38:33,080 --> 00:38:35,380
And this is our worst-case running time. 

643
00:38:35,380 --> 00:38:38,590
So if we want to sort an array, and we give our algorithm an array

644
00:38:38,590 --> 00:38:41,000
that's in descending order when it should be in ascending order, 

645
00:38:41,000 --> 00:38:43,130
that's going to be the worst case. 

646
00:38:43,130 --> 00:38:49,800
This is our upper bound in the maximum length of time our algorithm will take. 

647
00:38:49,800 --> 00:38:54,740
On the other hand, this Ω is going to describe best-case running time. 

648
00:38:54,740 --> 00:38:58,210
So if we give an already sorted array to a sorting algorithm, 

649
00:38:58,210 --> 00:39:00,940
how long will it take to sort it? 

650
00:39:00,940 --> 00:39:06,610
And this, then, describes a lower bound on running time. 

651
00:39:06,610 --> 00:39:10,980
So here are just some words that describe some common running times. 

652
00:39:10,980 --> 00:39:13,120
These are in ascending order. 

653
00:39:13,120 --> 00:39:16,060
The fastest running time we have is called constant. 

654
00:39:16,060 --> 00:39:19,800
>> That means no matter how many elements we give our algorithm, 

655
00:39:19,800 --> 00:39:22,280
no matter how big our array is, sorting it 

656
00:39:22,280 --> 00:39:26,510
or doing whatever we're doing to the array will always take the same amount of time. 

657
00:39:26,510 --> 00:39:30,270
So we can represent that just with a 1, which is a constant. 

658
00:39:30,270 --> 00:39:32,410
We also looked at logarithmic run time. 

659
00:39:32,410 --> 00:39:34,800
So something like binary search is logarithmic, 

660
00:39:34,800 --> 00:39:37,140
where we cut the problem in half every time

661
00:39:37,140 --> 00:39:40,970
and then things just get higher from there. 

662
00:39:40,970 --> 00:39:43,580
And if you're ever writing an O of any factorial algorithm, 

663
00:39:43,580 --> 00:39:47,850
you probably shouldn't consider this as your day job. 

664
00:39:47,850 --> 00:39:53,910
When we compare running times it's important to keep in mind these things. 

665
00:39:53,910 --> 00:39:57,760
So if I have an algorithm that's O(n), and someone else 

666
00:39:57,760 --> 00:40:03,590
has an algorithm of O(2n) these are actually asymptotically equivalent. 

667
00:40:03,590 --> 00:40:06,590
So if we imagine n to be a big number like eleventy billion:

668
00:40:06,590 --> 00:40:13,090
so when we're comparing eleventy billion to something like eleventy billion + 3, 

669
00:40:13,090 --> 00:40:17,640
suddenly that +3 doesn't really make a big difference anymore. 

670
00:40:17,640 --> 00:40:20,980
That's why we're going to start considering these things to be equivalent. 

671
00:40:20,980 --> 00:40:24,220
So things like these constants here, there's 2 x this, or adding a 3, 

672
00:40:24,220 --> 00:40:27,180
these are just constants, and these are going to drop up. 

673
00:40:27,180 --> 00:40:32,480
So that's why all 3 of these run times are the same as saying they're O(n). 

674
00:40:32,480 --> 00:40:37,490
Similarly, if we have 2 other run times, let's say O(n³ + 2n²), we can add 

675
00:40:37,490 --> 00:40:42,070
+ n, + 7, and then we have another run time that's just O(n³).

676
00:40:42,070 --> 00:40:46,290
again, these are the same thing because these--these are not the same. 

677
00:40:46,290 --> 00:40:49,840
These are the same things, sorry. So these are the same because 

678
00:40:49,840 --> 00:40:53,090
this n³ is going to dominate this 2n². 

679
00:40:53,090 --> 00:40:59,130
>> What is not the same thing is if we have run times like O(n³) and O(n²)

680
00:40:59,130 --> 00:41:02,820
because this n³ is much larger than this n². 

681
00:41:02,820 --> 00:41:05,470
So if we have exponents, suddenly this starts to matter, 

682
00:41:05,470 --> 00:41:08,280
but when we're just dealing with factors as we are up here, 

683
00:41:08,280 --> 00:41:12,810
then it's not going to matter because they are just going to drop out. 

684
00:41:12,810 --> 00:41:16,760
Let's take a look at some of the algorithms we've seen so far

685
00:41:16,760 --> 00:41:19,260
and talk about their run time. 

686
00:41:19,260 --> 00:41:23,850
The first way of looking for a number in a list, that we saw, was linear search. 

687
00:41:23,850 --> 00:41:26,950
And the implementation of linear search is super straightforward. 

688
00:41:26,950 --> 00:41:30,490
We just have a list, and we're going to look at every single element in the list 

689
00:41:30,490 --> 00:41:34,260
until we find the number we're looking for. 

690
00:41:34,260 --> 00:41:38,370
So that means that in the worst case, this O(n). 

691
00:41:38,370 --> 00:41:40,860
And the worst case here could be if the element is 

692
00:41:40,860 --> 00:41:45,710
the last element, then using linear search we have to look at every single element 

693
00:41:45,710 --> 00:41:50,180
until we get to the last one in order to know that it was actually in the list. 

694
00:41:50,180 --> 00:41:52,910
We can't just give up halfway and say, "It's probably not there." 

695
00:41:52,910 --> 00:41:55,980
With linear search we have to look at the whole thing. 

696
00:41:55,980 --> 00:41:59,090
The best-case running time, on the other hand, is constant

697
00:41:59,090 --> 00:42:04,200
because in the best case the element we're looking for is just the first one in the list. 

698
00:42:04,200 --> 00:42:08,930
So it's going to take us exactly 1 step, no matter how big the list is

699
00:42:08,930 --> 00:42:12,140
if we're looking for the first element every time. 

700
00:42:12,140 --> 00:42:15,390
>> So when you search, remember, it does not require that our list be sorted. 

701
00:42:15,390 --> 00:42:19,430
Because we're simply going to look over every single element, and it doesn't really matter 

702
00:42:19,430 --> 00:42:23,560
what order those elements are in. 

703
00:42:23,560 --> 00:42:28,110
A more intelligent search algorithm is something like binary search. 

704
00:42:28,110 --> 00:42:31,500
Remember, the implementation of binary search is when you're going to 

705
00:42:31,500 --> 00:42:34,320
keep looking at the middle of the list. 

706
00:42:34,320 --> 00:42:38,000
And because we're looking at the middle, we require that the list is sorted

707
00:42:38,000 --> 00:42:40,580
or else we don't know where the middle is, and we have to look over

708
00:42:40,580 --> 00:42:44,480
the whole list to find it, and then at that point we're just wasting time. 

709
00:42:44,480 --> 00:42:48,480
So if we have a sorted list and we find the middle, we're going to compare the middle

710
00:42:48,480 --> 00:42:51,590
to the element we're looking for. 

711
00:42:51,590 --> 00:42:54,640
If it's too high, then we can forget the right half

712
00:42:54,640 --> 00:42:57,810
because we know that if our element is already too high

713
00:42:57,810 --> 00:43:01,080
and everything to the right of this element is even higher,

714
00:43:01,080 --> 00:43:02,760
then we don't need to look there anymore. 

715
00:43:02,760 --> 00:43:05,430
Where on the other hand, if our element is too low, 

716
00:43:05,430 --> 00:43:08,700
we know everything to the left of that element is also too low, 

717
00:43:08,700 --> 00:43:11,390
so it doesn't really make sense to look there, either. 

718
00:43:11,390 --> 00:43:15,760
This way, with every step and every time we look at the midpoint of the list, 

719
00:43:15,760 --> 00:43:19,060
we're going to cut our problem in half because suddenly we know 

720
00:43:19,060 --> 00:43:23,040
a whole bunch of numbers that can't be the one we're looking for. 

721
00:43:23,040 --> 00:43:26,950
>> In pseudocode this would look something like this,

722
00:43:26,950 --> 00:43:30,990
and because we're cutting the list in half every single time,

723
00:43:30,990 --> 00:43:34,920
our worst-case run time jumps from linear to logarithmic. 

724
00:43:34,920 --> 00:43:39,260
So suddenly we have log-in steps in order to find an element in a list. 

725
00:43:39,260 --> 00:43:42,460
The best-case running time, though, is still constant 

726
00:43:42,460 --> 00:43:45,180
because now, let's just say that the element we're looking for is

727
00:43:45,180 --> 00:43:48,380
always the exact middle of the original list. 

728
00:43:48,380 --> 00:43:52,080
So we can grow our list as big as we want, but if the element we're looking for is at the middle, 

729
00:43:52,080 --> 00:43:54,910
then it's only going to take us 1 step. 

730
00:43:54,910 --> 00:44:00,920
So that's why we're O(log n) and Ω(1) or constant. 

731
00:44:00,920 --> 00:44:04,510
Let's actually run binary search on this list. 

732
00:44:04,510 --> 00:44:08,020
So let's say that we're looking for the element 164. 

733
00:44:08,020 --> 00:44:11,650
The first thing we are going to do is find the midpoint of this list. 

734
00:44:11,650 --> 00:44:15,060
It just so happens that the midpoint is going to fall in between these 2 numbers, 

735
00:44:15,060 --> 00:44:18,960
so let's just arbitrarily say, every time the midpoint falls between 2 numbers, 

736
00:44:18,960 --> 00:44:21,150
let's just round up. 

737
00:44:21,150 --> 00:44:24,330
We just need to make sure we do this every step of the way. 

738
00:44:24,330 --> 00:44:29,040
So we're going to round up, and we're going to say that 161 is the middle of our list. 

739
00:44:29,040 --> 00:44:34,640
So 161 < 164, and every element to the left of 161 

740
00:44:34,640 --> 00:44:39,120
is also < 164, so we know that it's not going to help us at all 

741
00:44:39,120 --> 00:44:42,690
to start looking over here because the element we're looking for can't be there. 

742
00:44:42,690 --> 00:44:47,060
So what we can do is we can just forget about that whole left half of the list, 

743
00:44:47,060 --> 00:44:51,700
and now only consider from the right of the 161 onward. 

744
00:44:51,700 --> 00:44:54,050
>> So again, this is the midpoint; let's just round up. 

745
00:44:54,050 --> 00:44:56,260
Now 175 is too big. 

746
00:44:56,260 --> 00:44:59,180
So we know it's not going to help us looking here or here, 

747
00:44:59,180 --> 00:45:06,610
so we can just throw that away, and eventually we'll hit the 164. 

748
00:45:06,610 --> 00:45:10,560
Any questions on binary search? 

749
00:45:10,560 --> 00:45:14,180
Let's move on from searching through an already-sorted list 

750
00:45:14,180 --> 00:45:17,660
to actually taking a list of numbers in any order

751
00:45:17,660 --> 00:45:20,960
and making that list in ascending order. 

752
00:45:20,960 --> 00:45:24,060
The first algorithm we looked at was called bubble sort. 

753
00:45:24,060 --> 00:45:27,300
And this would be simpler of the algorithms we saw. 

754
00:45:27,300 --> 00:45:32,970
Bubble sort says that when any 2 elements inside the list are out of place, 

755
00:45:32,970 --> 00:45:36,500
meaning there is a higher number to the left of a lower number, 

756
00:45:36,500 --> 00:45:40,190
then we're going to swap them, because that means that the list will be 

757
00:45:40,190 --> 00:45:42,860
"more sorted" than it was before. 

758
00:45:42,860 --> 00:45:45,180
And we're just going to continue this process again and again and again

759
00:45:45,180 --> 00:45:52,100
until eventually the elements kind of bubble to their correct location and we have a sorted list. 

760
00:45:52,100 --> 00:45:57,230
>> The run time of this is going to be O(n²). Why?

761
00:45:57,230 --> 00:46:00,370
Well, because in the worst case, we're going to take every element, and 

762
00:46:00,370 --> 00:46:04,570
we're going to end up comparing it to every other element in the list. 

763
00:46:04,570 --> 00:46:08,030
But in the best case, we have an already sorted list, bubble sort's 

764
00:46:08,030 --> 00:46:12,230
just going to go through once, say "Nope. I didn't make any swaps, so I'm done."

765
00:46:12,230 --> 00:46:17,410
So we have a best-case running time of Ω(n). 

766
00:46:17,410 --> 00:46:20,680
Let's run bubble sort on a list. 

767
00:46:20,680 --> 00:46:23,560
Or first, let's just look at some pseudocode really quickly. 

768
00:46:23,560 --> 00:46:28,160
We want to say we want to keep track of, in every iteration of the loop, 

769
00:46:28,160 --> 00:46:32,190
keep track of whether or not we changed any elements. 

770
00:46:32,190 --> 00:46:37,610
So the reason for that is, we're going to stop when we have not swapped any elements. 

771
00:46:37,610 --> 00:46:41,980
So at the start of our loop we haven't swapped anything, so we'll say that's false. 

772
00:46:41,980 --> 00:46:47,170
Now, we're going to go through the list and compare element i to element i + 1

773
00:46:47,170 --> 00:46:50,310
and if it is the case that there is a bigger number to the left of a smaller number, 

774
00:46:50,310 --> 00:46:52,310
then we're just going to swap them. 

775
00:46:52,310 --> 00:46:54,490
>> And then we're going to remember that we swapped an element. 

776
00:46:54,490 --> 00:46:58,900
That means that we need to go through the list at least 1 more time

777
00:46:58,900 --> 00:47:02,160
because the condition in which we stopped is when the whole list is already sorted, 

778
00:47:02,160 --> 00:47:04,890
meaning we have not made any swaps. 

779
00:47:04,890 --> 00:47:09,960
So that's why our condition down here is 'while some elements have been swapped.'

780
00:47:09,960 --> 00:47:13,720
So now let's just look at this running on a list. 

781
00:47:13,720 --> 00:47:16,640
I have the list 5,0,1,6,4. 

782
00:47:16,640 --> 00:47:19,850
Bubble sort is going to start all the way at the left, and it's going to compare 

783
00:47:19,850 --> 00:47:24,700
the i elements, so 0 to i + 1, which is element 1. 

784
00:47:24,700 --> 00:47:29,020
It's going to say, well 5 > 0, but right now 5 is to the left, 

785
00:47:29,020 --> 00:47:32,500
so I need to swap the 5 and the 0. 

786
00:47:32,500 --> 00:47:35,470
When I swap them, suddenly I get this different list. 

787
00:47:35,470 --> 00:47:38,260
Now 5 > 1, so we're going to swap them. 

788
00:47:38,260 --> 00:47:42,160
5 is not > 6, so we don't need to do anything here. 

789
00:47:42,160 --> 00:47:46,690
But 6 > 4, so we need to swap. 

790
00:47:46,690 --> 00:47:49,740
Again, we need to run through the whole list to eventually discover 

791
00:47:49,740 --> 00:47:52,330
that these are out of order; we swap them, 

792
00:47:52,330 --> 00:47:57,120
and at this point we need to run through the list 1 more time 

793
00:47:57,120 --> 00:48:05,390
to make sure that everything's in its order, and at this point bubble sort has finished. 

794
00:48:05,390 --> 00:48:10,720
A different algorithm for taking some elements and sorting them is selection sort. 

795
00:48:10,720 --> 00:48:15,740
The idea behind selection sort is that we're going to build up a sorted portion of the list

796
00:48:15,740 --> 00:48:18,150
1 element at a time. 

797
00:48:18,150 --> 00:48:23,170
>> And the way we're going to do that is by building up the left segment of the list. 

798
00:48:23,170 --> 00:48:27,510
And basically, every--on each step, we're going to take the smallest element we have left

799
00:48:27,510 --> 00:48:32,310
that hasn't been sorted yet, and we're going to move it into that sorted segment. 

800
00:48:32,310 --> 00:48:35,850
That means we need to continuously find the minimum unsorted element

801
00:48:35,850 --> 00:48:40,720
and then take that minimum element and swap it with whatever 

802
00:48:40,720 --> 00:48:45,090
left-most element that is not sorted. 

803
00:48:45,090 --> 00:48:50,890
The run time of this is going to be O(n²) because in the worst case

804
00:48:50,890 --> 00:48:55,070
we need to compare every single element to every other element. 

805
00:48:55,070 --> 00:48:59,250
Because we're saying that if we start at the left half of the list, we need 

806
00:48:59,250 --> 00:49:02,970
to go through the entire right segment to find the smallest element. 

807
00:49:02,970 --> 00:49:05,430
And then, again, we need to go over the entire right segment and

808
00:49:05,430 --> 00:49:08,210
keep going over that over and over and over again. 

809
00:49:08,210 --> 00:49:11,350
That's going to be n². We're going to need a for loop inside of another for loop 

810
00:49:11,350 --> 00:49:13,350
which suggests n². 

811
00:49:13,350 --> 00:49:16,530
In the best case thought, let's say we give it an already sorted list;

812
00:49:16,530 --> 00:49:19,270
we actually don't do any better than n². 

813
00:49:19,270 --> 00:49:21,730
Because selection sort has no way of knowing that 

814
00:49:21,730 --> 00:49:25,540
the minimum element is just the one I happen to be looking at. 

815
00:49:25,540 --> 00:49:28,970
It still needs to make sure that this is actually the minimum. 

816
00:49:28,970 --> 00:49:31,670
>> And the only way to make sure that it's the minimum, using this algorithm, 

817
00:49:31,670 --> 00:49:34,640
is to look at every single element again. 

818
00:49:34,640 --> 00:49:38,420
So really, if you give it--if you give selection sort an already sorted list, 

819
00:49:38,420 --> 00:49:42,720
it's not going to do any better than giving it a list that isn't sorted yet. 

820
00:49:42,720 --> 00:49:46,320
By the way, if it happens to be the case that something is O(something)

821
00:49:46,320 --> 00:49:50,640
and the omega of something, we can just say more succinctly that it's θ of something. 

822
00:49:50,640 --> 00:49:52,760
So if you see that come up anywhere, that's what that just means. 

823
00:49:52,760 --> 00:49:57,580
>> If something is theta of n², it is both big O(n²) and Ω(n²). 

824
00:49:57,580 --> 00:49:59,790
So best case and worst case, it doesn't make a difference, 

825
00:49:59,790 --> 00:50:04,400
the algorithm is going to do the same thing every time. 

826
00:50:04,400 --> 00:50:06,610
So this is what pseudocode for selection sort could look like. 

827
00:50:06,610 --> 00:50:10,630
We're basically going to say that I want to iterate over the list

828
00:50:10,630 --> 00:50:15,180
from left to right, and at each iteration of the loop, I'm going to move

829
00:50:15,180 --> 00:50:19,780
the minimum element into this sorted portion of the list. 

830
00:50:19,780 --> 00:50:23,260
And once I move something there, I never need to look at that element again. 

831
00:50:23,260 --> 00:50:28,600
Because as soon as I swap an element in to the left segment of the list, it's sorted

832
00:50:28,600 --> 00:50:32,600
because we're doing everything in ascending order by using minimums. 

833
00:50:32,600 --> 00:50:38,740
So we said, okay, we're at position i, and we need to look at all of the elements 

834
00:50:38,740 --> 00:50:42,260
to the right of i in order to find the minimum. 

835
00:50:42,260 --> 00:50:46,150
So that means we want to look from i + 1 to the end of the list. 

836
00:50:46,150 --> 00:50:51,610
And now, if the element that we're currently looking at is less than our minimum so far,

837
00:50:51,610 --> 00:50:54,190
which, remember, we're starting the minimum off to just be

838
00:50:54,190 --> 00:50:57,020
whatever element we're currently at; I'll assume that's the minimum. 

839
00:50:57,020 --> 00:51:00,270
If I find an element that's smaller than that, then I'm going to say, okay, 

840
00:51:00,270 --> 00:51:02,700
well, I have found a new minimum. 

841
00:51:02,700 --> 00:51:06,080
I'm going to remember where that minimum was. 

842
00:51:06,080 --> 00:51:09,560
>> So now, once I've gone through that right unsorted segment, 

843
00:51:09,560 --> 00:51:16,690
I can say I'm going to swap the minimum element with the element that is in position i. 

844
00:51:16,690 --> 00:51:21,100
That's going to build up my list, my sorted portion of the list from left to right, 

845
00:51:21,100 --> 00:51:25,190
and we don't ever need to look at an element again once it's in that portion. 

846
00:51:25,190 --> 00:51:27,930
Once we've swapped it. 

847
00:51:27,930 --> 00:51:30,260
So let's run selection sort on this list. 

848
00:51:30,260 --> 00:51:38,220
The blue element here is going to be the i, and the red element is going to be the minimum element. 

849
00:51:38,220 --> 00:51:41,570
So i starts all the way at the left of the list, so at 5. 

850
00:51:41,570 --> 00:51:44,610
Now we need to find the minimum unsorted element. 

851
00:51:44,610 --> 00:51:49,480
So we say 0 < 5, so 0 is my new minimum. 

852
00:51:49,480 --> 00:51:53,820
>> But I can't stop there, because even though we can recognize that 0 is the smallest, 

853
00:51:53,820 --> 00:51:59,390
we need to run through every other element of the list to make sure. 

854
00:51:59,390 --> 00:52:01,760
So 1 is bigger, 6 is bigger, 4 is bigger. 

855
00:52:01,760 --> 00:52:05,850
That means that after looking at all of these elements, I've determined 0 is the smallest. 

856
00:52:05,850 --> 00:52:09,800
So I'm going to swap the 5 and the 0. 

857
00:52:09,800 --> 00:52:15,480
Once I swap that, I'm going to get a new list, and I know that I never need to look at that 0 again

858
00:52:15,480 --> 00:52:19,380
because once I've swapped it, I've sorted it and we're done. 

859
00:52:19,380 --> 00:52:22,730
Now it just so happens that the blue element is again the 5, 

860
00:52:22,730 --> 00:52:26,030
and we need to look at the 1, the 6 and the 4 to determine that 1 

861
00:52:26,030 --> 00:52:31,520
is the smallest minimum element, so we'll swap the 1 and the 5. 

862
00:52:31,520 --> 00:52:36,890
Again, we need to look at--compare the 5 to the 6 and the 4, 

863
00:52:36,890 --> 00:52:39,830
and we're going to swap the 4 and the 5, and finally, compare 

864
00:52:39,830 --> 00:52:45,740
those 2 numbers and swap them until we get our sorted list. 

865
00:52:45,740 --> 00:52:49,730
Any questions on selection sort? 

866
00:52:49,730 --> 00:52:56,420
Okay. Let's move to the last topic here, and that is recursion. 

867
00:52:56,420 --> 00:52:59,810
>> Recursion, remember, is this really meta thing where a function 

868
00:52:59,810 --> 00:53:02,740
repeatedly calls itself. 

869
00:53:02,740 --> 00:53:05,620
So at some point, while our fuction is repeatedly calling itself, 

870
00:53:05,620 --> 00:53:10,100
there needs to be some point at which we stop calling ourselves. 

871
00:53:10,100 --> 00:53:13,670
Because if we don't do that, then we're just going to continue to do this forever, 

872
00:53:13,670 --> 00:53:16,660
and our program is just not going to terminate. 

873
00:53:16,660 --> 00:53:19,200
We call this condition the base case. 

874
00:53:19,200 --> 00:53:22,570
And the base case says, rather than calling a function again, 

875
00:53:22,570 --> 00:53:25,330
I'm just going to return some value. 

876
00:53:25,330 --> 00:53:28,080
So once we've returned a value, we've stopped calling ourselves, 

877
00:53:28,080 --> 00:53:32,550
and the rest of the calls we've made so far can also return. 

878
00:53:32,550 --> 00:53:36,050
The opposite of the base case is the recursive case. 

879
00:53:36,050 --> 00:53:39,050
And this is when we want to make another call to the function that we're currently in. 

880
00:53:39,050 --> 00:53:44,690
And we probably, although not always, want to use different arguments. 

881
00:53:44,690 --> 00:53:48,940
>> So if we have a function called f, and f just called take 1 argument, 

882
00:53:48,940 --> 00:53:52,010
and we just keep calling f(1), f(1), f(1), and it just so happens that 

883
00:53:52,010 --> 00:53:56,510
the argument 1 falls into recursive case, we're still never going to stop. 

884
00:53:56,510 --> 00:54:01,620
Even if we have a base case, we need to make sure that eventually we're going to hit that base case. 

885
00:54:01,620 --> 00:54:04,250
We don't just keep staying in this recursive case. 

886
00:54:04,250 --> 00:54:09,870
Generally, when we call ourselves, we're probably going to have a different argument each time. 

887
00:54:09,870 --> 00:54:12,700
Here is a really simple recursive function. 

888
00:54:12,700 --> 00:54:15,090
So this will compute the factorial of a number. 

889
00:54:15,090 --> 00:54:17,790
Up top here we have our base case. 

890
00:54:17,790 --> 00:54:22,330
In the case that n ≤ 1, we're not going to call factorial again. 

891
00:54:22,330 --> 00:54:26,490
We're going to stop; we're just going to return some value. 

892
00:54:26,490 --> 00:54:30,170
If this is not true, then we're going to hit our recursive case. 

893
00:54:30,170 --> 00:54:33,550
Notice here that we're not just calling factorial(n), because that wouldn't be very helpful. 

894
00:54:33,550 --> 00:54:36,810
We're going to call factorial of something else. 

895
00:54:36,810 --> 00:54:40,850
>> And so you can see, eventually if we pass a factorial (5) or something,

896
00:54:40,850 --> 00:54:45,900
we're going to call factorial (4) and so on, and eventually we're going to hit this base case. 

897
00:54:45,900 --> 00:54:51,730
So this looks good. Let's see what happens when we actually run this. 

898
00:54:51,730 --> 00:54:57,840
This is the stack, and let's say that main is going to call this function with an argument (4). 

899
00:54:57,840 --> 00:55:02,200
So once factorial sees and = 4, factorial will call itself. 

900
00:55:02,200 --> 00:55:05,010
Now, suddenly, we have factorial(3). 

901
00:55:05,010 --> 00:55:10,780
So these functions are going to keep growing until eventually we hit our base case. 

902
00:55:10,780 --> 00:55:17,830
At this point, the return value of this is the return(n x the return value of this), 

903
00:55:17,830 --> 00:55:21,290
the return value of this is n x the return value of this. 

904
00:55:21,290 --> 00:55:23,290
Eventually we need to hit some number. 

905
00:55:23,290 --> 00:55:26,560
At the top here, we say return 1. 

906
00:55:26,560 --> 00:55:30,650
That means that once we return that number, we can pop this off the stack. 

907
00:55:30,650 --> 00:55:36,570
So this factorial(1) is done. 

908
00:55:36,570 --> 00:55:41,190
When 1 returns, this factorial(1) returns, this return to 1. 

909
00:55:41,190 --> 00:55:46,910
The return value of this, remember, was n x the return value of this. 

910
00:55:46,910 --> 00:55:50,720
So suddenly, this guy knows that I want to return 2. 

911
00:55:50,720 --> 00:55:55,910
>> So remember, return value of this is just n x the return value up here. 

912
00:55:55,910 --> 00:56:01,160
So now we can say 3 x 2, and finally, here we can say

913
00:56:01,160 --> 00:56:04,010
this is just going to be 4 x 3 x 2. 

914
00:56:04,010 --> 00:56:09,570
And once this returns, we get down to a single integer inside of main. 

915
00:56:09,570 --> 00:56:15,460
Any questions on recursion? 

916
00:56:15,460 --> 00:56:17,090
All right. So there's more time for questions at the end,

917
00:56:17,090 --> 00:56:23,360
but now Joseph will cover the remaining topics. 

918
00:56:23,360 --> 00:56:25,590
>> [Joseph Ong] All right. So now that we've talked about recursions, 

919
00:56:25,590 --> 00:56:27,840
let's talk a little bit about what merge sort is. 

920
00:56:27,840 --> 00:56:31,740
Merge sort is basically another way of sorting a list of numbers. 

921
00:56:31,740 --> 00:56:36,430
And how it works is, with merge sort you have a list, and what we do is

922
00:56:36,430 --> 00:56:39,120
we say, let's split this into 2 halves. 

923
00:56:39,120 --> 00:56:42,750
We'll first run merge sort again on the left half, 

924
00:56:42,750 --> 00:56:45,040
then we'll run merge sort on the right half, 

925
00:56:45,040 --> 00:56:50,240
and that gives us now 2 halves that are sorted, and now we're going to combine those halves together. 

926
00:56:50,240 --> 00:56:55,010
It's a bit hard to see without an example, so we'll go through the motions and see what happens. 

927
00:56:55,010 --> 00:56:59,590
So you start with this list, we split it into 2 halves. 

928
00:56:59,590 --> 00:57:02,300
We run merge sort on the left half first. 

929
00:57:02,300 --> 00:57:06,660
So that's the left half, and now we run them through this list again

930
00:57:06,660 --> 00:57:09,800
which gets passed into merge sort, and then we look, again, 

931
00:57:09,800 --> 00:57:13,270
at the left side of this list and we run merge sort on it. 

932
00:57:13,270 --> 00:57:15,880
Now, we get down to a list of 2 numbers, 

933
00:57:15,880 --> 00:57:19,010
and now the left half is only 1 element long, and we can't 

934
00:57:19,010 --> 00:57:23,380
split a list that's only 1 element into half, so we just say, once we have 50,

935
00:57:23,380 --> 00:57:26,400
which is just 1 element, it's already sorted. 

936
00:57:26,400 --> 00:57:29,860
>> Once we're done with that, we can see that we can 

937
00:57:29,860 --> 00:57:32,230
move on to the right half of this list, 

938
00:57:32,230 --> 00:57:36,480
and 3 is also sorted, and so now that both halves of this list are sorted

939
00:57:36,480 --> 00:57:39,080
we can join these numbers back together. 

940
00:57:39,080 --> 00:57:45,320
So we look at 50 and 3; 3 is smaller than 50, so it goes in first and then 50 comes in. 

941
00:57:45,320 --> 00:57:49,340
Now, that's done; we go back up to that list and sort it's right half. 

942
00:57:49,340 --> 00:57:52,440
42 is it's own number, so it's already sorted. 

943
00:57:52,440 --> 00:57:57,850
So now we compare these 2 and 3 is smaller than 42, so that gets put in first, 

944
00:57:57,850 --> 00:58:02,340
now 42 gets put in, and 50 gets put in. 

945
00:58:02,340 --> 00:58:07,220
Now, that's sorted, we go all the way back to the top, 1337 and 15. 

946
00:58:07,220 --> 00:58:14,560
Well, we now look at the left half of this list; 1337 is by itself so it's sorted and same with 15. 

947
00:58:14,560 --> 00:58:19,020
So now we combine these 2 numbers to sort that original list, 15 < 1337, 

948
00:58:19,020 --> 00:58:23,060
so it goes in first, then 1337 goes in. 

949
00:58:23,060 --> 00:58:26,640
And now we sorted both halves of the original list up top. 

950
00:58:26,640 --> 00:58:30,440
And all we have to do is combine these. 

951
00:58:30,440 --> 00:58:36,890
We look at the first 2 numbers of this list, 3 < 15, so it goes into the sort array first. 

952
00:58:36,890 --> 00:58:44,460
15 < 42, so it goes in. Now, 42 < 1337, that goes in. 

953
00:58:44,460 --> 00:58:51,010
50 < 1337, so it goes in. And notice that we just took 2 numbers off of this list. 

954
00:58:51,010 --> 00:58:53,640
So we're not just alternating between the 2 lists. 

955
00:58:53,640 --> 00:58:56,050
We're just looking at the beginning, and we're taking the element

956
00:58:56,050 --> 00:59:00,270
that's smaller and then putting it into our array. 

957
00:59:00,270 --> 00:59:04,080
Now we've merged all the halves and we're done. 

958
00:59:04,080 --> 00:59:07,780
>> Any questions about merge sort? Yes? 

959
00:59:07,780 --> 00:59:14,190
[Student] If it's splitting into different groups, why don't they just split it once

960
00:59:14,190 --> 00:59:19,970
and you have 3 and 2 in a group? [Rest of question unintelligible]

961
00:59:19,970 --> 00:59:24,940
The reason--so the question is, why can't we just merge them at that first step after we have them?

962
00:59:24,940 --> 00:59:29,530
The reason we can do this, start at the left-most elements of both sides,

963
00:59:29,530 --> 00:59:33,040
and then take the smaller one and put it in, is that we know that these 

964
00:59:33,040 --> 00:59:35,290
individual lists are in sorted orders. 

965
00:59:35,290 --> 00:59:37,290
So if I'm looking at the left-most elements of both halves, 

966
00:59:37,290 --> 00:59:40,490
I know they're going to be the smallest elements of those lists. 

967
00:59:40,490 --> 00:59:43,930
So I can put them into the smallest element spots of this large list. 

968
00:59:43,930 --> 00:59:47,810
On the other hand, if I look at those 2 lists in the second level over there, 

969
00:59:47,810 --> 00:59:51,640
50, 3, 42, 1337 and 15, those aren't sorted. 

970
00:59:51,640 --> 00:59:55,770
So if I look at 50 and 1337, I'm going to put 50 into my list first. 

971
00:59:55,770 --> 01:00:00,130
But that doesn't really make sense, because 3 is the smallest element out of all of those. 

972
01:00:00,130 --> 01:00:04,390
So the only reason we can do this combining step is because our lists are already sorted. 

973
01:00:04,390 --> 01:00:07,010
Which is why we have to get down all the way to the bottom

974
01:00:07,010 --> 01:00:09,800
because when we have just a single number, you know that a single number 

975
01:00:09,800 --> 01:00:14,120
in and of itself is already a sorted list. 

976
01:00:14,120 --> 01:00:19,360
>> Any questions? No?

977
01:00:19,360 --> 01:00:24,260
Complexity? Well, you can see that at each step there's end numbers, 

978
01:00:24,260 --> 01:00:27,590
and we can divide a list in half log n times, 

979
01:00:27,590 --> 01:00:31,700
which is where we get this n x log n complexity. 

980
01:00:31,700 --> 01:00:34,940
And you'll see the best case for merge sort is n log n, and it just so happens 

981
01:00:34,940 --> 01:00:39,340
that the worst case, or the Ω over there, is also n log n. 

982
01:00:39,340 --> 01:00:42,480
Something to keep in mind. 

983
01:00:42,480 --> 01:00:45,750
Moving on, let's go on to some super basic file I/O. 

984
01:00:45,750 --> 01:00:48,830
If you looked at Scramble, you'll notice we had some sort of system 

985
01:00:48,830 --> 01:00:51,270
where you could write to a log file if you read through the code. 

986
01:00:51,270 --> 01:00:53,730
Let's see how you might do that. 

987
01:00:53,730 --> 01:00:57,450
Well, we have fprintf, which you can think of as just printf, 

988
01:00:57,450 --> 01:01:01,720
but just printing to a file instead, and hence the f at the beginning. 

989
01:01:01,720 --> 01:01:07,570
This sort of code up here, what it does is, as you might have seen in Scramble, 

990
01:01:07,570 --> 01:01:12,310
it goes through your 2-dimensional array printing out row by row what the numbers are. 

991
01:01:12,310 --> 01:01:17,850
In this case, printf prints out to your terminal or what we call the standard output of section. 

992
01:01:17,850 --> 01:01:22,170
>> And now, in this case, all we have to do is replace printf with fprintf, 

993
01:01:22,170 --> 01:01:26,770
tell it what file you want to print to, and in this case it just prints it out to that file 

994
01:01:26,770 --> 01:01:32,230
instead of printing it out to your terminal. 

995
01:01:32,230 --> 01:01:36,500
Well, then that begs the question: Where do we get this sort of file from, right? 

996
01:01:36,500 --> 01:01:39,840
We passed log in to this fprintf fuction but we had no idea where it came from. 

997
01:01:39,840 --> 01:01:43,980
Well, early in the code, what we had was this chunk of code over here, 

998
01:01:43,980 --> 01:01:48,340
which basically says that open the file calls log.txt. 

999
01:01:48,340 --> 01:01:53,220
What we do after that is we have to make sure that the file is actually opened successfully. 

1000
01:01:53,220 --> 01:01:57,070
So it might fail for multiple reasons; you don't have enough space on your computer, for example. 

1001
01:01:57,070 --> 01:01:59,790
So it's always important before you do any operations with the file 

1002
01:01:59,790 --> 01:02:03,300
that we check whether that file was opened successfully. 

1003
01:02:03,300 --> 01:02:09,330
So what that a, that's an argument to fopen, well, we can open a file in many ways. 

1004
01:02:09,330 --> 01:02:13,510
What we can do is, we can pass it w, which means override the file if it exits already, 

1005
01:02:13,510 --> 01:02:18,070
We can pass an a, which they append to the end of the file instead of overriding it, 

1006
01:02:18,070 --> 01:02:22,730
or we can specify r, which means, let's open the file as read-only. 

1007
01:02:22,730 --> 01:02:24,890
So if the program tries to make any changes to the file, 

1008
01:02:24,890 --> 01:02:30,140
yell at them and don't let them do it. 

1009
01:02:30,140 --> 01:02:33,320
Finally, once we're done with the file, done doing operations on it, 

1010
01:02:33,320 --> 01:02:35,860
we need to make sure we close the file. 

1011
01:02:35,860 --> 01:02:38,830
And so at the end of your program, you are going to pass them again 

1012
01:02:38,830 --> 01:02:42,120
this file that you opened, and just close it. 

1013
01:02:42,120 --> 01:02:44,650
So this is something important that you have to make sure you do. 

1014
01:02:44,650 --> 01:02:47,180
So remember you can open a file, then you can write to the file, 

1015
01:02:47,180 --> 01:02:51,270
do operations in the file, but then you have to close the file at the end. 

1016
01:02:51,270 --> 01:02:53,270
>> Any questions on basic file I/O? Yes?

1017
01:02:53,270 --> 01:02:58,050
[Student question, unintelligible] 

1018
01:02:58,050 --> 01:03:02,480
Right here. The question is, where does this log.txt file appear? 

1019
01:03:02,480 --> 01:03:07,890
Well, if you just give it log.txt, it creates it in the same directory as the executable. 

1020
01:03:07,890 --> 01:03:10,500
So if you're-- >>[Student question, unintelligible]

1021
01:03:10,500 --> 01:03:18,830
Yes. In the same folder, or in the same directory, as you call it. 

1022
01:03:18,830 --> 01:03:21,400
Now memory, stack, and heap. 

1023
01:03:21,400 --> 01:03:23,400
So how is memory laid out in the computer? 

1024
01:03:23,400 --> 01:03:26,270
Well, you can imagine memory as sort of this block here.

1025
01:03:26,270 --> 01:03:30,260
And in memory we have what's called the heap stuck over there, and the stack that's down there. 

1026
01:03:30,260 --> 01:03:34,480
And the heap grows downward and the stack grows upward. 

1027
01:03:34,480 --> 01:03:38,620
So as Tommy mentioned--oh, well, and we have these other 4 segments which I'll get to in a second--

1028
01:03:38,620 --> 01:03:42,890
As Tommy said earlier, you know how his functions call themselves and call each other? 

1029
01:03:42,890 --> 01:03:44,930
They build up this sort of stack frame.

1030
01:03:44,930 --> 01:03:47,360
Well, if main calls foo, foo gets put on the stack. 

1031
01:03:47,360 --> 01:03:52,430
Foo calls bar, bar get's put on the stack, and that gets put on the stack after. 

1032
01:03:52,430 --> 01:03:57,040
And as they return, they each get taken off the stack. 

1033
01:03:57,040 --> 01:04:00,140
What do each of these locations and memory hold? 

1034
01:04:00,140 --> 01:04:03,110
Well, the top, which is the text segment, contains the program itself. 

1035
01:04:03,110 --> 01:04:06,390
So the machine code, that's there, once you compile your program. 

1036
01:04:06,390 --> 01:04:08,520
Next, any initialized global variables. 

1037
01:04:08,520 --> 01:04:12,660
>> So you have global variables in your program, and you say like, a = 5, 

1038
01:04:12,660 --> 01:04:15,260
that gets put in that segment, and right under that, 

1039
01:04:15,260 --> 01:04:18,990
you have any uninitialized global data, which is just int a, 

1040
01:04:18,990 --> 01:04:20,990
but you don't say it's equal to anything. 

1041
01:04:20,990 --> 01:04:23,870
Realize these are global variables, so they're outside of main. 

1042
01:04:23,870 --> 01:04:28,560
So this means any global variables that are declared but are not initialized. 

1043
01:04:28,560 --> 01:04:32,310
So what's in the heap? Memory allocated using malloc, which we'll get to in a little bit.

1044
01:04:32,310 --> 01:04:35,990
And finally, with the stack you have any local variables

1045
01:04:35,990 --> 01:04:39,950
and any functions you might call in any of their parameters. 

1046
01:04:39,950 --> 01:04:43,720
The last thing, you don't really have to know what the environment variables do, 

1047
01:04:43,720 --> 01:04:46,700
but whenever you run program, there is something associated, like

1048
01:04:46,700 --> 01:04:49,550
this is the username of the person who ran the program. 

1049
01:04:49,550 --> 01:04:51,550
And that's going to be sort of at the bottom. 

1050
01:04:51,550 --> 01:04:54,540
In terms of memory addresses, which are hexadecimal values, 

1051
01:04:54,540 --> 01:04:58,170
the values at the top start at 0, and they go all the way down to the bottom. 

1052
01:04:58,170 --> 01:05:00,440
In this case, if you're on the 32-bit system, 

1053
01:05:00,440 --> 01:05:05,390
the address at the bottom is going to be 0x, followed by af, because that's 32 bits, 

1054
01:05:05,390 --> 01:05:10,890
which is 8 bytes, and in this case 8 bytes corresponds to 8 hexadecimal digits. 

1055
01:05:10,890 --> 01:05:20,110
So down here you're going to have, like, 0xffffff, and up there you're going to have 0.

1056
01:05:20,110 --> 01:05:23,660
So what are pointers? Some of you may not have covered this in section before. 

1057
01:05:23,660 --> 01:05:26,660
but we did go over it in lecture, so a pointer is just a data type 

1058
01:05:26,660 --> 01:05:34,030
which stores, instead of some sort of value like 50, it stores the address of some location in memory. 

1059
01:05:34,030 --> 01:05:36,020
Like that memory [unintelligible]. 

1060
01:05:36,020 --> 01:05:41,120
So in this case, what we have is, we have a pointer to an integer or an int *, 

1061
01:05:41,120 --> 01:05:46,210
and it contains this hexadecimal address of 0xDEADBEEF. 

1062
01:05:46,210 --> 01:05:50,880
>> So what we have is, now, this pointer points at some location in memory, 

1063
01:05:50,880 --> 01:05:56,020
and that's just a, the value 50 is at this memory location. 

1064
01:05:56,020 --> 01:06:01,810
On some 32-bit systems, on all 32-bit systems, pointers take up 32 bits or 4 bytes. 

1065
01:06:01,810 --> 01:06:06,020
But, for example, on a 64-bit system, pointers are 64 bits. 

1066
01:06:06,020 --> 01:06:08,040
So that's something you'll want to keep in mind. 

1067
01:06:08,040 --> 01:06:12,310
So on an end-bit system, a pointer is end bits long. 

1068
01:06:12,310 --> 01:06:17,320
Pointers are sort of hard to digest without extra things, 

1069
01:06:17,320 --> 01:06:20,300
so let's go through an example of dynamic memory allocation. 

1070
01:06:20,300 --> 01:06:25,130
What dynamic memory allocation does for you, or what we call malloc, 

1071
01:06:25,130 --> 01:06:29,280
it lets you allocate some sort of data outside of the set. 

1072
01:06:29,280 --> 01:06:31,830
So this data is sort of more permanent for the duration of the program. 

1073
01:06:31,830 --> 01:06:36,430
Because as you know, if you declare x inside of a function, and that function returns, 

1074
01:06:36,430 --> 01:06:40,910
you no longer have access to the data that was stored in x.

1075
01:06:40,910 --> 01:06:44,420
What pointers let us do is they let us store memory or store values 

1076
01:06:44,420 --> 01:06:46,840
in a different segment of memory, namely the heap. 

1077
01:06:46,840 --> 01:06:49,340
Now once we return out of function, as long as we have a pointer 

1078
01:06:49,340 --> 01:06:54,960
to that location in memory, then what we can do is we can just look at the values there. 

1079
01:06:54,960 --> 01:06:58,020
Let's look at an example: This is our memory layout again. 

1080
01:06:58,020 --> 01:07:00,050
And we have this function, main. 

1081
01:07:00,050 --> 01:07:06,870
What it does is--okay, so simple, right?--int x = 5, that's just a variable on the stack in main. 

1082
01:07:06,870 --> 01:07:12,450
>> On the other hand, now we declare a pointer which calls the function giveMeThreeInts.

1083
01:07:12,450 --> 01:07:16,800
And so now we go into this function and we create a new stack frame for it. 

1084
01:07:16,800 --> 01:07:20,440
However, in this stack frame, we declare int * temp,

1085
01:07:20,440 --> 01:07:23,210
which in mallocs 3 integers for us.

1086
01:07:23,210 --> 01:07:25,880
So size of int will give us how many bytes this int is, 

1087
01:07:25,880 --> 01:07:29,620
and malloc gives us that many bytes of space on the heap. 

1088
01:07:29,620 --> 01:07:32,890
So in this case, we have created enough space for 3 integers, 

1089
01:07:32,890 --> 01:07:36,830
and the heap is way up there, which is why I've drawn it higher up. 

1090
01:07:36,830 --> 01:07:42,900
Once we're done, we come back up here, you only need 3 ints returned, 

1091
01:07:42,900 --> 01:07:47,000
and it returns the address, in this case over where that memory is. 

1092
01:07:47,000 --> 01:07:51,250
And we set pointer = switch, and up there we have just another pointer. 

1093
01:07:51,250 --> 01:07:54,550
But what that function returns is stacked here and disappears. 

1094
01:07:54,550 --> 01:07:59,250
So temp disappears, but we still maintain the address of where 

1095
01:07:59,250 --> 01:08:01,850
those 3 integers are inside of mains. 

1096
01:08:01,850 --> 01:08:06,180
So in this set, the pointers are scoped locally for the stacked frame, 

1097
01:08:06,180 --> 01:08:09,860
but the memory to which they refer is in the heap. 

1098
01:08:09,860 --> 01:08:12,190
>> Does that make sense? 

1099
01:08:12,190 --> 01:08:14,960
[Student] Could you repeat that?                          >>[Joseph] Yes. 

1100
01:08:14,960 --> 01:08:20,270
So if I go back just a little bit, you see that temp allocated 

1101
01:08:20,270 --> 01:08:23,500
some memory on the heap up there. 

1102
01:08:23,500 --> 01:08:28,680
So when this function, giveMeThreeInts returns, this stack here is going to disappear. 

1103
01:08:28,680 --> 01:08:35,819
And with it any of the variables, in this case, this pointer that was allocated in stacked frame. 

1104
01:08:35,819 --> 01:08:39,649
That is going to disappear, but since we returned temp 

1105
01:08:39,649 --> 01:08:46,330
and we set pointer = temp, pointer's now going to point the same memory of location as temp was. 

1106
01:08:46,330 --> 01:08:50,370
So now, even though we lose temp, that local pointer, 

1107
01:08:50,370 --> 01:08:59,109
we still retain the memory address of what it was pointing to inside of that variable pointer. 

1108
01:08:59,109 --> 01:09:03,740
Questions? That can be kind of a confusing topic if you haven't gone over it in section. 

1109
01:09:03,740 --> 01:09:09,240
We can, your TF will definitely go over it and of course we can answer questions 

1110
01:09:09,240 --> 01:09:11,500
at the end of the review session for this. 

1111
01:09:11,500 --> 01:09:14,220
But this is sort of a complex topic, and I have more examples that are going to show up

1112
01:09:14,220 --> 01:09:18,790
that will help clarify what pointers actually are. 

1113
01:09:18,790 --> 01:09:22,500
>> In this case, pointers are equivalent to arrays,

1114
01:09:22,500 --> 01:09:25,229
so I can just use this pointer as the same thing as an int array. 

1115
01:09:25,229 --> 01:09:29,840
So I'm indexing into 0, and changing the first integer to 1, 

1116
01:09:29,840 --> 01:09:39,689
changing the second integer to 2, and the 3rd integer to 3. 

1117
01:09:39,689 --> 01:09:44,210
So more on pointers. Well, recall Binky. 

1118
01:09:44,210 --> 01:09:48,319
In this case we've allocated a pointer, or we declared a pointer, 

1119
01:09:48,319 --> 01:09:52,760
but initially, when I just declared a pointer, it's not pointing to anywhere in memory. 

1120
01:09:52,760 --> 01:09:54,930
It's just garbage values inside of it. 

1121
01:09:54,930 --> 01:09:56,470
So I have no idea where this pointer is pointing to. 

1122
01:09:56,470 --> 01:10:01,630
It has an address which is just filled with 0's and 1's where it was initially declared. 

1123
01:10:01,630 --> 01:10:04,810
I can't do anything with this until I call malloc on it

1124
01:10:04,810 --> 01:10:08,390
and then it gives me a little space on the heap where I can put values inside. 

1125
01:10:08,390 --> 01:10:11,980
Then again, I don't know what's inside of this memory.

1126
01:10:11,980 --> 01:10:16,780
So the first thing I have to do is check whether the system had enough memory 

1127
01:10:16,780 --> 01:10:20,850
to give me back 1 integer in the first place, which is why I'm doing this check. 

1128
01:10:20,850 --> 01:10:25,020
If pointer is null, that means that it didn't have enough space or some other error occurred, 

1129
01:10:25,020 --> 01:10:26,320
so I should exit out of my program.

1130
01:10:26,320 --> 01:10:29,400
 But if it did succeed, now I can use that pointer

1131
01:10:29,400 --> 01:10:35,020
and what * pointer does is it follows where the address is

1132
01:10:35,020 --> 01:10:38,480
to where that value is, and it sets it equal to 1. 

1133
01:10:38,480 --> 01:10:41,850
So over here, we're checking if that memory existed. 

1134
01:10:41,850 --> 01:10:45,380
>> Once you know it exists, you can put into it

1135
01:10:45,380 --> 01:10:50,460
what value you want to put into it; in this case 1. 

1136
01:10:50,460 --> 01:10:53,060
Once we're done with it, you need to free that pointer

1137
01:10:53,060 --> 01:10:57,160
because we need to get back to the system that memory that you asked for in the first place. 

1138
01:10:57,160 --> 01:10:59,690
Because the computer doesn't know when we're done with it. 

1139
01:10:59,690 --> 01:11:02,510
In this case we're explicitly telling it, okay, we're done with that memory. 

1140
01:11:02,510 --> 01:11:10,780
If some other process needs it, some other program needs it, feel free to go ahead and take it. 

1141
01:11:10,780 --> 01:11:15,110
What we can also do is we can just get the address of local variables on the set. 

1142
01:11:15,110 --> 01:11:19,080
So int x is inside the stacked frame of main. 

1143
01:11:19,080 --> 01:11:23,060
And when we use this ampersand, this and operator, what it does is 

1144
01:11:23,060 --> 01:11:27,310
it takes x, and x is just some data in memory, but it has an address. 

1145
01:11:27,310 --> 01:11:33,790
It's located somewhere. So by calling & x, what this does is it gives us the address of x. 

1146
01:11:33,790 --> 01:11:38,430
By doing this, we're making pointer point to where x is in memory. 

1147
01:11:38,430 --> 01:11:41,710
Now we just do something like *x, we're going to get 5 back. 

1148
01:11:41,710 --> 01:11:43,820
The star is called dereferencing it. 

1149
01:11:43,820 --> 01:11:46,640
You follow the address and you get the value of it stored there. 

1150
01:11:51,000 --> 01:11:53,310
>> Any questions? Yes? 

1151
01:11:53,310 --> 01:11:56,500
[Student] If you don't do the 3-pointed thing, does it still compile? 

1152
01:11:56,500 --> 01:11:59,490
Yes. If you don't do the 3-pointer thing, it's still going to compile,

1153
01:11:59,490 --> 01:12:02,720
but I'll show you what happens in a second, and without doing that, 

1154
01:12:02,720 --> 01:12:04,860
that's what we call a memory leak. You're not giving the system 

1155
01:12:04,860 --> 01:12:07,850
back its memory, so after a while the program is going to accumulate 

1156
01:12:07,850 --> 01:12:10,940
memory that it's not using, and nothing else can use it. 

1157
01:12:10,940 --> 01:12:15,750
If you've ever seen Firefox with 1.5 million kilobytes on your computer, 

1158
01:12:15,750 --> 01:12:17,840
in the task manager, that's what's going on. 

1159
01:12:17,840 --> 01:12:20,760
You have a memory leak in the program that they're not handling. 

1160
01:12:23,080 --> 01:12:26,240
So how does pointer arithmetic work? 

1161
01:12:26,240 --> 01:12:29,480
Well, pointer arithmetic is sort of like indexing into an array. 

1162
01:12:29,480 --> 01:12:36,370
In this case, I have a pointer, and what I do is I make pointer point to the first element 

1163
01:12:36,370 --> 01:12:42,100
of this array of 3 integers that I've allocated. 

1164
01:12:42,100 --> 01:12:46,670
So now what I do, star pointer just changes the first element in the list. 

1165
01:12:46,670 --> 01:12:49,140
Star pointer +1 points over here. 

1166
01:12:49,140 --> 01:12:53,140
So pointer is over here, pointer +1 is over here, pointer +2 is over here. 

1167
01:12:53,140 --> 01:12:56,610
>> So just adding 1 is the same thing as moving along this array. 

1168
01:12:56,610 --> 01:12:59,880
What we do is, when we do pointer +1 you get the address over here, 

1169
01:12:59,880 --> 01:13:04,180
and in order to get the value in here, you put a star in from the entire expression

1170
01:13:04,180 --> 01:13:05,990
to dereference it.

1171
01:13:05,990 --> 01:13:09,940
So, in this case, I'm setting the first location in this array to 1,

1172
01:13:09,940 --> 01:13:13,970
second location to 2, and third location to 3.

1173
01:13:13,970 --> 01:13:18,180
Then what I'm doing over here is I'm printing our pointer +1,

1174
01:13:18,180 --> 01:13:19,970
which just gives me 2.

1175
01:13:19,970 --> 01:13:23,650
Now I'm incrementing pointer, so pointer equals pointer +1,

1176
01:13:23,650 --> 01:13:26,780
which moves it forward.

1177
01:13:26,780 --> 01:13:30,810
And so now if I print out pointer +1, pointer +1 is now 3,

1178
01:13:30,810 --> 01:13:33,990
which in this case prints out 3.

1179
01:13:33,990 --> 01:13:36,560
And in order to free something, the pointer that I give it

1180
01:13:36,560 --> 01:13:40,540
must be pointing at the beginning of the array which I got back from malloc.

1181
01:13:40,540 --> 01:13:43,430
So, in this case, if I were to call 3 right here, this wouldn't be right,

1182
01:13:43,430 --> 01:13:45,070
because it's in the middle of the array.

1183
01:13:45,070 --> 01:13:48,820
I have to subtract to get to the original location

1184
01:13:48,820 --> 01:13:50,420
the initial first spot before I can free it.

1185
01:13:56,300 --> 01:13:58,450
So, here's a more involved example.

1186
01:13:58,450 --> 01:14:03,360
In this case, we're allocating 7 characters in a character array.

1187
01:14:03,360 --> 01:14:06,480
>> And in this case what we're doing is we're looping over the first 6 of them,

1188
01:14:06,480 --> 01:14:09,900
and we're setting them to Z.

1189
01:14:09,900 --> 01:14:13,350
So, for int i = 0, i > 6, i ++, 

1190
01:14:13,350 --> 01:14:16,220
So, pointer +i will just give us, in this case, 

1191
01:14:16,220 --> 01:14:20,860
pointer, pointer +1, pointer +2, pointer +3, and so on and so forth in the loop.

1192
01:14:20,860 --> 01:14:24,040
What it's going to do is it gets that address, dereferences it to get the value,

1193
01:14:24,040 --> 01:14:27,440
and changes that value to a Z.

1194
01:14:27,440 --> 01:14:30,350
Then at the end remember this is a string, right?

1195
01:14:30,350 --> 01:14:33,560
All strings have to end with the null terminating character.

1196
01:14:33,560 --> 01:14:38,620
So, what I do is in pointer 6 I put the null terminator character in.

1197
01:14:38,620 --> 01:14:43,980
And now what I'm basically doing over here is implementing printf for a string, right?

1198
01:14:43,980 --> 01:14:46,190
>> So, when does printf now when it's reached the end of a string?

1199
01:14:46,190 --> 01:14:48,230
When it hits the null terminating character.

1200
01:14:48,230 --> 01:14:52,030
So, in this case, my original pointer points to the beginning of this array.

1201
01:14:52,030 --> 01:14:56,410
I print the first character out. I move it over one.

1202
01:14:56,410 --> 01:14:58,420
I print that character out. I move it over.

1203
01:14:58,420 --> 01:15:02,180
And I keep doing this until I reach the end.

1204
01:15:02,180 --> 01:15:07,750
And now the end *pointer will dereference this and get the null terminating character back.

1205
01:15:07,750 --> 01:15:11,780
And so my while loop runs only when that value is not the null terminating character.

1206
01:15:11,780 --> 01:15:13,770
So, now I exit out of this loop.

1207
01:15:18,780 --> 01:15:21,180
And so if I subtract 6 from this pointer,

1208
01:15:21,180 --> 01:15:22,860
I go back all the way to the beginning.

1209
01:15:22,860 --> 01:15:27,880
Remember, I'm doing this because I have to go to the beginning in order to free it.

1210
01:15:27,880 --> 01:15:30,270
>> So, I know that was a lot. Are there any questions?

1211
01:15:30,270 --> 01:15:31,870
Please, yes?

1212
01:15:31,870 --> 01:15:36,610
[Student question unintelligible]

1213
01:15:36,610 --> 01:15:38,190
Can you say that louder? Sorry.

1214
01:15:38,190 --> 01:15:44,140
[Student] On the last slide right before you freed the pointer,

1215
01:15:44,140 --> 01:15:47,300
where were you actually changing the value of the pointer?

1216
01:15:47,300 --> 01:15:50,370
[Joseph] So, right here. >>[Student] Oh, okay.

1217
01:15:50,370 --> 01:15:51,890
[Joseph] So, I have a pointer minus minus, right,

1218
01:15:51,890 --> 01:15:54,140
which moves the thing back one, and then I free it,

1219
01:15:54,140 --> 01:15:57,000
because this pointer has to be pointed to the beginning of the array.

1220
01:15:57,000 --> 01:16:00,420
[Student] But that wouldn't be needed had you stopped after that line.

1221
01:16:00,420 --> 01:16:03,130
[Joseph] So, if I had stopped after this, this would be considered a memory leak,

1222
01:16:03,130 --> 01:16:04,810
because I didn't run the free.

1223
01:16:04,810 --> 01:16:11,290
[Student] I [unintelligible] after the first three lines where you had pointer +1 [unintelligible].

1224
01:16:11,290 --> 01:16:13,140
[Joseph] Uh-huh. So, what's the question there?

1225
01:16:13,140 --> 01:16:14,780
Sorry. No, no. Go, go, please.

1226
01:16:14,780 --> 01:16:16,870
[Student] So, you're not changing the value of pointers.

1227
01:16:16,870 --> 01:16:19,130
You wouldn't have had to do pointer minus minus.

1228
01:16:19,130 --> 01:16:19,730
[Joseph] Yes, exactly.

1229
01:16:19,730 --> 01:16:21,890
So, when I do pointer +1 and pointer +2, 

1230
01:16:21,890 --> 01:16:24,410
I'm not doing pointer equals pointer +1.

1231
01:16:24,410 --> 01:16:27,260
So, the pointer just stays pointing at the beginning of the array.

1232
01:16:27,260 --> 01:16:31,460
It's only when I do plus plus that it sets the value back inside the pointer,

1233
01:16:31,460 --> 01:16:33,550
that it actually moves this along.

1234
01:16:36,860 --> 01:16:37,780
All right.

1235
01:16:40,550 --> 01:16:42,030
More questions?

1236
01:16:44,680 --> 01:16:47,790
>> Again, if this is sort of overwhelming, this will be covered in session.

1237
01:16:47,790 --> 01:16:50,710
Ask your teaching fellow about it, and we can answer questions at the end.

1238
01:16:53,510 --> 01:16:56,600
And usually we don't like to do this minus thing.

1239
01:16:56,600 --> 01:16:59,760
This has to require me keeping track of how much I've offset in the array.

1240
01:16:59,760 --> 01:17:04,520
So, in general, this is just to explain how pointer arithmetic works.

1241
01:17:04,520 --> 01:17:07,970
But what we usually like to do is we like to create a copy of the pointer,

1242
01:17:07,970 --> 01:17:11,640
and then we'll use that copy when we're moving around in the string.

1243
01:17:11,640 --> 01:17:14,660
So, in these case you use the copy to print the entire string,

1244
01:17:14,660 --> 01:17:19,040
but we don't have to do like pointer minus 6 or keep track of how much we moved in this,

1245
01:17:19,040 --> 01:17:22,700
just because we know that our original point is still pointed to the beginning of the list

1246
01:17:22,700 --> 01:17:25,340
and all that we altered was this copy.

1247
01:17:25,340 --> 01:17:28,250
So, in general, alter copies of your original pointer.

1248
01:17:28,250 --> 01:17:32,350
Don't try to sort of like--don't alter original copies.

1249
01:17:32,350 --> 01:17:35,290
Trying to alter only copies of your original.

1250
01:17:41,540 --> 01:17:44,870
So, you notice when we pass the string into printf

1251
01:17:44,870 --> 01:17:48,990
you don't have to put a star in front of it like we did with all the other dereferences, right?

1252
01:17:48,990 --> 01:17:54,180
So, if you print out the entire string %s expects is an address,

1253
01:17:54,180 --> 01:17:57,610
and in this case a pointer or in this case like an array of characters.

1254
01:17:57,610 --> 01:18:00,330
>> Characters, char*s, and arrays are the same thing.

1255
01:18:00,330 --> 01:18:03,690
Pointer is to characters, and character arrays are the same thing.

1256
01:18:03,690 --> 01:18:05,720
And so, all we have to do is pass in pointer.

1257
01:18:05,720 --> 01:18:08,150
We don't have to pass in like *pointer or anything like that.

1258
01:18:13,110 --> 01:18:14,930
So, arrays and pointers are the same thing.

1259
01:18:14,930 --> 01:18:19,160
When you're doing something like x [ y ] over here for an array,

1260
01:18:19,160 --> 01:18:21,960
what it's doing under the hood is it's saying, okay, it's a character array,

1261
01:18:21,960 --> 01:18:23,690
so it's a pointer.

1262
01:18:23,690 --> 01:18:26,510
And so x are the same thing,

1263
01:18:26,510 --> 01:18:28,650
and so what it does is it adds y to x, 

1264
01:18:28,650 --> 01:18:31,820
which is the same thing as moving forward in memory that much.

1265
01:18:31,820 --> 01:18:34,930
And now x + y gives us some sort of address,

1266
01:18:34,930 --> 01:18:37,570
and we dereference the address or follow the arrow

1267
01:18:37,570 --> 01:18:41,640
to where that location in memory is and we get the value out of that location in memory.

1268
01:18:41,640 --> 01:18:43,720
So, so these two are exactly the same thing.

1269
01:18:43,720 --> 01:18:45,840
It's just a syntactic sugar.

1270
01:18:45,840 --> 01:18:48,090
They do the same thing. They're just different syntactics for each other.

1271
01:18:51,500 --> 01:18:57,590
>> So, what can go wrong with pointers? Like, a lot. Okay. So, bad things.

1272
01:18:57,590 --> 01:19:02,410
Some bad things you can do are not checking if your malloc call returns null, right?

1273
01:19:02,410 --> 01:19:06,560
In this case, I'm asking the system to give me--what is that number?

1274
01:19:06,560 --> 01:19:11,200
Like 2 billion times 4, because the size of an integer is 4 bytes.

1275
01:19:11,200 --> 01:19:13,810
I'm asking it for like 8 billion bytes.

1276
01:19:13,810 --> 01:19:17,270
Of course my computer is not going to be able to give me that much memory back.

1277
01:19:17,270 --> 01:19:20,960
And we didn't check if this is null, so when we try to dereference it over there--

1278
01:19:20,960 --> 01:19:24,270
follow the arrow to where it's going to--we don't have that memory.

1279
01:19:24,270 --> 01:19:27,150
This is what we call dereferencing a null pointer.

1280
01:19:27,150 --> 01:19:29,710
And this essentially causes you to segfault. 

1281
01:19:29,710 --> 01:19:31,790
This is one of the ways you can segfault.

1282
01:19:34,090 --> 01:19:38,090
Other bad things you can do--oh well. 

1283
01:19:38,090 --> 01:19:40,650
That was dereferencing a null pointer. Okay.

1284
01:19:40,650 --> 01:19:45,160
Other bad things--well, to fix that you just put a check in there

1285
01:19:45,160 --> 01:19:46,980
that checks whether the pointer is null

1286
01:19:46,980 --> 01:19:51,000
and exit out of the program if it happens that malloc returns a null pointer.

1287
01:19:55,110 --> 01:19:59,850
That's the xkcd comic. People understand it now. Sort of.

1288
01:20:06,120 --> 01:20:09,350
>> So, memory. And I went over this.

1289
01:20:09,350 --> 01:20:12,000
We're calling malloc in a loop, but every time we call malloc

1290
01:20:12,000 --> 01:20:14,370
we're losing track of where this pointer is pointing to,

1291
01:20:14,370 --> 01:20:15,750
because we're clobbering it.

1292
01:20:15,750 --> 01:20:18,410
So, the initial call to malloc gives me memory over here.

1293
01:20:18,410 --> 01:20:19,990
My pointer pointers to this.

1294
01:20:19,990 --> 01:20:23,020
Now, I don't free it, so now I call malloc again.

1295
01:20:23,020 --> 01:20:26,070
Now it points over here. Now my memory is pointing over here.

1296
01:20:26,070 --> 01:20:27,640
Pointing over here. Pointing over here.

1297
01:20:27,640 --> 01:20:31,820
But I've lost track of the addresses of all the memory over here that I allocated.

1298
01:20:31,820 --> 01:20:35,100
And so now I don't have any reference to them anymore.

1299
01:20:35,100 --> 01:20:37,230
So, I can't free them outside of this loop.

1300
01:20:37,230 --> 01:20:39,390
And so in order to fix something like this,

1301
01:20:39,390 --> 01:20:42,250
if you forget to free memory and you get this memory leak,

1302
01:20:42,250 --> 01:20:45,810
You have to free the memory inside of this loop once you're done with it.

1303
01:20:45,810 --> 01:20:51,400
Well, this is what happens. I know lots of you hate this.

1304
01:20:51,400 --> 01:20:55,270
But now--yay! You get like 44,000 kilobytes.

1305
01:20:55,270 --> 01:20:57,110
So, you free it at the end of the loop,

1306
01:20:57,110 --> 01:20:59,770
and that's going to just free the memory every time.

1307
01:20:59,770 --> 01:21:03,620
Essentially, your program doesn't have a memory leak anymore.

1308
01:21:03,620 --> 01:21:08,150
>> And now something else you can do is free some memory that you've asked for twice.

1309
01:21:08,150 --> 01:21:11,060
In this case, you malloc something, you change its value.

1310
01:21:11,060 --> 01:21:13,140
You free it once because you said you were done with it.

1311
01:21:13,140 --> 01:21:14,940
But then we freed it again.

1312
01:21:14,940 --> 01:21:16,730
This is something that's pretty bad.

1313
01:21:16,730 --> 01:21:18,820
It's not going to initially segfault,

1314
01:21:18,820 --> 01:21:23,350
but after a while what this does is double freeing this corrupts your heap structure,

1315
01:21:23,350 --> 01:21:27,200
and you'll learn a little bit more about this if you choose to take a class like CS61.

1316
01:21:27,200 --> 01:21:30,000
But essentially after a while your computer is going to get confused

1317
01:21:30,000 --> 01:21:33,010
about what memory locations are where and where it's stored--

1318
01:21:33,010 --> 01:21:34,800
where data is stored in memory.

1319
01:21:34,800 --> 01:21:38,080
And so freeing a pointer twice is a bad thing that you don't want to do.

1320
01:21:38,080 --> 01:21:41,600
>> Other things that can go wrong is not using sizeof.

1321
01:21:41,600 --> 01:21:44,460
So, in this case you malloc 8 bytes,

1322
01:21:44,460 --> 01:21:46,700
and that's the same thing as two integers, right?

1323
01:21:46,700 --> 01:21:49,580
So, that's perfectly safe, but is it?

1324
01:21:49,580 --> 01:21:52,160
Well, as Lucas talked about on different architectures,

1325
01:21:52,160 --> 01:21:54,220
integers are of different lengths.

1326
01:21:54,220 --> 01:21:57,970
So, on the appliance that you're using, integers are about 4 bytes,

1327
01:21:57,970 --> 01:22:02,370
but on some other system they might be 8 bytes or they might be 16 bytes.

1328
01:22:02,370 --> 01:22:05,680
So, if I just use this number over here,

1329
01:22:05,680 --> 01:22:07,310
this program might work on the appliance, 

1330
01:22:07,310 --> 01:22:10,360
but it's not going to allocate enough memory on some other system.

1331
01:22:10,360 --> 01:22:14,020
In this case, this is what the sizeof operator is used for.

1332
01:22:14,020 --> 01:22:16,880
When we call sizeof(int), what this does is

1333
01:22:16,880 --> 01:22:21,910
 it gives us the size of an integer on the system that the program is running.

1334
01:22:21,910 --> 01:22:25,490
So, in this case, sizeof(int) will return 4 on something like the appliance,

1335
01:22:25,490 --> 01:22:29,980
and now this will 4 * 2, which is 8,

1336
01:22:29,980 --> 01:22:32,330
which is just the amount of space necessary for two integers.

1337
01:22:32,330 --> 01:22:36,710
On a different system, if an int is like 16 bytes or 8 bytes,

1338
01:22:36,710 --> 01:22:39,380
it's just going to return enough bytes to store that amount.

1339
01:22:41,830 --> 01:22:45,310
>> And finally, structs.

1340
01:22:45,310 --> 01:22:48,340
So, if you wanted to store a sudoku board in memory, how might we do this?

1341
01:22:48,340 --> 01:22:51,570
You might think of like a variable for the first thing,

1342
01:22:51,570 --> 01:22:53,820
a variable for the second thing, a variable for the third thing,

1343
01:22:53,820 --> 01:22:56,420
a variable for the fourth thing--bad, right?

1344
01:22:56,420 --> 01:23:00,750
So, one improvement you can make on top of this is to make a 9 x 9 array.

1345
01:23:00,750 --> 01:23:04,480
That's fine, but what if you wanted to associate other things with the sudoku board

1346
01:23:04,480 --> 01:23:06,490
like what the difficulty of the board is,

1347
01:23:06,490 --> 01:23:11,740
or, for example, what your score is, or how much time it's taken you to solve this board?

1348
01:23:11,740 --> 01:23:14,970
Well, what you can do is you can create a struct.

1349
01:23:14,970 --> 01:23:18,910
What I'm basically saying is I'm defining this structure over here,

1350
01:23:18,910 --> 01:23:23,230
and I'm defining a sudoku board which consists of a board that is 9 x 9.

1351
01:23:23,230 --> 01:23:26,650
>> And what it has it has pointers to the name of the level.

1352
01:23:26,650 --> 01:23:30,730
It also has x and y, which are the coordinates of where I am right now.

1353
01:23:30,730 --> 01:23:35,980
It also has time spent [unintelligible], and it has the total number of moves I've inputted so far.

1354
01:23:35,980 --> 01:23:40,010
And so in this case, I can group a whole bunch of data into just one structure

1355
01:23:40,010 --> 01:23:42,790
instead of having it like flying around in like different variables 

1356
01:23:42,790 --> 01:23:44,540
that I can't really keep track of.

1357
01:23:44,540 --> 01:23:49,720
And this lets us have just nice syntax for sort of referencing different things inside of this struct.

1358
01:23:49,720 --> 01:23:53,430
I can just do board.board, and I get the sudoku board back.

1359
01:23:53,430 --> 01:23:56,320
Board.level, I get how tough it is.

1360
01:23:56,320 --> 01:24:00,540
Board.x and board.y give me the coordinates of where I might be in the board.

1361
01:24:00,540 --> 01:24:04,730
And so I'm accessing what we call fields in the struct.

1362
01:24:04,730 --> 01:24:08,840
This defines sudokuBoard, which is a  type that I have.

1363
01:24:08,840 --> 01:24:14,800
And now we're here. I have a variable called "board" of type sudokuBoard.

1364
01:24:14,800 --> 01:24:18,820
And so now I can access all the fields that make up this structure over here.

1365
01:24:20,830 --> 01:24:22,450
>> Any questions about structs? Yes?

1366
01:24:22,450 --> 01:24:25,890
[Student] For int x, y, you declared both on one line? >>[Joseph] Uh-huh.

1367
01:24:25,890 --> 01:24:27,400
[Student] So, could you just do that with all of them?

1368
01:24:27,400 --> 01:24:31,200
Like in x, y comma times that total?

1369
01:24:31,200 --> 01:24:34,460
[Joseph] Yes, you could definitely do that, but the reason I put x and y on the same line--

1370
01:24:34,460 --> 01:24:36,330
and the question is why can we just do this on the same line?

1371
01:24:36,330 --> 01:24:38,600
Why don't we just put all of these on the same line is

1372
01:24:38,600 --> 01:24:42,090
x and y are related to each other,

1373
01:24:42,090 --> 01:24:44,780
and this is just stylistically more correct, in a sense,

1374
01:24:44,780 --> 01:24:46,600
because it's grouping two things on the same line

1375
01:24:46,600 --> 01:24:49,340
that like sort of relate to the same thing.

1376
01:24:49,340 --> 01:24:51,440
And I just split these apart. It's just a style thing.

1377
01:24:51,440 --> 01:24:53,720
It functionally makes no difference whatsoever.

1378
01:24:58,150 --> 01:24:59,270
Any other questions on structs?

1379
01:25:03,030 --> 01:25:06,620
You can define a Pokédex with a struct.

1380
01:25:06,620 --> 01:25:11,720
A Pokémon has a number and it has a letter, an owner, a type.

1381
01:25:11,720 --> 01:25:16,990
And then if you have an array of Pokémon , you can make up a Pokédex, right?

1382
01:25:16,990 --> 01:25:20,810
Okay, cool. So, questions on structs. Those are related to structs.

1383
01:25:20,810 --> 01:25:25,270
>> Finally, GDB. What does GDB let you do? It lets you debug your program.

1384
01:25:25,270 --> 01:25:27,650
And if you haven't used GDB, I would recommended watching the short

1385
01:25:27,650 --> 01:25:31,250
and just going over what GDB is, how you work with it, how you might use it, 

1386
01:25:31,250 --> 01:25:32,900
and test it on a program.

1387
01:25:32,900 --> 01:25:37,400
And so what GDB lets you do is it lets pause the [unintelligible] up your program

1388
01:25:37,400 --> 01:25:38,920
and a practical line.

1389
01:25:38,920 --> 01:25:42,600
For example, I want to pause execution at like line 3 of my program,

1390
01:25:42,600 --> 01:25:46,010
and while I'm at line 3 I can print out all the values that are there.

1391
01:25:46,010 --> 01:25:49,710
And so what we call like pausing in a line 

1392
01:25:49,710 --> 01:25:52,350
is we call this putting a breakpoint at that line

1393
01:25:52,350 --> 01:25:55,920
and then we can print out the variables at the state of the program at that time.

1394
01:25:55,920 --> 01:25:58,990
>> We can then from there step through the program line-by-line.

1395
01:25:58,990 --> 01:26:03,200
And then we can look at the state of the stack at the time.

1396
01:26:03,200 --> 01:26:08,600
And so in order to use GDB, what we do is we call clang on the C file,

1397
01:26:08,600 --> 01:26:11,290
but we have to pass it the -ggdb flag.

1398
01:26:11,290 --> 01:26:15,850
And once we're done with that we just run gdb on the resulting output file.

1399
01:26:15,850 --> 01:26:18,810
And so you get some like mass of text like this, 

1400
01:26:18,810 --> 01:26:21,990
but really all you have to do is type in commands at the beginning.

1401
01:26:21,990 --> 01:26:24,250
Break main puts a breakpoint at main.

1402
01:26:24,250 --> 01:26:28,470
List 400 lists the lines of code around line 400.

1403
01:26:28,470 --> 01:26:31,410
And so in this case you can just look around and say, oh,

1404
01:26:31,410 --> 01:26:34,360
I want to set a breakpoint at line 397, which is this line,

1405
01:26:34,360 --> 01:26:37,170
and then your program runs into that step and it's going to break.

1406
01:26:37,170 --> 01:26:41,120
It's going to pause there, and you can print out, for example, value of low or high.

1407
01:26:41,120 --> 01:26:46,410
And so there are a bunch of commands you need to know,

1408
01:26:46,410 --> 01:26:48,660
and this slideshow will go up on the website,

1409
01:26:48,660 --> 01:26:54,000
so if you just want to reference these or like put them on your cheat sheets, feel free.

1410
01:26:54,000 --> 01:27:00,650
>> Cool. That was Quiz Review 0, and we'll stick around if you have any questions.

1411
01:27:00,650 --> 01:27:03,850
All right. 

1412
01:27:03,850 --> 01:27:09,030
>>  [applause]

1413
01:27:09,030 --> 01:27:13,000
>> [CS50.TV]