1
00:00:00,000 --> 00:00:02,670
Section Problem Set 2: Hacker Edition

2
00:00:02,670 --> 00:00:04,910
Rob Bowden, Harvard University

3
00:00:04,910 --> 00:00:07,410
This is CS50.                                                                                                CS50.TV

4
00:00:07,410 --> 00:00:15,770
So, I'm Rob. I'm a senior in Kirkland. This is my third year TFing CS50.

5
00:00:15,770 --> 00:00:22,220
It is the first time that we are changing from the traditional-lecture-style section, 

6
00:00:22,220 --> 00:00:25,610
where we just kind of review what happened in lecture and then you guys ask questions, 

7
00:00:25,610 --> 00:00:32,250
now to being a lot more problem-based, where we use Spaces, and--

8
00:00:32,250 --> 00:00:37,410
Oh, so the idea is to go to that link I sent you and then you'll be in my Space. 

9
00:00:37,410 --> 00:00:42,410
Does anyone not have a laptop? Okay. 

10
00:00:42,410 --> 00:00:47,050
So we're going to be using this, and we're going to be doing problems live in section

11
00:00:47,050 --> 00:00:50,740
and discussing them and figuring out what's wrong 

12
00:00:50,740 --> 00:00:56,390
and I might pull up some of your code, and I might discuss your ideas. 

13
00:00:56,390 --> 00:01:02,140
So has anyone had difficulty?

14
00:01:02,140 --> 00:01:07,000
You can chat on the side; I don't know if we'll have reason for that. 

15
00:01:07,000 --> 00:01:12,270
Now, like the previous supersection, if you were at that class, you know what that's about. 

16
00:01:12,270 --> 00:01:19,200
On all of P sets there's going to be these sections. 

17
00:01:19,200 --> 00:01:22,550
So P-set 2, specifications,  I guess you saw it on P-set 1 already. 

18
00:01:22,550 --> 00:01:27,400
But we can look at P-set 2 for what we're going to be going over today. 

19
00:01:27,400 --> 00:01:29,460
And you'll see a section of questions. 

20
00:01:29,460 --> 00:01:37,530
So this will be in all of the P-sets; there'll be a section of questions. 

21
00:01:37,530 --> 00:01:41,340
So far we've said, "Consider this an opportunity to practice."

22
00:01:41,340 --> 00:01:44,940
You won't be asked to submit this program. 

23
00:01:44,940 --> 00:01:48,480
The idea is that these are supposed to kind of help you get started with the problem set. 

24
00:01:48,480 --> 00:01:53,220
I guess on the Hacker edition, a lot of them are supposed to just be new, interesting things to learn. 

25
00:01:53,220 --> 00:01:58,590
They may not be directly applicable to the problem set. 

26
00:01:58,590 --> 00:02:01,810
And right now we're not having you submit them, but in theory, 

27
00:02:01,810 --> 00:02:07,480
for later problem sets, you might submit them, and thus you can either come to section

28
00:02:07,480 --> 00:02:10,380
or watch the section to get the answers, or you can just get them on your own

29
00:02:10,380 --> 00:02:16,350
if you don't feel like enjoying my presence. 

30
00:02:16,350 --> 00:02:21,010
So the--I think this is the first one. 

31
00:02:21,010 --> 00:02:29,280
Oh. Also, under these sections of questions we also have you ask questions about the shorts. 

32
00:02:29,280 --> 00:02:33,440
So I guess, in theory, you're supposed to watch these before coming to section, 

33
00:02:33,440 --> 00:02:38,550
but it's fine if you don't; we'll go over them anyway. 

34
00:02:38,550 --> 00:02:42,590
So we can start with these: "How does a while loop differ from a do-while loop?

35
00:02:42,590 --> 00:02:46,210
When is the latter particularly useful?"

36
00:02:46,210 --> 00:02:49,390
So anyone have any--?

37
00:02:49,390 --> 00:02:52,730
[Student] The do-while loop will always execute at least once.

38
00:02:52,730 --> 00:03:02,950
Yes. So that is the difference. A while loop--I'll just do it on here--while loop, we have the condition

39
00:03:02,950 --> 00:03:19,760
right here, whereas a do-while, you don't have a condition until we get down here. 

40
00:03:19,760 --> 00:03:24,130
And so, when your program's executing, and it gets to the while loop, 

41
00:03:24,130 --> 00:03:26,380
it immediately checks if this condition is true. 

42
00:03:26,380 --> 00:03:30,710
If that condition is not true, it will just skip over the loop entirely. 

43
00:03:30,710 --> 00:03:34,390
Do-while loop, as the program is executing, it gets to the "do."

44
00:03:34,390 --> 00:03:37,920
Nothing happens at this point, just continues executing. 

45
00:03:37,920 --> 00:03:42,690
Then when it hits the "while," if the condition is true, it'll loop back and do it again

46
00:03:42,690 --> 00:03:46,730
and again and again until the condition is not true and then just falls through. 

47
00:03:46,730 --> 00:03:50,600
So, the difference being, that this can skip right from the very start. 

48
00:03:50,600 --> 00:03:56,770
This necessarily executes once and then may execute more times if the condition is still true.

49
00:03:56,770 --> 00:04:03,720
So the while loop will only do it once, or--the while loop--we may not need to do it at all, 

50
00:04:03,720 --> 00:04:07,900
since as soon as we get to it, if the condition is false, we'll just skip right over it.

51
00:04:07,900 --> 00:04:11,770
Whereas do-while loop, we will execute it once, necessarily.

52
00:04:11,770 --> 00:04:14,560
Then, when we get to the condition, we check if it's true or false. 

53
00:04:14,560 --> 00:04:19,790
If it's true, we'll do it again; if it's false, we'll just continue going. 

54
00:04:19,790 --> 00:04:24,680
So when is the latter particularly useful? 

55
00:04:24,680 --> 00:04:31,190
So I can say that in the entirety of the 4 years, 3 years, whatever, 

56
00:04:31,190 --> 00:04:38,780
that I've been programming, I have used this, like, under 10 times. 

57
00:04:38,780 --> 00:04:43,140
And probably 5 of them are in CS50 when we're introducing do-while loops. 

58
00:04:43,140 --> 00:04:47,510
So when do you used do-while loops? 

59
00:04:47,510 --> 00:04:49,510
When is the--yeah?

60
00:04:49,510 --> 00:04:53,180
[Student] When you're trying to get user input, or something you want to check--

61
00:04:53,180 --> 00:04:59,700
Yeah. So do-while loops, user input is the big one.

62
00:04:59,700 --> 00:05:03,160
That's why on the first couple problem sets, when you want to ask the user, like, 

63
00:05:03,160 --> 00:05:08,520
"give me a string," you can't continue until you get that string. 

64
00:05:08,520 --> 00:05:12,980
And so you, necessarily, need to ask for the string at least once. 

65
00:05:12,980 --> 00:05:16,950
But then if they answer something bad, then you need to loop back and ask again. 

66
00:05:16,950 --> 00:05:20,810
But other than user input, it's very rare that I encounter a case 

67
00:05:20,810 --> 00:05:27,170
where I want to loop "at least once" but possibly more. 

68
00:05:27,170 --> 00:05:33,370
Questions or--? Has anyone used a do-while loop anywhere else? 

69
00:05:33,370 --> 00:05:36,780
Okay. So the next one is, "What does undeclared identifier 

70
00:05:36,780 --> 00:05:43,310
usually indicate if outputted by clang?"

71
00:05:43,310 --> 00:05:47,380
So what kind of code could I write to get 'undeclared identifier?' 

72
00:05:47,380 --> 00:05:49,550
[Student] That x = 2? 

73
00:05:49,550 --> 00:05:52,650
So we can just try it in here,  x = 2. 

74
00:05:52,650 --> 00:06:04,830
We'll run this--oh, I didn't click it. So here we get--all right. 

75
00:06:04,830 --> 00:06:07,100
"Use of undeclared identifier x." 

76
00:06:07,100 --> 00:06:11,610
So that's the undeclared identifier, a variable. 

77
00:06:11,610 --> 00:06:13,910
It will frequently call a variable an identifier.

78
00:06:13,910 --> 00:06:17,300
So it might not know it's actually a variable; it doesn't know what it is. 

79
00:06:17,300 --> 00:06:19,380
So it's an identifier. 

80
00:06:19,380 --> 00:06:26,060
So why is it undeclared? Yeah. 

81
00:06:26,060 --> 00:06:32,190
So to be clear on terminology, the declaration of a variable

82
00:06:32,190 --> 00:06:37,360
is when you say "int x," or "string y," whatever. 

83
00:06:37,360 --> 00:06:41,910
The initialization of the variable, or the assignment of the variable, 

84
00:06:41,910 --> 00:06:44,510
is whenever you say "x = 2." 

85
00:06:44,510 --> 00:06:52,950
So we can do these in separate steps, int x, x = 2, and until--we can have a bunch of stuff in here--

86
00:06:52,950 --> 00:07:00,350
but until this line happens, x is still uninitialized, but it has been declared. 

87
00:07:00,350 --> 00:07:06,760
And so we can obviously do it in 1 line, and now we are declaring and initializing. 

88
00:07:06,760 --> 00:07:10,730
Questions? 

89
00:07:10,730 --> 00:07:18,390
And finally, "Why is the Caesar Cipher not very secure?"

90
00:07:18,390 --> 00:07:23,830
So first, does anyone want to say what the Caesar Cipher is?

91
00:07:23,830 --> 00:07:28,100
[Student] Caesar Cipher just is that you map, you shift every letter,

92
00:07:28,100 --> 00:07:34,420
a certain number of letters go over, and move back over, and it's not very secure because

93
00:07:34,420 --> 00:07:42,260
there's only 26 possible options and you just have to try each 1 of those until you get it.

94
00:07:42,260 --> 00:07:45,470
Oh. So, I should repeat? 

95
00:07:45,470 --> 00:07:51,600
The Caesar Cipher, it's--I mean, you'll be dealing with it on the problems that you--

96
00:07:51,600 --> 00:07:56,110
or I guess the standard edition of the problem set that's not on the hacker edition. 

97
00:07:56,110 --> 00:08:01,550
So on the standard edition to the problem set, you get a message like, "Hello, world, "

98
00:08:01,550 --> 00:08:08,410
and you also have a number like 6, and you take that message, and each individual character, 

99
00:08:08,410 --> 00:08:11,310
you rotate it by 6 positions in the alphabet. 

100
00:08:11,310 --> 00:08:16,560
So the 'h' in hello would become h-i-j-k-l-m-n. 

101
00:08:16,560 --> 00:08:19,600
So the first letter would be n. We do the same thing with e. 

102
00:08:19,600 --> 00:08:23,530
If we have a, like, z or something, then we wrap back around to 'a.'

103
00:08:23,530 --> 00:08:29,280
But each character gets cycled 6 characters later in the alphabet, and it's not very secure

104
00:08:29,280 --> 00:08:35,440
since there are only 26 possibilities for how many ways you could wrap a single letter. 

105
00:08:35,440 --> 00:08:42,919
So you can just try all 26 of them and, presumably, for a long enough message, 

106
00:08:42,919 --> 00:08:46,860
only 1 of those possible 26 things is going to be legible, 

107
00:08:46,860 --> 00:08:50,300
and the legible one is going to be the original message. 

108
00:08:50,300 --> 00:08:56,240
So it's not a very good way of encrypting anything at all. 

109
00:08:56,240 --> 00:08:59,070
Unrelated to those shorts, "What is a function?"

110
00:08:59,070 --> 00:09:03,370
So what is a function? Yes. 

111
00:09:03,370 --> 00:09:11,640
[Student] It's like a separate piece of code that you can call to go through and then get the return value of whatever. 

112
00:09:11,640 --> 00:09:18,160
Yeah. So I'll answer it by also answering the next--or repeat by also just answering the next one.

113
00:09:18,160 --> 00:09:22,410
You can use functions instead of just copying and pasting code over and over again.

114
00:09:22,410 --> 00:09:27,200
Just take that code, put it into a fuction, and then you could just call the function 

115
00:09:27,200 --> 00:09:29,870
wherever you have been copying and pasting. 

116
00:09:29,870 --> 00:09:33,350
So functions are useful. 

117
00:09:33,350 --> 00:09:35,860
So now we'll do actual problems. 

118
00:09:35,860 --> 00:09:46,490
The first one. So the idea of the first one is, you pass it a string, and regardless of the--

119
00:09:46,490 --> 00:09:52,060
or does it say all lowercase? It does not say all lowercase. 

120
00:09:52,060 --> 00:09:57,730
So the message can be anything, and--oh no. It does. 

121
00:09:57,730 --> 00:10:01,610
"For simplicity, you may assume that the user will only input lowercase letters and spaces."

122
00:10:01,610 --> 00:10:08,180
So we pass it a message with only lowercase letters and then we alternate

123
00:10:08,180 --> 00:10:15,450
between capital and lowercase--we change the string to be capital and lowercase, alternating. 

124
00:10:15,450 --> 00:10:22,920
So before we give you a second to even dive into the problem, 

125
00:10:22,920 --> 00:10:32,420
what is the first thing that we need to do?

126
00:10:32,420 --> 00:10:36,900
Oh, what did I just click on? Oh, I just clicked on an email in here. 

127
00:10:36,900 --> 00:10:42,870
So the first thing we need to do--am I looking at the wrong one? 

128
00:10:42,870 --> 00:10:49,320
Is this part of this one?

129
00:10:49,320 --> 00:10:51,320
No, those are still in there, though. 

130
00:10:51,320 --> 00:10:55,160
Okay, still here. 

131
00:10:55,160 --> 00:11:03,160
Now we can not assume--? Yes. Here we cannot assume that it's only lowercase and spaces. 

132
00:11:03,160 --> 00:11:07,770
So now we have to deal with the fact that the letters can be whatever we want them to be. 

133
00:11:07,770 --> 00:11:11,910
And so the first thing we want to do is just get the message. 

134
00:11:11,910 --> 00:11:19,790
We just need to get a string, string s = GetString, okay. 

135
00:11:19,790 --> 00:11:24,890
Now this problem, there are a couple of ways of doing it. 

136
00:11:24,890 --> 00:11:29,840
But we are going to want to use bitwise operators here.

137
00:11:29,840 --> 00:11:35,280
Are there people who either were not at the supersection,

138
00:11:35,280 --> 00:11:37,480
or something, and don't know what bitwise operators are?

139
00:11:37,480 --> 00:11:41,710
Or how they relate to ASCII in any way?

140
00:11:41,710 --> 00:11:45,650
[Student] I wasn't at the supersection, but I know what bitwise operators are. 

141
00:11:45,650 --> 00:11:49,560
Okay. So then I don't have to go over the basics of them, but I'll explain 

142
00:11:49,560 --> 00:11:51,830
what we're going to want to use here. 

143
00:11:51,830 --> 00:11:59,680
So 'A': Binary representation of capital A, the number is 65. 

144
00:11:59,680 --> 00:12:07,560
I'm just going to look at--41 is going to be 01000001. 

145
00:12:07,560 --> 00:12:14,170
So that should be 65 in decimal; so this is the binary representation of the character capital A.

146
00:12:14,170 --> 00:12:19,440
Now, the binary representation of the character lowercase 'a'  

147
00:12:19,440 --> 00:12:33,350
is going to be the same thing, almost. Is that--6, yeah. This is right. 

148
00:12:33,350 --> 00:12:37,670
So binary capital A, binary lowercase 'a.'

149
00:12:37,670 --> 00:12:43,940
So notice that the difference between A and 'a' is this single bit. 

150
00:12:43,940 --> 00:12:49,440
And this happens to be the 32 bit, the bit representing the number 32. 

151
00:12:49,440 --> 00:12:53,910
And that makes sense since A is 65; 'a' is 97.

152
00:12:53,910 --> 00:12:56,610
The difference between them is 32. 

153
00:12:56,610 --> 00:13:03,770
So now we know we can convert from A to 'a' by taking A

154
00:13:03,770 --> 00:13:09,710
and bitwise ORing it, with--that looks like a 1. 

155
00:13:09,710 --> 00:13:20,900
This is a bitwise OR, with 00100000, and that'll give us 'a.' 

156
00:13:20,900 --> 00:13:26,850
And we can get from 'a' to A by bitwise ANDing

157
00:13:26,850 --> 00:13:33,700
with 11, 0 in that place, 11111. 

158
00:13:33,700 --> 00:13:43,840
So this will then give us exactly what 'a' was; but cancel out this individual bit, 

159
00:13:43,840 --> 00:13:50,070
so we'll have 01000001; I don't know if I counted right. 

160
00:13:50,070 --> 00:13:56,750
But this technique of bitwise ORing to get from capital to lowercase, 

161
00:13:56,750 --> 00:14:02,080
and bitwise ANDing to get from lowercase to capital isn't exclusive to A.

162
00:14:02,080 --> 00:14:06,510
All of the letters, K vs k, Z vs z,

163
00:14:06,510 --> 00:14:10,080
all of them are just going to differ by this single bit. 

164
00:14:10,080 --> 00:14:16,290
And so you can use this to change from any lowercase letter to any capital letter and vice versa. 

165
00:14:16,290 --> 00:14:26,670
Okay. So an easy way of getting from this--so instead of having to

166
00:14:26,670 --> 00:14:32,170
write out whatever 1011111 is--an easy way of representing this number, and this isn't one

167
00:14:32,170 --> 00:14:39,710
that I went over in the supersection, but tilde (~) is another bitwise operator.

168
00:14:39,710 --> 00:14:42,520
What ~ does is it looks at the bit representation. 

169
00:14:42,520 --> 00:14:45,630
Let's take any number. 

170
00:14:45,630 --> 00:14:53,130
This is just some binary number, and what ~ does is it just flips all of the bits. 

171
00:14:53,130 --> 00:15:00,630
So this was a 1, now a 0, this is a 0, now a 1, 010100. 

172
00:15:00,630 --> 00:15:08,320
So that's all ~ does. So 32 is going to be the number--get rid of that--

173
00:15:08,320 --> 00:15:23,320
so 32 is going to be the number 00100000, and so ~ of this is going to be 

174
00:15:23,320 --> 00:15:29,980
this number up here that I ANDed 'a' with. 

175
00:15:29,980 --> 00:15:35,600
Does everyone see that? This is pretty common, like when you want to figure out

176
00:15:35,600 --> 00:15:40,740
for later things that we might be seeing, when we want to see if--

177
00:15:40,740 --> 00:15:44,710
or we want everything, every single bit set except for 1

178
00:15:44,710 --> 00:15:47,910
you tend to do ~ of the bit that we don't want set. 

179
00:15:47,910 --> 00:15:53,090
So we don't want the 32 bit set, so we do ~ of 32. 

180
00:15:53,090 --> 00:15:57,790
Okay. So we can use all of those here. 

181
00:15:57,790 --> 00:16:03,000
All right, so it's fine if you're not done, we shall slowly walk over together, 

182
00:16:03,000 --> 00:16:11,870
or walk over this, so--through this. Walk through this. 

183
00:16:11,870 --> 00:16:20,790
So we have our string, and we want to loop over each character in that string and do something to it. 

184
00:16:20,790 --> 00:16:26,710
So how do we loop over a string? What should we use? 

185
00:16:26,710 --> 00:16:30,980
I'm not going to do it on here. Yeah. 

186
00:16:30,980 --> 00:16:42,940
So I have my iterator, and he said it, but how do I know how many characters are in the string? 

187
00:16:42,940 --> 00:16:47,030
Strlen (s), then i + +. 

188
00:16:47,030 --> 00:16:49,860
So what I've done here isn't the best way of doing things. 

189
00:16:49,860 --> 00:16:51,860
Does anyone know why?

190
00:16:51,860 --> 00:16:55,290
Because you're checking the language of the string every single time. 

191
00:16:55,290 --> 00:17:06,859
So we are going to want to move strlen, I could say up here, int length = strlen (s), 

192
00:17:06,859 --> 00:17:11,900
and then do i < length, and in case you haven't seen it before, 

193
00:17:11,900 --> 00:17:20,410
I could also do int i = 0, length = strlen (s). 

194
00:17:20,410 --> 00:17:25,010
And so this is somewhat preferable, since now I've restricted the scope

195
00:17:25,010 --> 00:17:29,150
of the variable length to just this 'for' loop, instead of declaring it before

196
00:17:29,150 --> 00:17:34,990
and that it always exists, and in case you didn't catch why that's bad, 

197
00:17:34,990 --> 00:17:39,410
or why the original was bad, it's--start at the for loop. 

198
00:17:39,410 --> 00:17:43,380
I checked the condition. Is i < the length of s? 

199
00:17:43,380 --> 00:17:46,790
So the length of s, let's work with "hello" the entire time. 

200
00:17:46,790 --> 00:17:49,670
So length of s, h-e-l-l-o. Length is 5. 

201
00:17:49,670 --> 00:17:57,580
So i = 0, length is 5, so i is not < 5, so the loop continues. 

202
00:17:57,580 --> 00:18:02,750
Then we go again. We check the condition. Is i < the length of hello?

203
00:18:02,750 --> 00:18:08,390
So let's check the length of hello. H-e-l-l-o. That's 5; i is not < 5, so we continue again. 

204
00:18:08,390 --> 00:18:13,330
So we are calculating, we are counting h-e-l-l-o, for each iteration of the loop, 

205
00:18:13,330 --> 00:18:17,380
even thought it's never going to change; it's always going to be 5. 

206
00:18:17,380 --> 00:18:22,530
So we just remember 5 up front, and now everything's better. 

207
00:18:22,530 --> 00:18:24,990
So iterating over the entire string. 

208
00:18:24,990 --> 00:18:31,470
What do we want to do for each character of the string? 

209
00:18:31,470 --> 00:18:38,510
[Student speaking, unintelligible]

210
00:18:38,510 --> 00:18:47,000
Yeah. So, if the character is non-alphabetic, then we just want to skip over it. 

211
00:18:47,000 --> 00:18:52,300
Because we only care about alphabetic letters; we can't capitalize a number. 

212
00:18:52,300 --> 00:19:10,850
So how can we do this? So our condition, so if we want something--check if it's alphabetical. 

213
00:19:10,850 --> 00:19:14,060
So how do we check this? 

214
00:19:14,060 --> 00:19:18,720
[Student] You can just use the function is alpha. 

215
00:19:18,720 --> 00:19:23,160
Is that included in either of these, or any include like, char.h or something?

216
00:19:23,160 --> 00:19:32,710
Let's not use the is alpha function, and use the explicit--so we have s[i],  

217
00:19:32,710 --> 00:19:40,460
that is the eighth character of s, remember that a string is an array of characters, 

218
00:19:40,460 --> 00:19:43,180
so the eighth character of s. 

219
00:19:43,180 --> 00:19:49,280
Now, if it is a capital letter, we know it has to be in a specific range. 

220
00:19:49,280 --> 00:19:54,370
And what is that range? 

221
00:19:54,370 --> 00:20:07,860
Yeah. So if s[i] is ≥ 65, and s[i] is ≤ 90, what should I do instead? 

222
00:20:07,860 --> 00:20:18,470
Yeah. So you should absolutely never even need to know the ASCII values of anything ever. 

223
00:20:18,470 --> 00:20:25,640
Never think of the numbers 65, 90, 97 and 102, or whatever it is. 

224
00:20:25,640 --> 00:20:32,470
You don't need--112?--you don't need to know those at all. That's wrong too. 

225
00:20:32,470 --> 00:20:41,940
Only use the single-quote characters, single quote constants. So 'A' and less than 90 is 'Z.'

226
00:20:41,940 --> 00:20:47,930
And this is significantly better--I would not know off the top of my head that Z is 90. 

227
00:20:47,930 --> 00:20:52,690
I do know off the top of my head that 'Z' is capital Z. 

228
00:20:52,690 --> 00:21:02,100
So as long as this is in the range of capital A to capital Z, or we can check for lowercase, 

229
00:21:02,100 --> 00:21:17,010
Or if it's in the range ≥ 'a' and ≤ z.

230
00:21:17,010 --> 00:21:19,010
So that's our condition. 

231
00:21:19,010 --> 00:21:22,520
The style for where to put these things varies. 

232
00:21:22,520 --> 00:21:29,520
I'll do it like this. 

233
00:21:29,520 --> 00:21:31,520
Now, what do we want to do? 

234
00:21:31,520 --> 00:21:39,530
We know this letter is a character, an alphabetical character. 

235
00:21:39,530 --> 00:21:46,270
So we need to alternate between whether this should now be a capital letter or a lowercase letter. 

236
00:21:46,270 --> 00:21:48,820
How do we keep track of which one we want it to be?

237
00:21:48,820 --> 00:21:55,520
[Student voices, unintelligible]

238
00:21:55,520 --> 00:21:59,150
So yes, but let me check. 

239
00:21:59,150 --> 00:22:04,910
Module 0-2 was said, was a suggestion thrown out, and I agree with that. 

240
00:22:04,910 --> 00:22:11,780
Except notice that, like--is this the case? Yeah. 

241
00:22:11,780 --> 00:22:18,270
It's every other one, but we can't module 2 of i, or i mod 2, since 

242
00:22:18,270 --> 00:22:22,950
notice that E is capital and 'a' is lowercase? But there's a space separating them? 

243
00:22:22,950 --> 00:22:27,150
So they're going to be the same mod 2, but they're different cases. 

244
00:22:27,150 --> 00:22:29,150
[Student question, unintelligible] 

245
00:22:29,150 --> 00:22:34,690
Yeah. So we're just going to keep a count. 

246
00:22:34,690 --> 00:22:38,730
We could also do that in here if we wanted; that might get a little unwieldy

247
00:22:38,730 --> 00:22:41,300
in the for loop declarations; I'll put it up here. 

248
00:22:41,300 --> 00:22:48,840
So int count = starts at 0. 

249
00:22:48,840 --> 00:22:54,070
And so now, I'm going to count how many alphabetical characters we've had. 

250
00:22:54,070 --> 00:22:59,550
So we're inevitably going to count + + since we found another alphabetical character. 

251
00:22:59,550 --> 00:23:09,130
But, so now you're saying if count mod 2. 

252
00:23:09,130 --> 00:23:12,590
So what if count mod 2? Oh. I'll do = = 0 for now. 

253
00:23:12,590 --> 00:23:21,740
We'll also go over that. So if count mod 2 = = 0, then what? 

254
00:23:21,740 --> 00:23:27,830
[Students answer, unintelligible]

255
00:23:27,830 --> 00:23:32,750
So we want it to end up uppercase. 

256
00:23:32,750 --> 00:23:37,520
There are 2 cases; uppercase and lowercase are the 2 cases. 

257
00:23:37,520 --> 00:23:40,990
So if we're in lowercase we need to make it uppercase.

258
00:23:40,990 --> 00:23:43,710
If it's uppercase we don't need to do anything. 

259
00:23:43,710 --> 00:23:50,760
But, is there a way--shouldn't have flipped--

260
00:23:50,760 --> 00:23:54,800
that we don't even need to check whether it's uppercase or lowercase?

261
00:23:54,800 --> 00:24:02,240
What can we do to always make sure that we always end up at uppercase?

262
00:24:02,240 --> 00:24:07,830
So notice what we did for lowercase 'a'; what if we did this same exact thing to uppercase A? 

263
00:24:07,830 --> 00:24:11,900
Does uppercase A change, or does the value change?

264
00:24:11,900 --> 00:24:23,100
Yeah. So any capital letter bitwise ANDed with ~ 32 is going to be that same uppercase character

265
00:24:23,100 --> 00:24:29,220
because for any uppercase character the 32nd bit is not set. 

266
00:24:29,220 --> 00:24:40,920
So if we want to bring the character s[i], we want it to become lowercase or uppercase.

267
00:24:40,920 --> 00:24:46,890
So if it was lowercase, it is now uppercase , if it was uppercase, it's still uppercase, and that's it. 

268
00:24:46,890 --> 00:24:54,290
I said this in the supersection: You can use 32 if you want, but I tend to prefer doing 'a' - A, 

269
00:24:54,290 --> 00:25:01,150
instead of just plain 32, because it can be any other bit. 

270
00:25:01,150 --> 00:25:03,610
After the 32 bit, it can be any of these, or we wouldn't have enough 

271
00:25:03,610 --> 00:25:05,840
numbers to represent all of the characters. 

272
00:25:05,840 --> 00:25:09,110
So if you get the 32 bit, it could be the 64 bit, it could be the 128 bit. 

273
00:25:09,110 --> 00:25:13,990
Any of those bits could be the bit that distinguishes between uppercase and lowercase. 

274
00:25:13,990 --> 00:25:18,350
I shouldn't need to know that it's the 32 bit. 

275
00:25:18,350 --> 00:25:27,130
I can use this 'a' - A to get the bit that differs between the two

276
00:25:27,130 --> 00:25:33,000
without needing to rely on the magic number that is 32. 

277
00:25:33,000 --> 00:25:38,770
And so now, else count was odd, and so what do I want to do? 

278
00:25:38,770 --> 00:25:43,920
[Student answers, unintelligible] 

279
00:25:43,920 --> 00:25:45,920
[Student] What's that? 

280
00:25:45,920 --> 00:25:49,850
I will do it in 1 second.

281
00:25:49,850 --> 00:25:55,690
So now if I want to--I want to make sure the character is now lowercase, 

282
00:25:55,690 --> 00:26:04,140
and so I can OR by 32, and 32 meaning 'a' - A.

283
00:26:04,140 --> 00:26:06,510
But notice, by the same reasoning as the previous one, that if 

284
00:26:06,510 --> 00:26:11,670
the letter was already lowercase, then ORing by 32 just keeps it lowercase. 

285
00:26:11,670 --> 00:26:16,220
It hasn't changed the original character. 

286
00:26:16,220 --> 00:26:19,910
But now I don't have to avoid saying, "If it is lowercase, just forget about it, 

287
00:26:19,910 --> 00:26:23,650
if it's uppercase, then change it." 

288
00:26:23,650 --> 00:26:26,900
It's much more convenient to do this. 

289
00:26:26,900 --> 00:26:33,190
[Student] Would that strategy of subtracting the uppercase from the lowercase work if it weren't 32? 

290
00:26:33,190 --> 00:26:35,330
If it was, like, 34 or something?

291
00:26:35,330 --> 00:26:41,840
So, you need to know that the difference between the 2 is--? >>1 bit. 

292
00:26:41,840 --> 00:26:49,840
It could be more than 1 bit, as long as all of the bits below this position are the same. 

293
00:26:49,840 --> 00:26:58,500
So we need at least 26 characters--or, there are 26 characters. 

294
00:26:58,500 --> 00:27:04,590
So we need at least 26 numbers to represent the difference--

295
00:27:04,590 --> 00:27:07,650
The difference between A and 'a' has to be at least 26, 

296
00:27:07,650 --> 00:27:10,760
or else we wouldn't have represented all the capital numbers. 

297
00:27:10,760 --> 00:27:18,630
That means that A, if we start at 1, it's going to use all of these bits, 

298
00:27:18,630 --> 00:27:23,900
all of these first 5 bits, to represent everything through Z.

299
00:27:23,900 --> 00:27:32,170
That's why the next bit, or this bit, the next bit is the one that's chosen to distinguish between A and 'a.'

300
00:27:32,170 --> 00:27:40,930
That's also why, in ASCII table, there are 5 symbols separating capital letters from lowercase letters. 

301
00:27:40,930 --> 00:27:49,050
Since those are the symbols, the extra 5 that brings up the 32 being the difference between them. 

302
00:27:49,050 --> 00:27:51,840
[Student] So we could do it, because ASCII's designed that way. 

303
00:27:51,840 --> 00:27:57,280
Yes. But ASCII--the difference could also be both of these bits. 

304
00:27:57,280 --> 00:28:12,040
Like, if A were 10000001, and 'a' was 11100001--I forget, whatever. 

305
00:28:12,040 --> 00:28:18,100
But if it were this, then we could still use 'a' - A.

306
00:28:18,100 --> 00:28:22,650
It's just now the difference between A and 'a' is still these 2 bits. 

307
00:28:22,650 --> 00:28:32,240
I think it's written 48. Is it 32 + 64? I think it is? 

308
00:28:32,240 --> 00:28:40,160
It would still be 2 bits; every single character, like, Z and z, K and k,

309
00:28:40,160 --> 00:28:45,160
they would still have the same exact bits set except for those 2 bits. 

310
00:28:45,160 --> 00:28:48,870
So as long as that's always true, regardless of if we're using ASCII or some other system, 

311
00:28:48,870 --> 00:28:53,050
as long as there's only a set number of bits that are different for each character, 

312
00:28:53,050 --> 00:28:55,050
then that works fine. 

313
00:28:55,050 --> 00:29:06,110
It's just that 32 was set up because it's the first one we could possibly use. >>Cool. 

314
00:29:06,110 --> 00:29:14,520
I tend to prefer, in case you haven't seen, if the block is only a single line, 

315
00:29:14,520 --> 00:29:24,280
you can get rid of the curly braces; so I tend to prefer doing this. 

316
00:29:24,280 --> 00:29:34,010
Also, you know how we can do things like s[i] + = 1? 

317
00:29:34,010 --> 00:29:41,090
You can also do s[i] bitwise AND = 32. 

318
00:29:41,090 --> 00:29:46,400
And bitwise OR = 32.

319
00:29:46,400 --> 00:29:51,490
Also, count mod 2 = = 0. 

320
00:29:51,490 --> 00:30:00,900
So remember that--I won't write it--any non-zero value is true, and 0 is false. 

321
00:30:00,900 --> 00:30:07,880
So "if count mod 2 = = 0" is the same as saying "if not count mod 2."

322
00:30:07,880 --> 00:30:11,580
I probably would have just reversed the lines and said, "if count mod 2, 

323
00:30:11,580 --> 00:30:15,350
do the OR 1, else do the AND 1," so that I didn't need the "not."

324
00:30:15,350 --> 00:30:18,650
But this works just as well. 

325
00:30:18,650 --> 00:30:25,660
And what else can I do here? 

326
00:30:25,660 --> 00:30:29,060
You could combine them with ternary if you wanted, but then that'd just make things messier

327
00:30:29,060 --> 00:30:33,770
and probably more difficult to read, so we won't do that. 

328
00:30:33,770 --> 00:30:37,330
Anyone have any other suggestions?

329
00:30:37,330 --> 00:30:41,580
Is that all the problem asked for? Oh yeah. 

330
00:30:41,580 --> 00:30:51,070
So get rid of these empty lines, now we'll print f , % s being the one for strings, 

331
00:30:51,070 --> 00:30:56,620
We will print f, s. 

332
00:30:56,620 --> 00:30:59,330
Now let's run it. Did I do anything wrong? 

333
00:30:59,330 --> 00:31:03,200
That's a \ "; I want an n.

334
00:31:03,200 --> 00:31:07,840
Okay. Now we'll run it. It'll probably yell at me. 

335
00:31:07,840 --> 00:31:11,250
Strlen is in string.h. 

336
00:31:11,250 --> 00:31:14,290
So this is the nice thing about Clang is it tells you what it's in, 

337
00:31:14,290 --> 00:31:19,140
instead of GCC which just says, "Hey, you forgot something, I don't know what it was." 

338
00:31:19,140 --> 00:31:29,220
But this will tell me, "You meant to include string.h."

339
00:31:29,220 --> 00:31:32,130
So I didn't prompt for anything, so it's not saying anything. 

340
00:31:32,130 --> 00:31:42,540
But we'll do their example, "Thanks 4 the add". 

341
00:31:42,540 --> 00:31:47,880
That looks right. Hooray. 

342
00:31:47,880 --> 00:31:52,370
So returning to your main, I almost never do it. 

343
00:31:52,370 --> 00:31:57,110
It's optional. And main is the only function for which it is optional. 

344
00:31:57,110 --> 00:32:07,140
If you do not return anything from main, it's assumed that you meant to return 0. 

345
00:32:07,140 --> 00:32:13,070
Questions? 

346
00:32:13,070 --> 00:32:20,980
Okay. So now the second problem. 

347
00:32:20,980 --> 00:32:24,810
"Recall from week 2's second lecture that swapping 2 variables' values by passing 

348
00:32:24,810 --> 00:32:30,780
those 2 variables to a function (even if called swap) doesn't exactly work, at least not without 'pointers.'"

349
00:32:30,780 --> 00:32:37,020
And ignore pointers until we get to them. 

350
00:32:37,020 --> 00:32:40,070
We want to swap 2 variables; we're not using a function to do it.

351
00:32:40,070 --> 00:32:43,410
We're still going to do it in main like it says. 

352
00:32:43,410 --> 00:32:48,360
But to use those 2 variables, we don't want to use a temporary variable. 

353
00:32:48,360 --> 00:32:50,770
There are 2 ways to do this. 

354
00:32:50,770 --> 00:32:56,310
You can do it using your traditional binary operators.

355
00:32:56,310 --> 00:33:00,180
So does anyone know a quick and dirty way of doing that?

356
00:33:00,180 --> 00:33:07,650
It might actually take a minute of thinking. If I have--

357
00:33:07,650 --> 00:33:12,130
I'll set the problem up like they ask. So if I have 2 variables, A, which is just an integer

358
00:33:12,130 --> 00:33:17,800
that they give me, and sum variable B, which is another integer that I'm given.

359
00:33:17,800 --> 00:33:22,700
So if I have these 2 variables, now I want to swap them. 

360
00:33:22,700 --> 00:33:31,550
The traditional, using your regular binary operators, I mean,  like +, -, ÷.

361
00:33:31,550 --> 00:33:36,630
Not bitwise operators which act on binary. 

362
00:33:36,630 --> 00:33:39,600
So using -, +, ÷, and all those. 

363
00:33:39,600 --> 00:33:52,980
We could swap by doing something like a = a + b, and b = a - b, a = a - b. 

364
00:33:52,980 --> 00:34:04,260
So, sanity check, and then we'll see why that works. 

365
00:34:04,260 --> 00:34:13,320
Let's say a = 7, b = 3, then a + b is going to be 10. 

366
00:34:13,320 --> 00:34:18,820
So we're now setting a = 10, and then we're doing b = a - b. 

367
00:34:18,820 --> 00:34:30,250
So we're doing b = a - b, which is going to be 7, and b = a - b again, 

368
00:34:30,250 --> 00:34:38,650
or  a = a - b. Which is going to be 10 - 7 which is 3. 

369
00:34:38,650 --> 00:34:44,850
So now, correctly, 'a' was 7, b was 3, and now b is 7 and 'a' is 3. 

370
00:34:44,850 --> 00:34:48,679
So that kind of makes sense; 'a' is the combination of the 2 numbers. 

371
00:34:48,679 --> 00:34:53,000
At this point, 'a' is the combination, and then we're subtracting out the original b, 

372
00:34:53,000 --> 00:34:56,860
and then we're subtracting out what was the original 'a.' 

373
00:34:56,860 --> 00:35:01,150
But this does not work for all numbers. 

374
00:35:01,150 --> 00:35:08,880
To see this, let's consider a system; so we usually think of integers as 32 bits. 

375
00:35:08,880 --> 00:35:13,050
Let's work on something that's only like 4 bits. 

376
00:35:13,050 --> 00:35:15,450
Hopefully I come up with a good example right now. 

377
00:35:15,450 --> 00:35:18,680
So, I know, this will be easy. 

378
00:35:18,680 --> 00:35:26,720
Let's say our 2 numbers are 1111, and 1111; so we're in binary right now. 

379
00:35:26,720 --> 00:35:34,630
In actual decimals, if you want to think of it that way, a = 15 and b = 15. 

380
00:35:34,630 --> 00:35:37,630
And so we expect, after we swap them--they don't even have to be the same numbers, 

381
00:35:37,630 --> 00:35:41,140
but I did it this way. 

382
00:35:41,140 --> 00:35:47,100
Let's make them not the same numbers. Let's do 1111 and 0001. 

383
00:35:47,100 --> 00:35:51,860
So a = 15 and b = 1. 

384
00:35:51,860 --> 00:35:57,670
After we swap them, we expect 'a' to be 1 and b to be 15. 

385
00:35:57,670 --> 00:36:01,780
So our first step is a = a + b. 

386
00:36:01,780 --> 00:36:08,770
Our numbers are only 4 bits wide, so 'a,' which is 1111, + b, which is 0001, 

387
00:36:08,770 --> 00:36:16,780
is going to end up being 10000, but we only have 4 bits. 

388
00:36:16,780 --> 00:36:22,540
So now a = 0. 

389
00:36:22,540 --> 00:36:34,080
And now we want to set b = a - b--actually, this still works out perfectly. 

390
00:36:34,080 --> 00:36:39,630
a = a - b--let's see if this works out perfectly. 

391
00:36:39,630 --> 00:36:53,720
So then b = 0 - 1, which would still be 15, and then a = a - b, which would be 1. 

392
00:36:53,720 --> 00:36:56,210
Maybe this does work. 

393
00:36:56,210 --> 00:36:59,020
I feel like there's a reason it doesn't work using regular. 

394
00:36:59,020 --> 00:37:06,400
Okay, so working on the assumption that it doesn't work with regular binary operations, 

395
00:37:06,400 --> 00:37:15,040
and I will look for--I will Google to see if that is true. 

396
00:37:15,040 --> 00:37:23,490
So we want to do it using bitwise operators, and the clue here is XOR.

397
00:37:23,490 --> 00:37:28,780
So, introducing XOR (^) if you have not seen it yet.  

398
00:37:28,780 --> 00:37:34,610
It's, again, a bitwise operator so it acts bit by bit, and it's--

399
00:37:34,610 --> 00:37:39,910
If you have the bits 0 and 1, then this will be 1. 

400
00:37:39,910 --> 00:37:45,230
If you have the bits 1 and 0, it'll be 1, you have the bits 0 and 0 it'll be 0, 

401
00:37:45,230 --> 00:37:47,640
and if you have the bits 1 and 1 it'll be 0. 

402
00:37:47,640 --> 00:37:56,180
So it's like OR. If either of the bits are true, it's 1, but unlike OR, it can't be both bits that are true. 

403
00:37:56,180 --> 00:37:59,320
OR would have this be 1, XOR would have this be 0. 

404
00:37:59,320 --> 00:38:02,250
So we're going to want to use XOR here. 

405
00:38:02,250 --> 00:38:09,960
Think about it for a minute; I'm going to Google. 

406
00:38:09,960 --> 00:38:16,230
Well, you can't read that; I'm currently on the XOR swap algorithm page. 

407
00:38:16,230 --> 00:38:21,340
Hopefully this will explain why I can't--

408
00:38:21,340 --> 00:38:34,190
This is exactly the algorithm that we just did. 

409
00:38:34,190 --> 00:38:37,330
I still don't see why--I must have just picked a bad example, 

410
00:38:37,330 --> 00:38:44,940
but this case where 'a' happened to become 0, after getting to 5 bits, so now 'a' is 0, 

411
00:38:44,940 --> 00:38:48,730
that is what is called "integer overflow."

412
00:38:48,730 --> 00:38:54,370
According to Wikipedia, "Unlike the XOR swap, this variation requires that it uses some methods 

413
00:38:54,370 --> 00:38:59,780
to guarantee that x + y does not cause an integer overflow." 

414
00:38:59,780 --> 00:39:08,350
So this does have problems; this was integer overflow, but I did something wrong. 

415
00:39:08,350 --> 00:39:10,520
I'm not sure. I'll try to come up with another one. 

416
00:39:10,520 --> 00:39:13,640
[Student] Well, isn't integer overflow when you're trying to put a number in there 

417
00:39:13,640 --> 00:39:16,640
bigger than the amount of bits you have allocated?

418
00:39:16,640 --> 00:39:23,730
Yeah. We have 4 bits. That's--we had 4 bits, we then try to add 1 to it, so we end up with 5 bits. 

419
00:39:23,730 --> 00:39:26,690
But the fifth bit just gets cut off, yeah. 

420
00:39:26,690 --> 00:39:28,970
It might actually--

421
00:39:28,970 --> 00:39:33,010
[Student] Does that throw you an error, or does that--would that throw an error?

422
00:39:33,010 --> 00:39:40,720
No. So there's no error. When you get to the assembly level, a special bit

423
00:39:40,720 --> 00:39:47,020
somewhere is set that said there was an overflow, but in C you kind of just don't deal with that. 

424
00:39:47,020 --> 00:39:55,160
You actually can't deal with it unless you use special assembly instructions in C. 

425
00:39:55,160 --> 00:39:58,110
Let's think about XOR swap. 

426
00:39:58,110 --> 00:40:02,220
And I think the Wikipedia article might have also been saying that--

427
00:40:02,220 --> 00:40:07,310
So it also brought up modular arithmetic, so I guess I was, in theory, doing modular arithmetic 

428
00:40:07,310 --> 00:40:11,160
when I said that 0 - 1 is 15 again. 

429
00:40:11,160 --> 00:40:15,410
So that might actually--on a regular processor that does 0 - 1 = 15. 

430
00:40:15,410 --> 00:40:20,430
Since we end up at 0, we subtract 1, so then it just wraps back around to 1111. 

431
00:40:20,430 --> 00:40:28,930
So this algorithm might actually work, the a + b, the a - b, b - a; that might be fine. 

432
00:40:28,930 --> 00:40:34,030
But there's some processors which don't do that, and so it would not be fine in those specific ones. 

433
00:40:34,030 --> 00:40:39,880
XOR swap will work on any processor. Okay. 

434
00:40:39,880 --> 00:40:42,280
The idea is that it's supposed to be the same, though. 

435
00:40:42,280 --> 00:40:50,120
Where we are using XOR to somehow get the information of both into 1 of the variables, 

436
00:40:50,120 --> 00:40:54,120
and then pull out the information of the individual variables again. 

437
00:40:54,120 --> 00:41:04,330
So does anyone have ideas/the answer? 

438
00:41:04,330 --> 00:41:14,540
[Student answer, unintelligible]

439
00:41:14,540 --> 00:41:22,220
So this should work, and also, XOR is commutative. 

440
00:41:22,220 --> 00:41:27,620
Regardless of which order these 2 numbers happen to be in up here, 

441
00:41:27,620 --> 00:41:30,100
this result is going to be the same. 

442
00:41:30,100 --> 00:41:35,800
So a ^ b is b ^ a.

443
00:41:35,800 --> 00:41:51,860
You might also see this written as a ^ = b, b ^ = a, a ^ = b again. 

444
00:41:51,860 --> 00:42:00,200
So this is right, and to see why this works, think of the bits. 

445
00:42:00,200 --> 00:42:10,400
Using a smallish number, let's say 11001, and 01100. 

446
00:42:10,400 --> 00:42:12,790
So this is 'a'; this is b.

447
00:42:12,790 --> 00:42:15,540
So a ^ = b. 

448
00:42:15,540 --> 00:42:22,380
We're going to be setting 'a' = to the XOR of these 2 things. 

449
00:42:22,380 --> 00:42:32,920
So 1 ^ 0 is 1; 1 ^ 1 is 0; 0 ^ 1 is 1, and 0 ^ 0 is 0; 1 ^ 0 is 1. 

450
00:42:32,920 --> 00:42:37,380
So 'a,' if you look at the decimal number, it's going to be--

451
00:42:37,380 --> 00:42:41,160
you're not going to see much of a relationship between the original 'a' and the new 'a,' 

452
00:42:41,160 --> 00:42:45,600
but looking at the bits, 'a' is now like a mesh of the information

453
00:42:45,600 --> 00:42:49,970
of both the original 'a' and the original b. 

454
00:42:49,970 --> 00:42:57,930
So if we take b ^ a, we see that we'll end up at the original 'a.' 

455
00:42:57,930 --> 00:43:08,910
And if we take the original 'a' ^ the new 'a,' we see we end up at the original b.

456
00:43:08,910 --> 00:43:18,380
So (a ^ b) ^ b = the original 'a.' 

457
00:43:18,380 --> 00:43:27,910
And (a ^ b) ^ a = the original b.

458
00:43:27,910 --> 00:43:37,010
There is--another way of seeing this is anything XOR itself is always 0. 

459
00:43:37,010 --> 00:43:45,020
So 1101 ^ 1101, all the bits are going to be the same. 

460
00:43:45,020 --> 00:43:47,920
So there's never going to be a case where 1 is a 0 and the other is 1. 

461
00:43:47,920 --> 00:43:51,080
So this is 0000. 

462
00:43:51,080 --> 00:43:57,240
The same with this. (a ^ b) ^ b is like a ^ (b ^ b). 

463
00:43:57,240 --> 00:44:03,680
(b ^ b) is going to be 0; a ^ 0 is just going to be 'a,' since all the bits are 0. 

464
00:44:03,680 --> 00:44:08,050
So the only ones that are going to be where 'a' was originally a 1--had ones. 

465
00:44:08,050 --> 00:44:12,070
And the same idea here; I'm pretty sure it's also commutative. 

466
00:44:12,070 --> 00:44:17,590
Yeah. I did say before that it was commutative.

467
00:44:17,590 --> 00:44:24,680
The ^ 'a,' and it's associative, so now (b ^ a) ^ a.

468
00:44:24,680 --> 00:44:28,970
And we can do b ^ (a ^ a). 

469
00:44:28,970 --> 00:44:31,540
And so again, we get the original b.

470
00:44:31,540 --> 00:44:37,120
So 'a' is now the combination of 'a' and b together. 

471
00:44:37,120 --> 00:44:49,660
Using our new combo 'a' we say b = combo 'a' ^ the original b, we get the original 'a.' 

472
00:44:49,660 --> 00:45:05,170
And now a = combo 'a' ^ the new b, which was the original--or which is now what was 'a' or b.

473
00:45:05,170 --> 00:45:13,620
That's this case down here. This is = b, old b.

474
00:45:13,620 --> 00:45:16,550
So now everything is back in the swapped order. 

475
00:45:16,550 --> 00:45:22,960
If we actually looked at the bits, b = a ^ b, is going to XOR these 2, 

476
00:45:22,960 --> 00:45:33,920
and the answer is going to be this, and then a = a ^ b is XORing these 2 and the answer is this. 

477
00:45:33,920 --> 00:45:41,090
Questions? Okay. So the last one is somewhat significantly more difficult. 

478
00:45:41,090 --> 00:45:43,180
[Student] I think he has a question about it. >>Oh, sorry. 

479
00:45:43,180 --> 00:45:49,380
[Student] What's actually faster? If you use this XOR, or is it if you declare a new variable?

480
00:45:49,380 --> 00:45:55,190
So what is actually faster, declaring a new variable or using XOR to swap?

481
00:45:55,190 --> 00:45:59,600
The answer is, in all likelihood, a temporary variable. 

482
00:45:59,600 --> 00:46:05,780
And that is because once it's compiled down--so at the assembly level, 

483
00:46:05,780 --> 00:46:12,320
there's no such thing as local variables or any temporary variables or any of this stuff. 

484
00:46:12,320 --> 00:46:16,060
They're just like--there's memory, and there are registers. 

485
00:46:16,060 --> 00:46:20,920
Registers are where things are actively happening. 

486
00:46:20,920 --> 00:46:24,750
You don't add 2 things in memory; you add 2 things in registers. 

487
00:46:24,750 --> 00:46:28,160
And you bring things from memory into registers to then add them, 

488
00:46:28,160 --> 00:46:33,180
and then you might put them back into memory, but all the action happens in registers. 

489
00:46:33,180 --> 00:46:38,750
So when you're using the temporary variable approach, usually what happens is 

490
00:46:38,750 --> 00:46:42,810
these 2 numbers are already in registers. 

491
00:46:42,810 --> 00:46:46,570
And then from that point on, after you've swapped them, 

492
00:46:46,570 --> 00:46:51,540
it'll just start using the other register. 

493
00:46:51,540 --> 00:46:56,510
Anywhere you had been using b, it'll just use the register that was already storing 'a.' 

494
00:46:56,510 --> 00:47:02,180
So it doesn't need to do anything to actually do the swap. Yeah?

495
00:47:02,180 --> 00:47:05,690
[Student] But it also takes more memory, right? 

496
00:47:05,690 --> 00:47:10,280
It will only take more memory if it needs to store that temporary variable. 

497
00:47:10,280 --> 00:47:14,830
Like if you later use that temporary variable again somewhere, 

498
00:47:14,830 --> 00:47:18,920
then--or you assign something to that temporary variable. 

499
00:47:18,920 --> 00:47:24,630
So if at any point in time 'a,' b in temp have distinct values or something, 

500
00:47:24,630 --> 00:47:30,680
then it's going to have distinct locations in memory, but it is true that 

501
00:47:30,680 --> 00:47:34,800
there are many local variables which will only exist in registers. 

502
00:47:34,800 --> 00:47:44,370
In which case, it's never put into memory, and so you're never wasting memory. 

503
00:47:44,370 --> 00:47:58,620
Okay. Last question is a bit more. 

504
00:47:58,620 --> 00:48:04,850
So here, in this CS50 appliance, there is a dictionary. 

505
00:48:04,850 --> 00:48:12,390
And the reason for this is because [??B66] is a spell checker where you'll be writing

506
00:48:12,390 --> 00:48:15,780
using hash tables or tries or some data structure. 

507
00:48:15,780 --> 00:48:22,660
You're going to be writing a spell checker, and you're going to be using this dictionary to do that. 

508
00:48:22,660 --> 00:48:28,280
But for this problem, we are just going to look up to see if a single word is in the dictionary. 

509
00:48:28,280 --> 00:48:31,250
So instead of storing the entire dictionary in some data structure

510
00:48:31,250 --> 00:48:35,180
and then looking over an entire document to see if anything's misspelled, 

511
00:48:35,180 --> 00:48:38,490
we just want to find 1 word. So we can just scan over the entire dictionary 

512
00:48:38,490 --> 00:48:44,300
and if we never find the word in the entire dictionary, then it wasn't in there. 

513
00:48:44,300 --> 00:48:52,150
If we scan over the entire dictionary and do see the word, then we're good, we found it. 

514
00:48:52,150 --> 00:48:56,580
It says here that we want to start looking at C's file-handling function,

515
00:48:56,580 --> 00:48:59,930
since we want to read the dictionary, 

516
00:48:59,930 --> 00:49:07,680
but I will give the hint here as to which functions you should think of. 

517
00:49:07,680 --> 00:49:11,510
I'll write them on Spaces. 

518
00:49:11,510 --> 00:49:20,490
So the main ones you'll want to look at are f open and then, inevitably, f closed, 

519
00:49:20,490 --> 00:49:26,540
which will go at the end of your program, and f scan f. 

520
00:49:26,540 --> 00:49:31,060
You could also use f read, but you probably don't want to

521
00:49:31,060 --> 00:49:34,200
because that--you don't end up needing that. 

522
00:49:34,200 --> 00:49:41,880
F scan f is what you're going to be using to scan over the dictionary. 

523
00:49:41,880 --> 00:49:46,370
And so you don't need to code up the solution, just try and like pseudo-code your way 

524
00:49:46,370 --> 00:50:05,200
to a solution, and then we'll discuss it. 

525
00:50:05,200 --> 00:50:14,110
And actually, since I already gave you these, if you go into any terminal or your appliance's shell, 

526
00:50:14,110 --> 00:50:18,250
I would--I usually--if you have not seen yet, I don't know if you did in class, 

527
00:50:18,250 --> 00:50:23,490
but man, so the man pages, are pretty useful for looking at pretty much any function. 

528
00:50:23,490 --> 00:50:27,330
So I can do, like, man f, scan f. 

529
00:50:27,330 --> 00:50:32,300
This is now the info about the scan f family of functions. 

530
00:50:32,300 --> 00:50:37,070
I could also do man f, open, and that'll give me the details of that. 

531
00:50:37,070 --> 00:50:40,750
So if you know what function you are using, or you're reading code 

532
00:50:40,750 --> 00:50:43,000
and you see some function and you're like, "What does this do?" 

533
00:50:43,000 --> 00:50:45,280
Just man that function name. 

534
00:50:45,280 --> 00:50:47,340
There are a couple of weird examples where you might have to say

535
00:50:47,340 --> 00:50:51,620
like. man 2 that function name, or man 3 that function name, 

536
00:50:51,620 --> 00:50:58,230
but you only have to do that if man function name doesn't happen to work the first time. 

537
00:50:58,230 --> 00:51:03,010
[Student] So I'm reading the man page for open, but I'm still confused on how to use it and the program.

538
00:51:03,010 --> 00:51:06,170
Okay. A lot of the man pages are less than helpful. 

539
00:51:06,170 --> 00:51:08,470
They're more helpful if you already know what they do 

540
00:51:08,470 --> 00:51:12,670
and then you just need to remember the order of the arguments or something. 

541
00:51:12,670 --> 00:51:17,640
Or they can give you a general overview, but some of them are very overwhelming. 

542
00:51:17,640 --> 00:51:22,220
Like f scan f, also. It gives you the information for all of these functions, 

543
00:51:22,220 --> 00:51:28,120
and 1 line down here happens to say, "F scan f reads from the string point or stream." 

544
00:51:28,120 --> 00:51:32,360
But f open. So, how would we use f open? 

545
00:51:32,360 --> 00:51:38,470
The idea of a program which needs to do file I/O is that

546
00:51:38,470 --> 00:51:45,070
you first need to open the file you want to do things with, and inevitably, 

547
00:51:45,070 --> 00:51:51,220
read things from that file and do stuff with them. 

548
00:51:51,220 --> 00:51:55,350
F open is what we use to open the file. 

549
00:51:55,350 --> 00:52:04,190
The thing we get back, so what file do we want to open, it gives us the--

550
00:52:04,190 --> 00:52:11,970
in here it says "/user/share/dict/words." 

551
00:52:11,970 --> 00:52:16,740
This is the file that we want to open, and we want to open it--

552
00:52:16,740 --> 00:52:21,440
we have to explicitly specify whether we want to open it to read or if we want to open it to write. 

553
00:52:21,440 --> 00:52:26,490
There's a couple of combinations and stuff, but we want to open this for reading. 

554
00:52:26,490 --> 00:52:29,380
We want to read from the file. 

555
00:52:29,380 --> 00:52:34,290
So what does this return? It returns a file star (*), 

556
00:52:34,290 --> 00:52:37,260
and I'll just show everything in the variable f, so *, 

557
00:52:37,260 --> 00:52:40,840
again, it's a pointer, but we don't want to deal with pointers. 

558
00:52:40,840 --> 00:52:46,470
You can think of f as, f is now the variable you're going to use to represent the file. 

559
00:52:46,470 --> 00:52:49,850
So if you want to read from the file, you read from f.

560
00:52:49,850 --> 00:52:54,820
If you want to close the file, you close f. 

561
00:52:54,820 --> 00:53:00,350
So at the end of the program when we inevitably want to close the file, what should we do?

562
00:53:00,350 --> 00:53:06,750
We want to close f. 

563
00:53:06,750 --> 00:53:12,600
So now the last file function that we're going to want to use is scan f, f scan f. 

564
00:53:12,600 --> 00:53:20,930
And what that does is it scans over the file looking for a pattern to match. 

565
00:53:20,930 --> 00:53:39,100
Looking at the man page here, we see int f scan f, ignore the return value for now. 

566
00:53:39,100 --> 00:53:45,230
The first argument is the file * stream, so the first argument we're going to want to pass is f.

567
00:53:45,230 --> 00:53:47,900
We're scanning over f.

568
00:53:47,900 --> 00:53:53,680
The second argument is a format string. 

569
00:53:53,680 --> 00:53:58,310
I will give you a format string right now. 

570
00:53:58,310 --> 00:54:05,180
I think we happen to say, 127s\n, a lot of that's unnecessary.

571
00:54:05,180 --> 00:54:12,490
The idea of what that format string is, is you can think of scan f as the opposite of print f. 

572
00:54:12,490 --> 00:54:17,160
So print f, print f we also use this type of format parameter,

573
00:54:17,160 --> 00:54:25,000
but in print f what we're doing is--let's look at an equivalent. 

574
00:54:25,000 --> 00:54:32,550
So print f, and there's actually also f print f, where the first argument is going to be f. 

575
00:54:32,550 --> 00:54:40,980
When you print f, we could say something like, "print 127s\n" and then if we pass it some string, 

576
00:54:40,980 --> 00:54:44,050
it's going to print this string and then a new line. 

577
00:54:44,050 --> 00:54:49,690
What 127 means, I'm pretty sure, but I've never restricted myself to it, 

578
00:54:49,690 --> 00:54:52,470
You wouldn't even need to say '127' in the print f, 

579
00:54:52,470 --> 00:54:57,090
but what it means is print the first 127 characters. 

580
00:54:57,090 --> 00:54:59,350
So I'm pretty sure that's the case. You can Google for that. 

581
00:54:59,350 --> 00:55:03,000
But in the next one I'm almost positive it means that. 

582
00:55:03,000 --> 00:55:08,880
So this is print the first 127 characters, followed by a new line. 

583
00:55:08,880 --> 00:55:14,680
F scan f now, instead of looking at a variable and printing it, 

584
00:55:14,680 --> 00:55:22,620
it's going to look at some string, and store the pattern into the variable. 

585
00:55:22,620 --> 00:55:26,360
Let's actually use scan f in a different example. 

586
00:55:26,360 --> 00:55:31,670
So let's say we had some int, x = 4, 

587
00:55:31,670 --> 00:55:41,110
and we wanted to create a string made of--wanted to create the string 

588
00:55:41,110 --> 00:55:44,250
that was like, this will come up much later, 

589
00:55:44,250 --> 00:55:49,020
something that's just like 4.jpg. 

590
00:55:49,020 --> 00:55:51,870
So this might be a program where you will have sum counter, 

591
00:55:51,870 --> 00:55:56,420
sum counter i, and you want to save a bunch of images. 

592
00:55:56,420 --> 00:56:02,430
So you want to save i.jpg, where i is some iteration of your loop. 

593
00:56:02,430 --> 00:56:05,500
So how do we make this string for that JPEG?

594
00:56:05,500 --> 00:56:11,720
If you wanted to print 4.jpg, we could just say print f, % d.jpg, 

595
00:56:11,720 --> 00:56:14,410
and then it would print for that JPEG. 

596
00:56:14,410 --> 00:56:20,050
But if we want to save the string 4.jpg, we use scan f. 

597
00:56:20,050 --> 00:56:30,860
So string s--actually we can't--character, char s, let's go 100.

598
00:56:30,860 --> 00:56:35,400
So I just declared some array of 100 characters,

599
00:56:35,400 --> 00:56:39,830
and that's what we're inevitably going to be storing that JPEG in. 

600
00:56:39,830 --> 00:56:47,920
So we're going to use scan f, and the format, how we would say % d.jpg

601
00:56:47,920 --> 00:56:54,980
in order to print 4.jpg, the format of this is going to be % d.jpg. 

602
00:56:54,980 --> 00:57:04,020
So the format is % d.jpg, what we want to replace % d with is x, 

603
00:57:04,020 --> 00:57:06,590
and now we need to store that string somewhere. 

604
00:57:06,590 --> 00:57:12,500
And where we're going to store this string is in the array s. 

605
00:57:12,500 --> 00:57:21,640
So after this line of code, s, if we print f, % s of the variable s, 

606
00:57:21,640 --> 00:57:26,280
it's going to print 4.jpg. 

607
00:57:26,280 --> 00:57:38,930
So f scan f is the same as scan f, except now it's looking over this file 

608
00:57:38,930 --> 00:57:43,600
for what to store in s.

609
00:57:43,600 --> 00:57:46,160
That's what the last argument is going to be.

610
00:57:46,160 --> 00:57:54,170
We want to store--"Scan f family of functions scans in both according to format as tried below.

611
00:57:54,170 --> 00:58:02,450
If any are stored in the location points you might return--"

612
00:58:02,450 --> 00:58:12,910
No, we might be good. Let me think for a second. 

613
00:58:12,910 --> 00:58:26,350
So scan f does not--what the heck is the function which does that? 

614
00:58:26,350 --> 00:58:31,650
So scan f isn't going to take an integer and do dot jpg. 

615
00:58:31,650 --> 00:58:43,490
It's going to [mumbles]. 

616
00:58:43,490 --> 00:58:49,360
Save int variable in string int C. 

617
00:58:49,360 --> 00:58:55,940
What is this variable, or what is this function called? 

618
00:58:55,940 --> 00:59:04,950
Yes. That's--yes. So what I was defining to you before was s print f, 

619
00:59:04,950 --> 00:59:09,820
which--that makes much more sense, why I said it was much more like print f. 

620
00:59:09,820 --> 00:59:14,700
Scan f is still kind of like print f, but s print f is going to scan it over

621
00:59:14,700 --> 00:59:17,510
and replace the variables and now store it in a string.

622
00:59:17,510 --> 00:59:19,620
Instead of printing it, it stores it in a string. 

623
00:59:19,620 --> 00:59:25,070
So ignore that entirely. You can still think of the format specifier as like that of print f.

624
00:59:25,070 --> 00:59:34,510
So now, if we wanted to do the 4.jpg thing, we would do s print f, x of this. 

625
00:59:34,510 --> 00:59:38,520
So what scan f is doing--what was your question going to be? 

626
00:59:38,520 --> 00:59:40,820
[Student] I'm just confused on what we're trying to do right here

627
00:59:40,820 --> 00:59:43,450
with that JPEG. Can you explain that 1 more time? 

628
00:59:43,450 --> 00:59:52,710
So this was--it's less relevent to f scan f now; hopefully, it will tie back in some kind of way. 

629
00:59:52,710 --> 01:00:02,240
But what I initially was intending to show was--this is actually directly relevant to these [?? F5]

630
01:00:02,240 --> 01:00:08,520
You're going to be using s print f, where, say we have 100 images, 

631
01:00:08,520 --> 01:00:13,630
and you want to read image 1.jpg, 2.jpg, 3.jpg. 

632
01:00:13,630 --> 01:00:21,520
So in order to do that, you need to f open, and then you have to pass in the string that you want to open.

633
01:00:21,520 --> 01:00:30,020
So we would want to open 1.jpg; in order to create the string that is 1.jpg, 

634
01:00:30,020 --> 01:00:37,660
we do s print f of % d.jpg--we didn't do for int i = 0. 

635
01:00:37,660 --> 01:00:46,580
i < 40, i + +. 

636
01:00:46,580 --> 01:00:51,130
So s print f % d.jpg of i. 

637
01:00:51,130 --> 01:00:56,320
So after this line, now the variable or the array s is going to 1.jpg. 

638
01:00:56,320 --> 01:01:10,610
Or, 0.jpg, 1.jpg, 2.jpg. And so we can open, in turn, each image for reading. 

639
01:01:10,610 --> 01:01:19,550
So that is what s print f does. Do you see what s print f is now doing?

640
01:01:19,550 --> 01:01:25,720
[Student] Okay, so it's taking--it creates a string, something.jpg, and then stores it.

641
01:01:25,720 --> 01:01:30,360
Yes. It creates--this is another format string, just like scan f and print f, 

642
01:01:30,360 --> 01:01:37,530
where it inserts all of the variables into the second argument, might be s as opposed to i. 

643
01:01:37,530 --> 01:01:42,280
Perhaps--I mean, that's the case. But whatever the order of arguments is. 

644
01:01:42,280 --> 01:01:45,440
It's going to insert all of the variables into the format string 

645
01:01:45,440 --> 01:01:52,250
and then store into our buffer; we call that a buffer, it's where we're storing the string. 

646
01:01:52,250 --> 01:02:00,750
So we are storing inside of s the correctly-formatted string, % d having been replaced with 4. 

647
01:02:00,750 --> 01:02:08,080
[Student] So if we did this, is the variable f just going to be reassigned? 

648
01:02:08,080 --> 01:02:18,110
Yes. So we should close the original f before doing this. 

649
01:02:18,110 --> 01:02:22,810
But--and then also, if there were not an f open up here, then we would need to say--

650
01:02:22,810 --> 01:02:29,280
Yeah. But it would open a hundred different files.  

651
01:02:29,280 --> 01:02:37,360
[Student] But we wouldn't be able to access or--okay. 

652
01:02:37,360 --> 01:02:44,230
Okay. So scan f, f scan f, is kind of the same idea, 

653
01:02:44,230 --> 01:02:53,610
but instead of, instead of storing it into a string, it's more like you are now 

654
01:02:53,610 --> 01:03:02,420
going over a sting and pattern matching against that string and storing the results into variables. 

655
01:03:02,420 --> 01:03:11,290
You can use scan f to parse over something like 4.jpg, and store the integer 4 into sum int x. 

656
01:03:11,290 --> 01:03:13,430
That's what we can use scan f for. 

657
01:03:13,430 --> 01:03:16,300
F scan f is going to do that at the command line. 

658
01:03:16,300 --> 01:03:19,200
I'm actually pretty sure this is what the CS50 library does. 

659
01:03:19,200 --> 01:03:29,050
So when you say, "get int," it's scan f-ing over--scan f is the way you get user input. 

660
01:03:29,050 --> 01:03:34,670
F scan f is going to do the same thing but using a file to scan over. 

661
01:03:34,670 --> 01:03:41,090
So here, we are scanning over this file. 

662
01:03:41,090 --> 01:03:45,460
The pattern we are trying to match is some string that is 127 characters long 

663
01:03:45,460 --> 01:03:48,100
followed by a new line 

664
01:03:48,100 --> 01:03:54,770
So I'm pretty sure we could even just say "match s," since in the dictionary

665
01:03:54,770 --> 01:03:57,770
we happen to have, we're guaranteed no word is that long, 

666
01:03:57,770 --> 01:04:03,310
and also f scan f, I think, will stop at the new line no matter what. 

667
01:04:03,310 --> 01:04:06,970
But we'll include the new line in the match, and--

668
01:04:06,970 --> 01:04:13,960
[Student] If we didn't include the new line, wouldn't it find parts of a word? 

669
01:04:13,960 --> 01:04:22,900
It--each--looking at the dictionary--

670
01:04:22,900 --> 01:04:26,200
So in the dictionary, these are all of our words. 

671
01:04:26,200 --> 01:04:30,500
Each one is on a new line. 

672
01:04:30,500 --> 01:04:32,510
The scan f is going to pick up this word. 

673
01:04:32,510 --> 01:04:38,750
If we don't include the new line, then it's possible that the next scan f will just read the new line. 

674
01:04:38,750 --> 01:04:44,180
But including new line then will just ignore the new line. 

675
01:04:44,180 --> 01:04:49,440
But we'll never get part of a word, since we are always reading up to a new line, no matter what. 

676
01:04:49,440 --> 01:04:54,530
[Student] But what if you search for the word "cissa," like c-i-s-s-a.   

677
01:04:54,530 --> 01:04:57,380
Will it find that, and say it's a match? 

678
01:04:57,380 --> 01:05:05,110
So here we--it will read in--this is actually a good point. 

679
01:05:05,110 --> 01:05:10,660
We're never using the current--the word we're looking for is the first command line argument.

680
01:05:10,660 --> 01:05:16,460
So string, word = argv 1. 

681
01:05:16,460 --> 01:05:20,020
So the string we're looking for is argv 1. 

682
01:05:20,020 --> 01:05:23,290
We are not looking for a word at all in our scan f. 

683
01:05:23,290 --> 01:05:28,030
What we were doing with scan f is getting each word in the dictionary, 

684
01:05:28,030 --> 01:05:34,320
and then once we have that word we're going to use strcmp to compare them. 

685
01:05:34,320 --> 01:05:39,210
We're going to compare our word and what we just read in. 

686
01:05:39,210 --> 01:05:45,110
So inevitably, we're going to end up doing a bunch of scan fs 

687
01:05:45,110 --> 01:05:52,130
until it just so happens that scan f will return--

688
01:05:52,130 --> 01:05:54,800
it will return one, as long as it has matched a new word, 

689
01:05:54,800 --> 01:06:01,360
and it will return something else as soon as it has failed to match the word. 

690
01:06:01,360 --> 01:06:08,440
We are reading over the entire dictionary, storing line by line each word into the variable s. 

691
01:06:08,440 --> 01:06:17,240
Then we are comparing word with s, and if the comparison = = 0, 

692
01:06:17,240 --> 01:06:21,650
strcmp happens to bring 0 if a match was made. 

693
01:06:21,650 --> 01:06:31,510
So if it was 0, then we can print f, matched, 

694
01:06:31,510 --> 01:06:35,370
or word is in dictionary, or whatever you want to print f. 

695
01:06:35,370 --> 01:06:41,450
And then--we don't want to f close over and over again. 

696
01:06:41,450 --> 01:06:50,410
This is the kind of thing we want to do, and we are not just looking for word in the dictionary. 

697
01:06:50,410 --> 01:06:56,660
So we could do that, if we wanted to look for their pattern, c-i-s-s-a, like you said before,  

698
01:06:56,660 --> 01:07:00,260
if we wanted to look for that pattern, then it would fail in the case 

699
01:07:00,260 --> 01:07:08,010
because that's not actually a word, but one of the words in the dictionary happens to have that in it. 

700
01:07:08,010 --> 01:07:13,560
So it would match this word, but this subset of the word is not a word itself. 

701
01:07:13,560 --> 01:07:17,250
But that's not how we're using it; we're reading in each word 

702
01:07:17,250 --> 01:07:19,740
and then comparing the word we have with that word. 

703
01:07:19,740 --> 01:07:25,780
So we're always comparing full words. 

704
01:07:25,780 --> 01:07:29,620
I can send out the finalized solutions later. 

705
01:07:29,620 --> 01:07:32,050
This is kind of nearly the right answer, I think. 

706
01:07:32,050 --> 01:07:34,720
[Student comment, unintelligible]

707
01:07:34,720 --> 01:07:40,870
Oh, did I get rid of that before? Char s, I guess we said 127--I forget what the largest is. 

708
01:07:40,870 --> 01:07:44,100
We'll just do 128; so now s is long enough.

709
01:07:44,100 --> 01:07:46,570
We don't need to print anything. 

710
01:07:46,570 --> 01:07:56,440
We're also going to want to have to close our file, and that should be about the right answer. 

711
01:07:56,440 --> 01:07:59,440
CS50.TV