1
00:00:00,000 --> 00:00:02,862


2
00:00:02,862 --> 00:00:03,910
CARTER ZENKE: OK.

3
00:00:03,910 --> 00:00:04,940
Hello, everyone.

4
00:00:04,940 --> 00:00:07,982
It is so good to see you
here for our Week 5 Section.

5
00:00:07,982 --> 00:00:09,940
If you don't remember,
my name is Carter Zenke.

6
00:00:09,940 --> 00:00:11,740
I'm the course's
preceptor here on campus,

7
00:00:11,740 --> 00:00:13,902
and it's so delightful
to see you all over Zoom.

8
00:00:13,902 --> 00:00:15,860
I thought what we'd do
just to kick things off,

9
00:00:15,860 --> 00:00:18,880
assuming you can hear me this
time, is to on the count of three,

10
00:00:18,880 --> 00:00:22,150
we'll all unmute at the same
time and say a big hello.

11
00:00:22,150 --> 00:00:25,440
So on the count of three, we'll
do one, two, three, hello.

12
00:00:25,440 --> 00:00:26,830
AUDIENCE: Hello.

13
00:00:26,830 --> 00:00:28,195
AUDIENCE: Hello.

14
00:00:28,195 --> 00:00:29,455
AUDIENCE: Hello.

15
00:00:29,455 --> 00:00:31,640
AUDIENCE: Hello.

16
00:00:31,640 --> 00:00:32,800
CARTER ZENKE: All right.

17
00:00:32,800 --> 00:00:33,220
AUDIENCE: Hello.

18
00:00:33,220 --> 00:00:34,585
CARTER ZENKE: It is so wonderful--

19
00:00:34,585 --> 00:00:35,252
AUDIENCE: Hello.

20
00:00:35,252 --> 00:00:36,481
AUDIENCE: Hello.

21
00:00:36,481 --> 00:00:37,690
AUDIENCE: Hello.

22
00:00:37,690 --> 00:00:38,710
CARTER ZENKE: Hello.

23
00:00:38,710 --> 00:00:40,533
It is so wonderful to see you all here.

24
00:00:40,533 --> 00:00:41,200
AUDIENCE: Hello.

25
00:00:41,200 --> 00:00:42,968
AUDIENCE: Hello.

26
00:00:42,968 --> 00:00:46,280
CARTER ZENKE: And we'll
go ahead and mute.

27
00:00:46,280 --> 00:00:48,760
AUDIENCE: Hello.

28
00:00:48,760 --> 00:00:52,000
CARTER ZENKE: OK.

29
00:00:52,000 --> 00:00:52,500
All right.

30
00:00:52,500 --> 00:00:55,208
So if you'd like to participate,
feel free to do so via the chat.

31
00:00:55,208 --> 00:00:58,405
You're welcome to type in the chat
to ask questions via the chat.

32
00:00:58,405 --> 00:01:00,780
Again, the goal for these
sections is really to make sure

33
00:01:00,780 --> 00:01:03,090
that you're able to ask the
questions you want to ask

34
00:01:03,090 --> 00:01:06,250
and for us to dive into
this learning together.

35
00:01:06,250 --> 00:01:08,460
So today, focused on data structures.

36
00:01:08,460 --> 00:01:12,540
And this week, you build your own data
structure with a speller problem set.

37
00:01:12,540 --> 00:01:16,170
And I want to focus on this big
idea behind data structures.

38
00:01:16,170 --> 00:01:17,087
Why would we use them?

39
00:01:17,087 --> 00:01:18,962
What are the trade-offs
between them, and how

40
00:01:18,962 --> 00:01:20,800
to actually use them in the real world.

41
00:01:20,800 --> 00:01:23,680
So to that end, we'll have three
questions during our section today.

42
00:01:23,680 --> 00:01:27,360
The first will be this idea of the
trade-off between data structures.

43
00:01:27,360 --> 00:01:30,180
When should we use one data
structure versus another?

44
00:01:30,180 --> 00:01:32,790
The second question will be
about these linked lists.

45
00:01:32,790 --> 00:01:36,660
We saw these in lecture, but how
can we build our own linked lists?

46
00:01:36,660 --> 00:01:37,920
What can we do with them?

47
00:01:37,920 --> 00:01:42,480
And how can we better figure out how
to actually use them in our code?

48
00:01:42,480 --> 00:01:44,670
And then, finally, our
third question will

49
00:01:44,670 --> 00:01:48,230
be this idea of how do we build our
own hash table where hash tables rely

50
00:01:48,230 --> 00:01:50,730
on a lot of fundamentals that
linked lists can give us here,

51
00:01:50,730 --> 00:01:52,688
but can run maybe a little
bit faster, as we'll

52
00:01:52,688 --> 00:01:55,900
see in the problem set for this week.

53
00:01:55,900 --> 00:01:59,220
So to really dive into things here and
to make things a little more concrete,

54
00:01:59,220 --> 00:02:01,500
we saw all kinds of data structures.

55
00:02:01,500 --> 00:02:04,780
But often, when you're actually
programming in the real world,

56
00:02:04,780 --> 00:02:08,220
you might choose one or another,
depending on the scenario, what

57
00:02:08,220 --> 00:02:09,730
you're trying to build something.

58
00:02:09,730 --> 00:02:12,480
And in this case, what I'd like
to do is show you all one scenario

59
00:02:12,480 --> 00:02:13,690
we can think about here.

60
00:02:13,690 --> 00:02:17,220
So imagine that you are working
for some software company

61
00:02:17,220 --> 00:02:21,190
and they are making this digital
assistant that runs on a mobile device.

62
00:02:21,190 --> 00:02:23,610
Maybe it's something like,
hey, Siri, or hey, Google,

63
00:02:23,610 --> 00:02:25,680
or something that
responds to people's voice

64
00:02:25,680 --> 00:02:27,630
over time on their personal device.

65
00:02:27,630 --> 00:02:31,230
You can do something for that person
as they activate this assistant.

66
00:02:31,230 --> 00:02:35,430
Now, you've seen, or you've heard, that
people who are using this assistant

67
00:02:35,430 --> 00:02:38,640
aren't often able to wake
it up at the right moment.

68
00:02:38,640 --> 00:02:42,053
Maybe it sometimes doesn't work
if I were to say maybe, hey, Siri,

69
00:02:42,053 --> 00:02:42,720
or, hey, Google.

70
00:02:42,720 --> 00:02:44,830
It doesn't quite work as I want it to.

71
00:02:44,830 --> 00:02:47,953
And so this is sort of a problem
that we're running into here.

72
00:02:47,953 --> 00:02:50,370
But what we ultimately want
to do is figure out, OK, well,

73
00:02:50,370 --> 00:02:51,690
how could we fix this?

74
00:02:51,690 --> 00:02:54,750
And maybe the first solution
is to try to improve this

75
00:02:54,750 --> 00:02:58,200
by adding more words that could
wake up this voice assistant.

76
00:02:58,200 --> 00:03:02,280
We want to store more data, more
words, that could activate this voice

77
00:03:02,280 --> 00:03:03,330
assistant for ourselves.

78
00:03:03,330 --> 00:03:06,372
And the question that you might run
into as a software engineer is, well,

79
00:03:06,372 --> 00:03:08,820
how could we best store
all of these phrases

80
00:03:08,820 --> 00:03:11,282
we want our voice
assistant to wake up to?

81
00:03:11,282 --> 00:03:12,990
What kind of data
structure would you use

82
00:03:12,990 --> 00:03:17,190
to actually store these phrases that
might activate our voice assistant?

83
00:03:17,190 --> 00:03:21,090
And we saw a few tools in our toolkit,
a few structures in our toolkit.

84
00:03:21,090 --> 00:03:25,590
And one of these was the linked list,
one of these was the try, one of these

85
00:03:25,590 --> 00:03:27,370
was the tree, and so on.

86
00:03:27,370 --> 00:03:29,260
So there's lots of ways we can do this.

87
00:03:29,260 --> 00:03:32,550
But let's think first about the
different priorities we might

88
00:03:32,550 --> 00:03:34,980
have for this data structure, right?

89
00:03:34,980 --> 00:03:36,822
If we want to actually make this.

90
00:03:36,822 --> 00:03:38,530
Well, what do we want
to prioritize here?

91
00:03:38,530 --> 00:03:39,698
So let's think about this.

92
00:03:39,698 --> 00:03:41,490
If we to add lots of
words, we could really

93
00:03:41,490 --> 00:03:43,650
think of trying a few
different approaches here.

94
00:03:43,650 --> 00:03:46,023
We could think of, well, a
few different aspects, like,

95
00:03:46,023 --> 00:03:48,690
if our data structure is going
to do a certain number of things,

96
00:03:48,690 --> 00:03:51,300
we might care about different aspects
of how data structures are formed.

97
00:03:51,300 --> 00:03:53,400
Like, do we care about
deleting things quickly?

98
00:03:53,400 --> 00:03:55,170
Do we care about
inserting things quickly?

99
00:03:55,170 --> 00:03:57,128
Do we care about searching
for phrases quickly?

100
00:03:57,128 --> 00:03:59,520
Like, for example, if I were
to have this voice assistant

101
00:03:59,520 --> 00:04:03,990
and I'm trying to do something with it,
do I care about deleting words quickly,

102
00:04:03,990 --> 00:04:07,230
or do I care about
inserting new words quickly,

103
00:04:07,230 --> 00:04:10,830
or do I care about maybe if I were
to speak something, if I can search

104
00:04:10,830 --> 00:04:12,270
and find that word really quickly?

105
00:04:12,270 --> 00:04:14,220
So often, when we're
choosing data structure,

106
00:04:14,220 --> 00:04:15,928
we care about these
kind of three things.

107
00:04:15,928 --> 00:04:18,820
How fast does it perform
in these three areas?

108
00:04:18,820 --> 00:04:21,328
And so here, we run into
our first question, right?

109
00:04:21,328 --> 00:04:23,370
If we want to figure out
which way to prioritize,

110
00:04:23,370 --> 00:04:26,320
well, we could maybe say that
we care about search the most.

111
00:04:26,320 --> 00:04:28,920
We want to be able to search and
find a phrase really quickly.

112
00:04:28,920 --> 00:04:32,190
Maybe if the user speaks some word,
we want to be able to figure out,

113
00:04:32,190 --> 00:04:35,640
is that word in the wake word list.

114
00:04:35,640 --> 00:04:37,590
Or maybe we really care
about the user being

115
00:04:37,590 --> 00:04:41,010
able to quickly add a new wake word,
like, I could say not just, hey,

116
00:04:41,010 --> 00:04:42,450
but hello, or hi.

117
00:04:42,450 --> 00:04:45,450
And I want to be able to program this
data structure to sort of respond

118
00:04:45,450 --> 00:04:47,200
to that quickly.

119
00:04:47,200 --> 00:04:51,180
Now, what priorities might you have
if you were faced with a scenario?

120
00:04:51,180 --> 00:04:53,130
If you want to chime
in in the chat, which

121
00:04:53,130 --> 00:04:56,490
would you prioritize here between
a fast insertion, a fast search,

122
00:04:56,490 --> 00:04:59,380
a fast deletion, if
you're trying to do this?

123
00:04:59,380 --> 00:05:02,390
Feel free to chime in in the chat.

124
00:05:02,390 --> 00:05:02,890
OK.

125
00:05:02,890 --> 00:05:09,880
I'm seeing one person for search, OK,
lots of searching, some insertion, too.

126
00:05:09,880 --> 00:05:10,380
Yeah.

127
00:05:10,380 --> 00:05:13,380
And so these are things we could
probably reasonably disagree on, right?

128
00:05:13,380 --> 00:05:16,078
If I want to prioritize
searching for some word,

129
00:05:16,078 --> 00:05:19,120
being able to speak it and have my
assistant recognize it really quickly,

130
00:05:19,120 --> 00:05:22,037
well, that seems reasonable where
we can make search our top priority.

131
00:05:22,037 --> 00:05:24,400
Or maybe it's also
reasonable to say, well,

132
00:05:24,400 --> 00:05:26,723
I want people to make their
own custom wake words.

133
00:05:26,723 --> 00:05:29,140
And so inserting something is
really important to me, too.

134
00:05:29,140 --> 00:05:30,820
So it depends on what
you're trying to build here.

135
00:05:30,820 --> 00:05:33,600
And so let's take a look at some
of these tools in our tool kit.

136
00:05:33,600 --> 00:05:37,020
So the first one we said before
was this idea of the linked list.

137
00:05:37,020 --> 00:05:39,437
And here's the linked list in full here.

138
00:05:39,437 --> 00:05:41,520
And the idea behind the
linked list was that we're

139
00:05:41,520 --> 00:05:45,352
able to have these nodes where
each node has a certain phrase

140
00:05:45,352 --> 00:05:48,060
inside of it-- at least in this
case, trying to store phrases now

141
00:05:48,060 --> 00:05:49,020
in our linked list.

142
00:05:49,020 --> 00:05:51,760
And we have these nodes kind
of follow each other in memory.

143
00:05:51,760 --> 00:05:53,590
They're no longer back-to-back,
like, in an array,

144
00:05:53,590 --> 00:05:54,780
but they sort of build
on top of each other

145
00:05:54,780 --> 00:05:57,360
and sort of follow each other
throughout memory, being woven

146
00:05:57,360 --> 00:05:59,940
in to our bits and bytes, thereof.

147
00:05:59,940 --> 00:06:05,850
So if we were to use a linked list
here, which of these three priorities

148
00:06:05,850 --> 00:06:08,130
would be kind of
well-served by this linked

149
00:06:08,130 --> 00:06:12,983
list, like, what would a link list be
good at in this case, do you think?

150
00:06:12,983 --> 00:06:14,275
Feel free to chime in the chat.

151
00:06:14,275 --> 00:06:17,910


152
00:06:17,910 --> 00:06:18,410
OK.

153
00:06:18,410 --> 00:06:19,850
So I'm seeing some insertion.

154
00:06:19,850 --> 00:06:21,330
And, yeah, it's correct.

155
00:06:21,330 --> 00:06:23,720
So linked would be really
fast to insert something into.

156
00:06:23,720 --> 00:06:27,500
In fact, we saw in lecture that all
it takes to insert into a linked list

157
00:06:27,500 --> 00:06:30,588
is to simply make a new node
and put it at the very front.

158
00:06:30,588 --> 00:06:32,630
And that's, basically, a
constant time operation.

159
00:06:32,630 --> 00:06:34,100
It takes a fixed number of steps.

160
00:06:34,100 --> 00:06:36,290
So inserting into
linked list is very fast

161
00:06:36,290 --> 00:06:39,447
and could be a good priority
for our insertion here.

162
00:06:39,447 --> 00:06:41,030
But, again, what's the trade-off here?

163
00:06:41,030 --> 00:06:42,738
If we were to go with
a linked list, what

164
00:06:42,738 --> 00:06:46,130
would be slower because we
used the linked list now?

165
00:06:46,130 --> 00:06:49,130
Maybe insertion is fast,
but what's not fast?

166
00:06:49,130 --> 00:06:51,507
Well, deletion is not very fast, right?

167
00:06:51,507 --> 00:06:53,840
But at least in this scenario,
let's say we don't really

168
00:06:53,840 --> 00:06:57,290
want to care about deleting wake words,
what's the other ones we can use,

169
00:06:57,290 --> 00:07:00,435
a little bit slow for us?

170
00:07:00,435 --> 00:07:00,935
Yeah.

171
00:07:00,935 --> 00:07:02,102
So I'm seeing search, right?

172
00:07:02,102 --> 00:07:04,100
Search would be pretty
slow because, again, we

173
00:07:04,100 --> 00:07:06,410
have to sort of start
at the very first node.

174
00:07:06,410 --> 00:07:09,442
Like, hey, and ask, OK, is
this what the user said?

175
00:07:09,442 --> 00:07:11,150
No, it's not, let's
go onto the next one.

176
00:07:11,150 --> 00:07:12,380
Oh, is this what the user said?

177
00:07:12,380 --> 00:07:14,088
Oh, it's not, let's
go onto the next one.

178
00:07:14,088 --> 00:07:16,778
So we can only do linear search,
basically, with a linked list.

179
00:07:16,778 --> 00:07:19,070
And this is a bit of
simplification for how we actually

180
00:07:19,070 --> 00:07:20,330
do kind of voice recognition.

181
00:07:20,330 --> 00:07:22,205
But you get the idea,
nonetheless, that we're

182
00:07:22,205 --> 00:07:25,040
trying to find and go through phrases
that the user could have said.

183
00:07:25,040 --> 00:07:27,840
In a linked list, we have to go
through every single node that we have,

184
00:07:27,840 --> 00:07:29,632
which, as I see in the
chat is indeed going

185
00:07:29,632 --> 00:07:32,220
to be in big O of n
going through n nodes

186
00:07:32,220 --> 00:07:35,440
to find a certain element inside of it.

187
00:07:35,440 --> 00:07:36,960
OK so we have linked lists here.

188
00:07:36,960 --> 00:07:39,040
But let's actually think
of a new data structure.

189
00:07:39,040 --> 00:07:41,802
What if we use something
like the hash table?

190
00:07:41,802 --> 00:07:43,510
And the hash table
looks a bit like this.

191
00:07:43,510 --> 00:07:46,020
I'm going to go to the
full screen mode here where

192
00:07:46,020 --> 00:07:47,760
it's a bit like a linked list.

193
00:07:47,760 --> 00:07:49,620
But what we've done
is sort of categorize

194
00:07:49,620 --> 00:07:52,242
our linked list based on
the very first character.

195
00:07:52,242 --> 00:07:54,450
And not all hash tables use
the very first character,

196
00:07:54,450 --> 00:07:58,260
but this one, for example, does,
and it is helpful in some ways.

197
00:07:58,260 --> 00:08:02,540
Like what would now be faster
because we're using the hash table?

198
00:08:02,540 --> 00:08:05,368
What would be faster now?

199
00:08:05,368 --> 00:08:06,660
Feel free to write in the chat.

200
00:08:06,660 --> 00:08:08,940


201
00:08:08,940 --> 00:08:09,440
OK.

202
00:08:09,440 --> 00:08:11,810
So I'm seeing some search,
like, search is faster.

203
00:08:11,810 --> 00:08:15,770
And why is search faster, if you
want to write in the chat quickly?

204
00:08:15,770 --> 00:08:18,425
Why would search be faster
with this hash table now?

205
00:08:18,425 --> 00:08:24,120


206
00:08:24,120 --> 00:08:26,790
So I'm seeing some of these
ideas of indexing, right?

207
00:08:26,790 --> 00:08:27,660
And so-- right.

208
00:08:27,660 --> 00:08:31,770
If we look at the side of our hash table
in this array on this left-hand side

209
00:08:31,770 --> 00:08:36,659
here, we've kind of split our linked
list into several indexes, like, H, I,

210
00:08:36,659 --> 00:08:38,010
J, K and L.

211
00:08:38,010 --> 00:08:41,400
And in the computer's memory, these
are represented with, actually, like,

212
00:08:41,400 --> 00:08:44,100
H, I, J, K and L. They're
assigned some number.

213
00:08:44,100 --> 00:08:46,770
And we have a hash function
that finds the number for each

214
00:08:46,770 --> 00:08:48,400
of these values on the left-hand side.

215
00:08:48,400 --> 00:08:51,430
But the idea here is that we're
able to use indexing to figure out,

216
00:08:51,430 --> 00:08:54,180
OK, if I want to find the
words beginning with H,

217
00:08:54,180 --> 00:08:57,878
I know right where to go, and
I can only search those words.

218
00:08:57,878 --> 00:09:01,170
Or if I want to find the words beginning
with L, I can search in that location,

219
00:09:01,170 --> 00:09:03,630
just find the words beginning
with L. So indexing here

220
00:09:03,630 --> 00:09:05,422
is helping us breaking
down our linked list

221
00:09:05,422 --> 00:09:08,170
into lots of smaller linked lists, too.

222
00:09:08,170 --> 00:09:12,070
So it seems if we were to
insert into this hash table,

223
00:09:12,070 --> 00:09:13,380
it's also pretty quick, right?

224
00:09:13,380 --> 00:09:16,770
It's both faster searching
and faster insertion.

225
00:09:16,770 --> 00:09:20,590
So why would we just not always use
a hash table over a linked list?

226
00:09:20,590 --> 00:09:22,590
Like what's the problem
with the hash table now?

227
00:09:22,590 --> 00:09:23,730
What's the trade-off here?

228
00:09:23,730 --> 00:09:26,060


229
00:09:26,060 --> 00:09:26,560
Yeah.

230
00:09:26,560 --> 00:09:28,320
So I'm seeing this idea of memory.

231
00:09:28,320 --> 00:09:31,570
Like this hash table is going to take
up much more memory than our linked list

232
00:09:31,570 --> 00:09:35,100
assuming, maybe, we have some buckets,
some indexes that aren't really filled

233
00:09:35,100 --> 00:09:36,770
with these phrases.

234
00:09:36,770 --> 00:09:38,520
But it's also going
to take us more memory

235
00:09:38,520 --> 00:09:40,582
to have all the different
indexes possible here

236
00:09:40,582 --> 00:09:43,540
that we might want to actually have
in order to make our lookup really,

237
00:09:43,540 --> 00:09:44,370
really fast.

238
00:09:44,370 --> 00:09:47,640
So it's a trade-off
here is memory, as well.

239
00:09:47,640 --> 00:09:49,850
And now, it's got one other
structure to consider,

240
00:09:49,850 --> 00:09:52,010
which would be this idea of the trie.

241
00:09:52,010 --> 00:09:54,240
And here's some brief visual of a trie.

242
00:09:54,240 --> 00:09:57,980
Would you like to point out some
words you see inside of this trie,

243
00:09:57,980 --> 00:09:59,253
reading from left to right?

244
00:09:59,253 --> 00:10:00,170
What words do you see?

245
00:10:00,170 --> 00:10:02,960
You see hello, you see hey.

246
00:10:02,960 --> 00:10:04,400
Any other ones?

247
00:10:04,400 --> 00:10:07,910
We see help, certainly.

248
00:10:07,910 --> 00:10:09,980
Yeah, there's a few
other ones in here, too.

249
00:10:09,980 --> 00:10:15,260
So this trie worked by trying of storing
a node for every letter in our words.

250
00:10:15,260 --> 00:10:17,390
And this is very fast for lookup, right?

251
00:10:17,390 --> 00:10:19,910
If you want to figure out
if hello is in this trie,

252
00:10:19,910 --> 00:10:24,980
well, all we have to do is see, OK,
I see H, I see E, I see L, I see L,

253
00:10:24,980 --> 00:10:25,640
and I see O.

254
00:10:25,640 --> 00:10:27,327
Well, hello is in this structure, right?

255
00:10:27,327 --> 00:10:28,410
We know that it's in Here.

256
00:10:28,410 --> 00:10:30,320
So look up is very fast.

257
00:10:30,320 --> 00:10:32,840
And similarly, insertion
is also really quickly.

258
00:10:32,840 --> 00:10:36,860
If I wanted to add Hi to this trie,
well, I could simply go to the H node

259
00:10:36,860 --> 00:10:39,410
and say, OK, well, that
covers the first letter.

260
00:10:39,410 --> 00:10:42,380
Now, I'll add I, maybe somewhere
up above, and that's it.

261
00:10:42,380 --> 00:10:43,280
I've inserted Hi.

262
00:10:43,280 --> 00:10:46,658
So insertion and search are
also really fast in our trie.

263
00:10:46,658 --> 00:10:48,200
But again, what's the trade-off here?

264
00:10:48,200 --> 00:10:49,700
Why wouldn't we always use the trie?

265
00:10:49,700 --> 00:10:53,850


266
00:10:53,850 --> 00:10:54,350
Yeah.

267
00:10:54,350 --> 00:10:55,910
So I'm seeing this idea of memory again.

268
00:10:55,910 --> 00:10:57,980
Like, this is going to
take up a lot more memory

269
00:10:57,980 --> 00:11:00,770
to store all the combinations
of all the letters

270
00:11:00,770 --> 00:11:03,860
that could make up all the words
we might want to have as our wake

271
00:11:03,860 --> 00:11:05,810
word for our assistant, right?

272
00:11:05,810 --> 00:11:08,062
And even beyond that, beyond
the computer's memory,

273
00:11:08,062 --> 00:11:09,770
if you think about
the time it would take

274
00:11:09,770 --> 00:11:12,620
to make a trie, a more
complex data structure,

275
00:11:12,620 --> 00:11:15,170
well, that's real human time
that you have to work on this,

276
00:11:15,170 --> 00:11:18,120
and might actually delay your
shipment of some new feature, right?

277
00:11:18,120 --> 00:11:18,620
And

278
00:11:18,620 --> 00:11:21,770
So it's going to matter, not just how
much time it takes for the computer,

279
00:11:21,770 --> 00:11:24,290
but also how much time it takes
you to make the structure,

280
00:11:24,290 --> 00:11:28,670
if you wanted to actually make
some feature for your own program,

281
00:11:28,670 --> 00:11:31,020
or your own company in the end.

282
00:11:31,020 --> 00:11:33,440
So questions on these structures so far?

283
00:11:33,440 --> 00:11:35,720
Happy to take some in the
chat, if you have any.

284
00:11:35,720 --> 00:11:42,530


285
00:11:42,530 --> 00:11:43,835
What questions do you have?

286
00:11:43,835 --> 00:11:48,370


287
00:11:48,370 --> 00:11:50,300
A question about, does
AI use data structures?

288
00:11:50,300 --> 00:11:50,800
Yes.

289
00:11:50,800 --> 00:11:53,980
So a lot of how AI
can work so quickly is

290
00:11:53,980 --> 00:11:57,410
because we choose to organize data
in these really efficient ways.

291
00:11:57,410 --> 00:12:00,310
And so by getting creative with
the ways we actually make our data

292
00:12:00,310 --> 00:12:03,190
structures beyond even just
linked lists and tries and trees,

293
00:12:03,190 --> 00:12:06,247
we can actually make
AI a lot more quickly--

294
00:12:06,247 --> 00:12:08,080
run a lot more quickly
because it has access

295
00:12:08,080 --> 00:12:10,450
to this data structure
laid out in such a way

296
00:12:10,450 --> 00:12:12,380
that it can run quickly in the end.

297
00:12:12,380 --> 00:12:16,175
So, yes, it makes use of a lot of
different data structures, too.

298
00:12:16,175 --> 00:12:18,550
And this question about memory
being so cheap these days.

299
00:12:18,550 --> 00:12:19,010
It is true.

300
00:12:19,010 --> 00:12:20,302
So memory is very cheap, right?

301
00:12:20,302 --> 00:12:23,200
If we wanted to make
the trie here, maybe we

302
00:12:23,200 --> 00:12:27,280
don't care how much memory it takes
up because memory is so cheap.

303
00:12:27,280 --> 00:12:30,130
But it also takes us time
to allocate that memory,

304
00:12:30,130 --> 00:12:31,840
to put things in that memory.

305
00:12:31,840 --> 00:12:34,030
And so, perhaps, it's
not always the case

306
00:12:34,030 --> 00:12:37,240
that because we have so much memory, we
should just be kind of loose with it.

307
00:12:37,240 --> 00:12:38,440
We really need to take
care that we're not

308
00:12:38,440 --> 00:12:41,050
using too much memory when we
could, theoretically, maybe

309
00:12:41,050 --> 00:12:43,760
use less, if we wanted to.

310
00:12:43,760 --> 00:12:44,800
Nice.

311
00:12:44,800 --> 00:12:45,680
OK.

312
00:12:45,680 --> 00:12:48,410
So this is some of the ideas
of trade-offs here, choosing

313
00:12:48,410 --> 00:12:51,480
one or the other, often going
to cost us one thing or another.

314
00:12:51,480 --> 00:12:53,550
But, really, behind
all of these structures

315
00:12:53,550 --> 00:12:58,100
is the idea of a node, where, if we
look back at our structures here,

316
00:12:58,100 --> 00:13:01,670
we had maybe a trie, we had
a link, we had a hash table,

317
00:13:01,670 --> 00:13:02,810
and we had a linked list.

318
00:13:02,810 --> 00:13:07,520
And it was composing all of these
was this idea of some kind of node,

319
00:13:07,520 --> 00:13:09,110
some place we store data.

320
00:13:09,110 --> 00:13:12,690
And along with that, a pointer
to some other pieces of data.

321
00:13:12,690 --> 00:13:16,100
So we saw last week, in
Week 4, that pointers are

322
00:13:16,100 --> 00:13:18,380
helpful for pointing certain locations.

323
00:13:18,380 --> 00:13:20,900
Well, now, we're seeing the
application of those pointers.

324
00:13:20,900 --> 00:13:23,025
Why would we even want
pointers in the first place?

325
00:13:23,025 --> 00:13:25,860
Because we can build these much
more complex data structures.

326
00:13:25,860 --> 00:13:29,797
So let's think about building these
very first building block for these data

327
00:13:29,797 --> 00:13:31,130
sets, which is called this node.

328
00:13:31,130 --> 00:13:33,130
And in our code that we'll
see in just a minute,

329
00:13:33,130 --> 00:13:36,318
we have this new idea of
creating this node in code.

330
00:13:36,318 --> 00:13:38,360
And we've seen this kind
of syntax before, right?

331
00:13:38,360 --> 00:13:41,480
Here is the code we would use to
make a structure called a node.

332
00:13:41,480 --> 00:13:44,212
And we're seeing that
same syntax for structs,

333
00:13:44,212 --> 00:13:45,920
like, we created a
struct for candidates,

334
00:13:45,920 --> 00:13:47,630
we created a struct for people earlier.

335
00:13:47,630 --> 00:13:50,660
Now, we're using that same
syntax to create a node.

336
00:13:50,660 --> 00:13:52,080
And what do you notice here?

337
00:13:52,080 --> 00:13:53,870
So first, we're going
to look at the top.

338
00:13:53,870 --> 00:13:56,160
And C reads this top to bottom.

339
00:13:56,160 --> 00:13:58,260
So as it reads this code,
it'll do the following.

340
00:13:58,260 --> 00:14:00,890
It'll say, OK, here I'm
seeing we're going to define

341
00:14:00,890 --> 00:14:03,590
a new structure called a node.

342
00:14:03,590 --> 00:14:05,330
That seems OK so far.

343
00:14:05,330 --> 00:14:08,000
And then inside of the
structure called node,

344
00:14:08,000 --> 00:14:10,520
we're going to ensure we have
the following pieces of data.

345
00:14:10,520 --> 00:14:11,990
We're going to make
sure we have a phrase.

346
00:14:11,990 --> 00:14:14,780
If we look at this, we're going
to have space for a phrase that

347
00:14:14,780 --> 00:14:16,260
is going to be a string.

348
00:14:16,260 --> 00:14:21,320
And then we also have some space
for a pointer to a node called next.

349
00:14:21,320 --> 00:14:23,660
And, again, we can store,
like, Hi or bye in here.

350
00:14:23,660 --> 00:14:27,800
But we also have some new space
for next, a pointer to a node.

351
00:14:27,800 --> 00:14:31,970
And that could have a real address
in it, like, 0x123, or 0x456.

352
00:14:31,970 --> 00:14:35,750
But it has to point to
some kind of node, right?

353
00:14:35,750 --> 00:14:39,290
So now inside this node, we
have a phrase and a next portion

354
00:14:39,290 --> 00:14:42,560
currently empty, but we could
put this data inside of it.

355
00:14:42,560 --> 00:14:45,593
And then once we finish reading this,
we're going to say, all of this,

356
00:14:45,593 --> 00:14:48,260
everything we've done so far, the
phrase, the next portions that

357
00:14:48,260 --> 00:14:51,350
are-- all of that to be wrapped
up into this thing called a node.

358
00:14:51,350 --> 00:14:54,230
So we've created kind of here
a template for what a node is.

359
00:14:54,230 --> 00:14:57,650
A node has a phrase in it
and some next pointer in it.

360
00:14:57,650 --> 00:14:59,955
And we're going to call
all of that a node.

361
00:14:59,955 --> 00:15:01,080
Now, you could change this.

362
00:15:01,080 --> 00:15:02,913
You can make sure it
has an integer, you can

363
00:15:02,913 --> 00:15:04,940
make sure it has some
character inside of it.

364
00:15:04,940 --> 00:15:08,270
You could even store more attributes,
if you'd like, inside a single node.

365
00:15:08,270 --> 00:15:11,570
But for now, I'll focus on phrases,
trying to store these activation

366
00:15:11,570 --> 00:15:13,790
words for our voice assistant for now.

367
00:15:13,790 --> 00:15:15,860
Now, questions on this structure so far?

368
00:15:15,860 --> 00:15:19,080


369
00:15:19,080 --> 00:15:23,310
Yes, I'm seeing a question, is the next
location inside the computer's memory?

370
00:15:23,310 --> 00:15:25,410
Is it actually in the byte below?

371
00:15:25,410 --> 00:15:27,960
Like here in our picture
here, we see that the phrase

372
00:15:27,960 --> 00:15:31,770
portion is kind of first, and
then comes the next portion.

373
00:15:31,770 --> 00:15:34,277
Are these two really next
to each other in memory?

374
00:15:34,277 --> 00:15:36,360
And the answer for that
is, generally, like, yeah.

375
00:15:36,360 --> 00:15:38,170
So these are going to be very
close to-- they're in memory.

376
00:15:38,170 --> 00:15:40,337
They're going to be kind
of right next to each other

377
00:15:40,337 --> 00:15:43,290
as we create this node throughout.

378
00:15:43,290 --> 00:15:46,240
And a question about, well,
could we have a before field,

379
00:15:46,240 --> 00:15:47,580
like, a pointer to a before?

380
00:15:47,580 --> 00:15:48,580
And we absolutely could.

381
00:15:48,580 --> 00:15:50,740
So if we wanted to modify
this, we could simply

382
00:15:50,740 --> 00:15:53,620
add a new attribute called
before, and we could theoretically

383
00:15:53,620 --> 00:15:55,570
point to linked lists that are--

384
00:15:55,570 --> 00:15:58,172
to nodes that are behind us,
as well, in our current node.

385
00:15:58,172 --> 00:16:00,880
And that allows us to make more
even more complex data structures

386
00:16:00,880 --> 00:16:01,750
overall, too.

387
00:16:01,750 --> 00:16:05,790


388
00:16:05,790 --> 00:16:08,860
Now, a question here, we see a
struct node if we go back here.

389
00:16:08,860 --> 00:16:10,980
So struct node, star, next.

390
00:16:10,980 --> 00:16:13,972
Is that a pointer to the struct,
or just a simple pointer?

391
00:16:13,972 --> 00:16:15,930
And the answer here is
a pointer to the struct.

392
00:16:15,930 --> 00:16:18,660
So here, we're seeing that
this is a node pointer.

393
00:16:18,660 --> 00:16:20,800
And a node, technically, is a structure.

394
00:16:20,800 --> 00:16:23,010
So we're going to have a
pointer to that node that

395
00:16:23,010 --> 00:16:26,190
is itself a structure combined
with a phrase and another pointer

396
00:16:26,190 --> 00:16:27,340
to another node.

397
00:16:27,340 --> 00:16:30,860
So good questions there.

398
00:16:30,860 --> 00:16:32,780
All right.

399
00:16:32,780 --> 00:16:34,790
Any other questions on
this structure here?

400
00:16:34,790 --> 00:16:41,410


401
00:16:41,410 --> 00:16:42,820
OK.

402
00:16:42,820 --> 00:16:46,750
So that all seems pretty good so
far, and kind of a simple template

403
00:16:46,750 --> 00:16:47,470
for a node.

404
00:16:47,470 --> 00:16:50,050
But keep in mind that just
this code doesn't actually

405
00:16:50,050 --> 00:16:51,150
create the node for us.

406
00:16:51,150 --> 00:16:53,650
It gives us a template for the
node, but we haven't actually

407
00:16:53,650 --> 00:16:55,037
created it in code yet.

408
00:16:55,037 --> 00:16:57,370
So let's actually go ahead
and see how we might do that.

409
00:16:57,370 --> 00:17:00,610
If you were to go ahead and create
our own linked list, let's say.

410
00:17:00,610 --> 00:17:04,000
Like, let's say, out of all these three
structures, we chose a linked list now.

411
00:17:04,000 --> 00:17:06,250
We could start by making
a single node and have

412
00:17:06,250 --> 00:17:08,900
that be at the head of our linked list.

413
00:17:08,900 --> 00:17:11,440
So to create a linked
list, we'll often do this.

414
00:17:11,440 --> 00:17:14,470
We'll often create a
new pointer to a node

415
00:17:14,470 --> 00:17:18,220
and first set it equal to null
because it's pointing to nothing.

416
00:17:18,220 --> 00:17:20,560
There are no nodes in our linked list.

417
00:17:20,560 --> 00:17:23,500
But once we've done that, well, we
probably want to add at least one

418
00:17:23,500 --> 00:17:24,290
node, right?

419
00:17:24,290 --> 00:17:25,540
So let's go ahead and do that.

420
00:17:25,540 --> 00:17:27,790
Let's make a new pointer,
this one called n,

421
00:17:27,790 --> 00:17:30,500
and make sure we have some
space in which to store it.

422
00:17:30,500 --> 00:17:32,200
So let's read this from right to left.

423
00:17:32,200 --> 00:17:33,670
Let's call malloc first.

424
00:17:33,670 --> 00:17:37,160
And malloc will give us some space
in memory to actually put this node.

425
00:17:37,160 --> 00:17:39,250
So you can see at the
visual right below we

426
00:17:39,250 --> 00:17:41,860
have some new space
our program could use

427
00:17:41,860 --> 00:17:46,090
to actually store this node inside of
and add a phrase, or some next pointer,

428
00:17:46,090 --> 00:17:47,140
right?

429
00:17:47,140 --> 00:17:49,480
But now if we go all
the way to the left,

430
00:17:49,480 --> 00:17:53,920
we're going to see this pointer called
n will then point to that space.

431
00:17:53,920 --> 00:17:55,870
And notice how n is a type node star.

432
00:17:55,870 --> 00:18:00,260
It's pointing to some
space for a node, right?

433
00:18:00,260 --> 00:18:00,860
OK.

434
00:18:00,860 --> 00:18:02,783
So that makes our first node.

435
00:18:02,783 --> 00:18:05,450
But what we probably want to do
is actually add some data to it.

436
00:18:05,450 --> 00:18:07,050
Currently, it's empty, right?

437
00:18:07,050 --> 00:18:07,920
So what can we do?

438
00:18:07,920 --> 00:18:10,640
Well, we can actually use this
new syntax, this arrow syntax,

439
00:18:10,640 --> 00:18:12,780
to actually add in some data.

440
00:18:12,780 --> 00:18:18,050
So down below, we could say, n
arrow phrase gets the value high.

441
00:18:18,050 --> 00:18:19,050
And what does that mean?

442
00:18:19,050 --> 00:18:21,200
So if you look at our
visual here, we could see--

443
00:18:21,200 --> 00:18:23,960
OK, we see n, and we see an arrow.

444
00:18:23,960 --> 00:18:27,470
And now, we're going to go to the phrase
portion of this new piece of memory

445
00:18:27,470 --> 00:18:31,290
we've allocated and make sure
we put the phrase Hi in there.

446
00:18:31,290 --> 00:18:34,400
So n arrow phrase is saying,
start with that pointer

447
00:18:34,400 --> 00:18:36,950
and follow it to the
structure that you have,

448
00:18:36,950 --> 00:18:39,650
go to that phrase attribute,
that phrase portion,

449
00:18:39,650 --> 00:18:43,160
and make sure that says
Hi in the middle, right?

450
00:18:43,160 --> 00:18:45,320
And that's all fine and
good, but now, we also

451
00:18:45,320 --> 00:18:47,012
have to include the next portion.

452
00:18:47,012 --> 00:18:49,470
Like, currently it's empty,
but we want to have some value.

453
00:18:49,470 --> 00:18:52,430
And if this is our first
node, what's the logical value

454
00:18:52,430 --> 00:18:56,360
for this next attribute for our node?

455
00:18:56,360 --> 00:18:58,010
This is our very first node.

456
00:18:58,010 --> 00:18:59,120
What could we set it to?

457
00:18:59,120 --> 00:19:01,680


458
00:19:01,680 --> 00:19:02,180
All right.

459
00:19:02,180 --> 00:19:03,472
So I'm seeing null in the chat.

460
00:19:03,472 --> 00:19:08,240
And the correct version is the
N-U-L-L, not N-U-L. N-U-L is nul,

461
00:19:08,240 --> 00:19:10,490
and the concept of character
is like a nul character.

462
00:19:10,490 --> 00:19:13,460
But N-U-L-L is null in
the context of an address.

463
00:19:13,460 --> 00:19:16,530
So this is an empty
address, zero address here.

464
00:19:16,530 --> 00:19:20,630
So if we say n arrow next gets the value
null, we'd see something like this.

465
00:19:20,630 --> 00:19:25,020
Here, we have our new node n, it's
phrase is Hi, its next pointer is null.

466
00:19:25,020 --> 00:19:28,160
It's the very first node in our list.

467
00:19:28,160 --> 00:19:34,130
So any questions on this so far,
actually, making up our first node?

468
00:19:34,130 --> 00:19:37,280
To your question on the arrow
syntax, so again, the arrow syntax

469
00:19:37,280 --> 00:19:40,680
is kind of best visualized by
actually reading from left to right.

470
00:19:40,680 --> 00:19:42,740
So if we look at the very
first thing, we see n.

471
00:19:42,740 --> 00:19:44,430
OK. n is a pointer.

472
00:19:44,430 --> 00:19:48,380
Now, the arrow says, let's follow
that pointer to our structure, right,

473
00:19:48,380 --> 00:19:49,980
because n is pointing to a structure.

474
00:19:49,980 --> 00:19:52,550
And let's go to the next
field, that next attribute,

475
00:19:52,550 --> 00:19:55,710
and set it equal to some value
we put on the right-hand side.

476
00:19:55,710 --> 00:20:00,050
So here we said, start with n, follow
n's pointer to our node structure,

477
00:20:00,050 --> 00:20:02,900
and go to that next attribute
and set that equal to null,

478
00:20:02,900 --> 00:20:05,950
as we see down below here.

479
00:20:05,950 --> 00:20:06,940
OK.

480
00:20:06,940 --> 00:20:09,400
So there's one more
step, though, which is

481
00:20:09,400 --> 00:20:11,757
our list is still looking pretty empty.

482
00:20:11,757 --> 00:20:14,090
We have our node, but our
list is still empty over here.

483
00:20:14,090 --> 00:20:17,500
So what could we do to actually
add this node to our linked list?

484
00:20:17,500 --> 00:20:21,610


485
00:20:21,610 --> 00:20:28,445
What kind of code do we need here, or
where do we want list to point now?

486
00:20:28,445 --> 00:20:32,908


487
00:20:32,908 --> 00:20:35,450
Feel free to type in the chat
if you think you might have it.

488
00:20:35,450 --> 00:20:39,280


489
00:20:39,280 --> 00:20:41,620
So I'm seeing two
competing opinions here.

490
00:20:41,620 --> 00:20:44,910
So one opinion is that we
should say list equals n,

491
00:20:44,910 --> 00:20:47,010
which would be list gets the value n.

492
00:20:47,010 --> 00:20:51,040
And I'm seeing n equals list,
or n gets the value list.

493
00:20:51,040 --> 00:20:53,220
Which one do we think might work here?

494
00:20:53,220 --> 00:20:54,120
Let's try one.

495
00:20:54,120 --> 00:20:57,120
So let's say, first, that
list would get the value n.

496
00:20:57,120 --> 00:20:59,680
So list equals n, and let's try that.

497
00:20:59,680 --> 00:21:01,840
So here we have that up
here and let's execute it.

498
00:21:01,840 --> 00:21:06,720
And now we'll see, OK, we'll list
an n point to the same thing.

499
00:21:06,720 --> 00:21:08,940
Because n is a pointer
and list is a pointer.

500
00:21:08,940 --> 00:21:12,160
If we set list equal to n, though,
both point to the very same thing,

501
00:21:12,160 --> 00:21:13,703
which is our very first node.

502
00:21:13,703 --> 00:21:15,870
What would happen if we did
it the other way around?

503
00:21:15,870 --> 00:21:21,510
If we said n equals list, instead of
list equals n, where would n point to?

504
00:21:21,510 --> 00:21:25,920


505
00:21:25,920 --> 00:21:27,877
Any ideas in the chat?

506
00:21:27,877 --> 00:21:28,960
So I'm seeing null, right?

507
00:21:28,960 --> 00:21:32,050
If list is currently null, it's
pointing to really nothing in general,

508
00:21:32,050 --> 00:21:34,480
well, if we say n
equals list, we're going

509
00:21:34,480 --> 00:21:38,680
to make sure that end points to null
and we'll have lost this node over here

510
00:21:38,680 --> 00:21:40,060
that we just had, right?

511
00:21:40,060 --> 00:21:41,948
So we probably don't want to do that.

512
00:21:41,948 --> 00:21:43,990
So this will be our last
step in adding our node.

513
00:21:43,990 --> 00:21:47,230
And now we have our
very first node inside

514
00:21:47,230 --> 00:21:50,070
of our linked list, which is great.

515
00:21:50,070 --> 00:21:52,575
Questions on this here,
too, before we keep going?

516
00:21:52,575 --> 00:21:59,170


517
00:21:59,170 --> 00:22:01,660
OK.

518
00:22:01,660 --> 00:22:04,820
Not seeing too many here, so
why don't we keep on going.

519
00:22:04,820 --> 00:22:08,410
And if we want to kick off the very
first node, this is how we would do it.

520
00:22:08,410 --> 00:22:11,770
But let's say we wanted to actually
keep adding more and more nodes.

521
00:22:11,770 --> 00:22:14,500
We could start thinking about
inserting nodes to our linked list.

522
00:22:14,500 --> 00:22:16,910
And there are a few ways
we could do even that.

523
00:22:16,910 --> 00:22:19,517
So if we think back to our
picture as it is right now,

524
00:22:19,517 --> 00:22:21,850
let's say we wanted to create
this new space for a node.

525
00:22:21,850 --> 00:22:24,850
We could say, OK, I'm going to
have this new node called n,

526
00:22:24,850 --> 00:22:27,370
and I'm going to use malloc
to give me some space for it.

527
00:22:27,370 --> 00:22:28,610
So same thing we did before.

528
00:22:28,610 --> 00:22:32,200
We're going to say, get some
space, make sure n points to it.

529
00:22:32,200 --> 00:22:37,360
And now, how could we best
actually add this node to our list?

530
00:22:37,360 --> 00:22:40,345
Would we add it at the end maybe, or
the beginning, which do you think?

531
00:22:40,345 --> 00:22:43,843


532
00:22:43,843 --> 00:22:46,760
Could we add n at the beginning or
the end, and which would be faster?

533
00:22:46,760 --> 00:22:50,337


534
00:22:50,337 --> 00:22:51,920
So I'm seeing some competing opinions.

535
00:22:51,920 --> 00:22:54,900
I'm seeing end and I'm seeing beginning.

536
00:22:54,900 --> 00:22:57,270
Now, if you think about
a longer linked list,

537
00:22:57,270 --> 00:23:01,922
let's say we had not one node
but really like 50 nodes.

538
00:23:01,922 --> 00:23:03,380
Would we want to add it at the end?

539
00:23:03,380 --> 00:23:05,713
And if so, how many steps
would it take to add this node

540
00:23:05,713 --> 00:23:07,770
at the end of that linked list?

541
00:23:07,770 --> 00:23:10,470
If we have 50 nodes,
how many steps would it

542
00:23:10,470 --> 00:23:12,180
take to add this node at the end?

543
00:23:12,180 --> 00:23:13,185
It would take 50, right?

544
00:23:13,185 --> 00:23:15,810
And that reason is that you have
to actually find the last node

545
00:23:15,810 --> 00:23:16,780
and then add it there.

546
00:23:16,780 --> 00:23:20,190
So to find that last node, we have
to go through all 50 of our nodes.

547
00:23:20,190 --> 00:23:22,440
So if we want the very quick
insertion we talked about

548
00:23:22,440 --> 00:23:24,398
before, we have to add
a node at the beginning,

549
00:23:24,398 --> 00:23:26,310
right, make sure that
this new node points

550
00:23:26,310 --> 00:23:29,920
to the current first node in our list.

551
00:23:29,920 --> 00:23:31,237
So let's actually do that here.

552
00:23:31,237 --> 00:23:33,820
We have some space for a node,
we have a pointer to that node.

553
00:23:33,820 --> 00:23:35,590
But let's see how we might add it in.

554
00:23:35,590 --> 00:23:38,550
So we probably first give
this node a phrase, like, Hey.

555
00:23:38,550 --> 00:23:41,610
We'll say, OK, n arrow
phrase gets the value Hey.

556
00:23:41,610 --> 00:23:47,940
And now, we want to say, hum, where
should we point n's next pointer to?

557
00:23:47,940 --> 00:23:53,590
If we look at this picture here, where
should we point n's next pointer?

558
00:23:53,590 --> 00:23:56,270
In the chat, if you will.

559
00:23:56,270 --> 00:23:57,110
Yeah, to list.

560
00:23:57,110 --> 00:23:59,840
So we want to make sure
that this new node called

561
00:23:59,840 --> 00:24:03,690
n points to the very first node in our
list, that being called, well, just

562
00:24:03,690 --> 00:24:04,190
list.

563
00:24:04,190 --> 00:24:07,950
So here we'll say, OK, n arrow
next gets the value list.

564
00:24:07,950 --> 00:24:11,150
So now we're saying that n
is a node, it has the phrase,

565
00:24:11,150 --> 00:24:14,630
Hey, and its next arrow
is pointing to this list

566
00:24:14,630 --> 00:24:17,760
node, the very head of our list here.

567
00:24:17,760 --> 00:24:18,260
All right.

568
00:24:18,260 --> 00:24:21,770
But now, you'll notice that our
linked list is not quite in the state

569
00:24:21,770 --> 00:24:24,660
we want it to be in, like, list is
still pointing to our second node.

570
00:24:24,660 --> 00:24:26,368
So what's the next
step here, if you want

571
00:24:26,368 --> 00:24:30,550
to make sure our list now
includes this very first node?

572
00:24:30,550 --> 00:24:32,000
What should we do?

573
00:24:32,000 --> 00:24:34,400
We should point list to n,
as I'm seeing in the chat.

574
00:24:34,400 --> 00:24:37,232
So let's make sure we have
list pointing to n now.

575
00:24:37,232 --> 00:24:37,940
So we'll do this.

576
00:24:37,940 --> 00:24:40,450
We'll say, list equals n, and
we'll run this and we'll see,

577
00:24:40,450 --> 00:24:45,460
OK, list gets the value n, meaning
list now points to this new first node.

578
00:24:45,460 --> 00:24:49,150
And now we have our own linked
list of more than one node, right?

579
00:24:49,150 --> 00:24:50,440
We went from one to two.

580
00:24:50,440 --> 00:24:53,148
And now, I could even do the same
thing, three, or four, or five,

581
00:24:53,148 --> 00:24:55,900
even 500 times, if we wanted to.

582
00:24:55,900 --> 00:24:58,440
So questions on this,
inserting this new node here?

583
00:24:58,440 --> 00:25:06,850


584
00:25:06,850 --> 00:25:08,990
OK.

585
00:25:08,990 --> 00:25:11,090
So not seeing too many here.

586
00:25:11,090 --> 00:25:13,200
Let's actually try to
translate this bit to code.

587
00:25:13,200 --> 00:25:15,075
So this is all fine and
good with our visual,

588
00:25:15,075 --> 00:25:16,770
but let's actually try it out in code.

589
00:25:16,770 --> 00:25:18,950
I'll go over to my Visual
Studio code over here.

590
00:25:18,950 --> 00:25:22,790
And here, I have my terminal,
a file already called list.c.

591
00:25:22,790 --> 00:25:25,097
So I'll code up list.c.

592
00:25:25,097 --> 00:25:27,180
And if we take a look at
this file, what we'll see

593
00:25:27,180 --> 00:25:29,960
is a very basic kind
of linked list file.

594
00:25:29,960 --> 00:25:32,000
Here at the top, I have my headers.

595
00:25:32,000 --> 00:25:34,785
And down below, I have my
structure called a node.

596
00:25:34,785 --> 00:25:36,660
And this is the same as
we saw before, right?

597
00:25:36,660 --> 00:25:39,950
We have a new structure called a
node that's going to hold a phrase,

598
00:25:39,950 --> 00:25:41,510
and it's going to have a
pointer to the next node.

599
00:25:41,510 --> 00:25:44,802
And we're going to call all of this node
throughout our code using this typedef

600
00:25:44,802 --> 00:25:46,280
syntax up here.

601
00:25:46,280 --> 00:25:49,610
Now, down below, we have some
functions to actually implement.

602
00:25:49,610 --> 00:25:51,920
One is called unload,
that we'll see in a bit,

603
00:25:51,920 --> 00:25:54,170
to actually make sure we're
not using too much memory,

604
00:25:54,170 --> 00:25:55,820
we're actually freeing it as we go.

605
00:25:55,820 --> 00:25:59,690
And the other down here in main is
actually add items or add phrases

606
00:25:59,690 --> 00:26:01,400
to our linked list.

607
00:26:01,400 --> 00:26:03,710
And as we do, we have this
function down here called

608
00:26:03,710 --> 00:26:06,230
visualize that I wrote
myself and should actually

609
00:26:06,230 --> 00:26:08,480
help you see what's happening
inside of our link list.

610
00:26:08,480 --> 00:26:10,897
Don't worry about all this
kind of scary syntax down here.

611
00:26:10,897 --> 00:26:13,350
We'll see what happens as we
add nodes to our linked list.

612
00:26:13,350 --> 00:26:16,450
So currently, if we wanted to add
some phrases to our linked list,

613
00:26:16,450 --> 00:26:19,200
well, we could just kind of run
this program and see what happens.

614
00:26:19,200 --> 00:26:20,210
Well, first, I could--

615
00:26:20,210 --> 00:26:21,890
let me bring up my terminal here.

616
00:26:21,890 --> 00:26:27,715
And I could say, I want
to make list like this.

617
00:26:27,715 --> 00:26:29,590
And let's go ahead and
make this full screen.

618
00:26:29,590 --> 00:26:31,780
I'll say ./list to run it.

619
00:26:31,780 --> 00:26:33,270
And now, what phrase should we add?

620
00:26:33,270 --> 00:26:35,270
Maybe we add Hi, exclamation point.

621
00:26:35,270 --> 00:26:36,390
So I'll hit Enter.

622
00:26:36,390 --> 00:26:39,480
And it doesn't show anything, right?

623
00:26:39,480 --> 00:26:40,740
Nothing's here yet.

624
00:26:40,740 --> 00:26:41,910
What if I did Hello!

625
00:26:41,910 --> 00:26:43,110
I could say Hello!

626
00:26:43,110 --> 00:26:44,800
And nothing's there.

627
00:26:44,800 --> 00:26:45,300
OK.

628
00:26:45,300 --> 00:26:47,550
But why is nothing there yet?

629
00:26:47,550 --> 00:26:49,905
In the chat, why is nothing here?

630
00:26:49,905 --> 00:26:53,540


631
00:26:53,540 --> 00:26:54,750
Yeah, we didn't add anything.

632
00:26:54,750 --> 00:26:58,040
So if you go back to our
program here, we still

633
00:26:58,040 --> 00:27:01,820
have a to do, to add phrases
to a new node in this list.

634
00:27:01,820 --> 00:27:05,210
So here, if we look back,
very familiar syntax,

635
00:27:05,210 --> 00:27:07,190
we're creating our
very first linked list.

636
00:27:07,190 --> 00:27:09,200
And initially, this list is null.

637
00:27:09,200 --> 00:27:10,170
There's nothing in it.

638
00:27:10,170 --> 00:27:12,140
So that's why, when we
ran this program before,

639
00:27:12,140 --> 00:27:14,990
we didn't see anything
inside of our linked list.

640
00:27:14,990 --> 00:27:19,310
But now, what we're going to
do is iterate list size times

641
00:27:19,310 --> 00:27:23,450
to ask the user for some phrase
and add it to our linked list

642
00:27:23,450 --> 00:27:25,760
and visualize that addition.

643
00:27:25,760 --> 00:27:27,560
So how big is list size?

644
00:27:27,560 --> 00:27:28,940
Well, list size is 2.

645
00:27:28,940 --> 00:27:30,530
We'll iterate two times.

646
00:27:30,530 --> 00:27:34,110
And each time, we'll go ahead
and add some new phrase, right?

647
00:27:34,110 --> 00:27:37,340
So let's think first-- how are
we going to make this new node?

648
00:27:37,340 --> 00:27:40,020
We first probably want to
actually create some space for it.

649
00:27:40,020 --> 00:27:43,300
So how could we do that here?

650
00:27:43,300 --> 00:27:45,460
We're trying to add a new node now.

651
00:27:45,460 --> 00:27:47,230
So we're going to need malloc, OK.

652
00:27:47,230 --> 00:27:49,120
So we're going to need
malloc in some form.

653
00:27:49,120 --> 00:27:52,510
And how much space should
we ask for, do we know?

654
00:27:52,510 --> 00:27:56,000


655
00:27:56,000 --> 00:27:58,750
So we want to create
some space for a node.

656
00:27:58,750 --> 00:28:00,685
And maybe a node is--

657
00:28:00,685 --> 00:28:03,130
I don't know-- I could guess
and say it's five bytes.

658
00:28:03,130 --> 00:28:04,780
But is that right?

659
00:28:04,780 --> 00:28:05,495
Or maybe it's 10.

660
00:28:05,495 --> 00:28:06,370
I mean, I don't know.

661
00:28:06,370 --> 00:28:07,700
It could be anything.

662
00:28:07,700 --> 00:28:10,360
We don't really quite
know, and so that's

663
00:28:10,360 --> 00:28:13,390
why we want to use this
function called sizeof.

664
00:28:13,390 --> 00:28:16,690
And we can pass sizeof a
certain type, like, node, right?

665
00:28:16,690 --> 00:28:19,530
I don't know how big a
node is, but sizeof does.

666
00:28:19,530 --> 00:28:25,162
It'll tell malloc how much memory to
create to have this node in place.

667
00:28:25,162 --> 00:28:27,370
But now that I've asked for
memory, I should probably

668
00:28:27,370 --> 00:28:29,650
keep track where that memory is.

669
00:28:29,650 --> 00:28:34,360
So I need some variable
to store that location

670
00:28:34,360 --> 00:28:37,260
memory, which might be called what?

671
00:28:37,260 --> 00:28:41,020
What could we store here?

672
00:28:41,020 --> 00:28:44,470
Maybe we could have a node pointer.

673
00:28:44,470 --> 00:28:45,490
Let's call it n.

674
00:28:45,490 --> 00:28:49,840
So we could say, OK, this is
going to point to some variable--

675
00:28:49,840 --> 00:28:52,630
point to some new piece
of memory called n.

676
00:28:52,630 --> 00:28:55,360
And we're going to make sure
it points to a node, right?

677
00:28:55,360 --> 00:28:59,060
This is going to be some new
chunk of memory where n points to.

678
00:28:59,060 --> 00:28:59,560
All right.

679
00:28:59,560 --> 00:29:02,780
So we do that kind of first bit, and
now we have some space for our node.

680
00:29:02,780 --> 00:29:04,480
But what do we need now?

681
00:29:04,480 --> 00:29:07,478
We need to make sure
that we add in a phrase.

682
00:29:07,478 --> 00:29:08,395
So how can we do that?

683
00:29:08,395 --> 00:29:12,323


684
00:29:12,323 --> 00:29:14,740
We would probably use our arrow
syntax like we saw before.

685
00:29:14,740 --> 00:29:17,700
So we could say, n arrow phrase.

686
00:29:17,700 --> 00:29:20,370
And let's store the phrase
that we have right now inside.

687
00:29:20,370 --> 00:29:23,500
And similarly for the next
portion, we could do this.

688
00:29:23,500 --> 00:29:26,590
We could say n arrow
next gets the value.

689
00:29:26,590 --> 00:29:33,330
But what should it get, if
this is our maybe first node?

690
00:29:33,330 --> 00:29:36,760
I'm seeing a few different ideas here.

691
00:29:36,760 --> 00:29:38,520
So I'm seeing one for null.

692
00:29:38,520 --> 00:29:39,190
Let's try that.

693
00:29:39,190 --> 00:29:41,830
So we'll say, OK, n arrow next is null.

694
00:29:41,830 --> 00:29:42,810
It's our first node.

695
00:29:42,810 --> 00:29:45,330
And now what we want to
do is make sure that list

696
00:29:45,330 --> 00:29:48,910
points to that very first
node we just made up above.

697
00:29:48,910 --> 00:29:49,890
So let's try it.

698
00:29:49,890 --> 00:29:51,275
Let's go ahead and compile this.

699
00:29:51,275 --> 00:29:52,650
We'll open up our terminal again.

700
00:29:52,650 --> 00:29:55,160
We'll say, make list and--

701
00:29:55,160 --> 00:29:59,590
oops-- we forgot our
semicolon, so we'll do that.

702
00:29:59,590 --> 00:30:03,080
And we'll go back here
and we'll say, make list.

703
00:30:03,080 --> 00:30:05,030
And then we'll do ./list.

704
00:30:05,030 --> 00:30:07,960
And I'll say, OK, my first phrase is Hi!

705
00:30:07,960 --> 00:30:11,620
And that seems all right, like, I
have this new node at this location.

706
00:30:11,620 --> 00:30:13,180
It's phrase is Hi!

707
00:30:13,180 --> 00:30:14,900
And its next pointer is null.

708
00:30:14,900 --> 00:30:16,120
That seems OK.

709
00:30:16,120 --> 00:30:17,060
What if I did this?

710
00:30:17,060 --> 00:30:18,460
What if I said Hello!

711
00:30:18,460 --> 00:30:20,650
now.

712
00:30:20,650 --> 00:30:26,470
And I'm seeming to actually
be missing my very first node.

713
00:30:26,470 --> 00:30:27,730
It kind of went away.

714
00:30:27,730 --> 00:30:30,340
What happened here?

715
00:30:30,340 --> 00:30:31,930
Any ideas in the chat?

716
00:30:31,930 --> 00:30:37,490


717
00:30:37,490 --> 00:30:37,990
Yeah.

718
00:30:37,990 --> 00:30:40,390
So I'm seeing we didn't
quite add this node.

719
00:30:40,390 --> 00:30:43,840
Because if we take a
look at our code, we're

720
00:30:43,840 --> 00:30:47,380
always setting the next pointer
for our new node to be null,

721
00:30:47,380 --> 00:30:48,440
and that means nothing.

722
00:30:48,440 --> 00:30:50,410
So whenever we add a new
node, well, we're just

723
00:30:50,410 --> 00:30:52,210
going to make sure that
the next pointer is null,

724
00:30:52,210 --> 00:30:53,110
like, it doesn't point to anything?

725
00:30:53,110 --> 00:30:54,068
That isn't quite right.

726
00:30:54,068 --> 00:30:58,570
So let's make sure we actually update
this to point to the head of our list.

727
00:30:58,570 --> 00:31:02,840
Our first node should point to whatever
is the first element in our list.

728
00:31:02,840 --> 00:31:06,850
So let's say this, n arrow
next gets the value list.

729
00:31:06,850 --> 00:31:09,130
And how will this work at the beginning?

730
00:31:09,130 --> 00:31:12,490
So notice how list is
null at the beginning.

731
00:31:12,490 --> 00:31:14,770
Then we'll go through
and add our first node.

732
00:31:14,770 --> 00:31:17,290
We'll say, OK, let's
get a phrase, and let's

733
00:31:17,290 --> 00:31:20,320
make sure we have some space
for this new node called n.

734
00:31:20,320 --> 00:31:23,860
Let's make sure this n has the
phrase that we've been typing in.

735
00:31:23,860 --> 00:31:28,060
And now, n arrow next is equal to
list which, in fact, is null, right?

736
00:31:28,060 --> 00:31:30,880
So our very first node
will still have null

737
00:31:30,880 --> 00:31:34,480
as its sort of beginning next pointer.

738
00:31:34,480 --> 00:31:36,640
And then we'll go ahead
and make sure list

739
00:31:36,640 --> 00:31:40,420
is updated to have the very
first node at the head.

740
00:31:40,420 --> 00:31:43,060
But then if we do it again,
if we go through again,

741
00:31:43,060 --> 00:31:46,120
we'll say that, OK, this list
pointer's next is actually pointing

742
00:31:46,120 --> 00:31:48,163
to the previous node we just created.

743
00:31:48,163 --> 00:31:50,830
So now I can actually have the
string together as we go through.

744
00:31:50,830 --> 00:31:52,038
And let's visualize this now.

745
00:31:52,038 --> 00:31:53,600
So let's go back to our terminal.

746
00:31:53,600 --> 00:31:58,110
Let's say, make list,
and let's do ./list.

747
00:31:58,110 --> 00:31:59,580
And we could say Hi!

748
00:31:59,580 --> 00:32:01,290
And then we could say Hello!

749
00:32:01,290 --> 00:32:05,730
And now we see our list being built.
So notice how our very first node here

750
00:32:05,730 --> 00:32:06,990
is at this location.

751
00:32:06,990 --> 00:32:08,520
It has the phrase Hello!

752
00:32:08,520 --> 00:32:11,860
And the next pointer is this, it's
pointing to some new location here.

753
00:32:11,860 --> 00:32:15,990
But notice how this is now
in our next node, 56f0.

754
00:32:15,990 --> 00:32:17,580
And the phrase here is Hi!

755
00:32:17,580 --> 00:32:21,170
And the next pointer is null here.

756
00:32:21,170 --> 00:32:22,590
OK.

757
00:32:22,590 --> 00:32:23,840
So that seemed to work for us.

758
00:32:23,840 --> 00:32:25,490
Let's go back to our code.

759
00:32:25,490 --> 00:32:27,785
What questions are there on this?

760
00:32:27,785 --> 00:32:35,113


761
00:32:35,113 --> 00:32:35,780
A good question.

762
00:32:35,780 --> 00:32:37,660
So is this some kind of recursion?

763
00:32:37,660 --> 00:32:40,900
This isn't quite recursion
because, again, recursion

764
00:32:40,900 --> 00:32:43,640
involves having a function call itself.

765
00:32:43,640 --> 00:32:45,470
And here, we're really inside of a loop.

766
00:32:45,470 --> 00:32:47,230
So it isn't quite recursion.

767
00:32:47,230 --> 00:32:51,160
But you're onto an idea, which
is that every node that we add

768
00:32:51,160 --> 00:32:52,865
is going through that same process.

769
00:32:52,865 --> 00:32:55,240
So this really could potentially
be a recursive function,

770
00:32:55,240 --> 00:32:57,050
if you wanted it to be.

771
00:32:57,050 --> 00:33:00,010
But here, you're just noticing that
the same thing we're doing for one

772
00:33:00,010 --> 00:33:02,710
node, we're doing for another,
as we add it, as we go through.

773
00:33:02,710 --> 00:33:04,977


774
00:33:04,977 --> 00:33:06,310
And that's a good question, too.

775
00:33:06,310 --> 00:33:09,010
Is the last node we
add becoming the first?

776
00:33:09,010 --> 00:33:10,100
That is also true.

777
00:33:10,100 --> 00:33:11,980
So here if we were to say, Hi!

778
00:33:11,980 --> 00:33:12,910
And then Hello!

779
00:33:12,910 --> 00:33:15,490
Well, Hello! becomes our
new first node, right?

780
00:33:15,490 --> 00:33:18,070
That's because we're trying to
insert new nodes at the front

781
00:33:18,070 --> 00:33:21,610
where the first node we
add becomes the first--

782
00:33:21,610 --> 00:33:25,510
sorry-- the last node we add becomes
the first node in our list, ultimately.

783
00:33:25,510 --> 00:33:28,370


784
00:33:28,370 --> 00:33:30,340
OK.

785
00:33:30,340 --> 00:33:33,010
So there is one final
thing to do here, which

786
00:33:33,010 --> 00:33:35,320
is to make sure that if
we're asking for memory,

787
00:33:35,320 --> 00:33:38,630
we do want to make sure that this
memory is actually given to us.

788
00:33:38,630 --> 00:33:41,680
And before we go ahead and
use this pointer called n,

789
00:33:41,680 --> 00:33:44,540
we do want to make sure that we're
not getting back some null value.

790
00:33:44,540 --> 00:33:47,500
So if n is null, what
we want to do is simply

791
00:33:47,500 --> 00:33:51,820
print something like couldn't
allocate memory for node,

792
00:33:51,820 --> 00:33:54,670
or whatever you want to type
here, some descriptive error.

793
00:33:54,670 --> 00:33:59,560
And then we could return 1 to symbolize
that something went wrong here.

794
00:33:59,560 --> 00:34:00,060
OK.

795
00:34:00,060 --> 00:34:02,890


796
00:34:02,890 --> 00:34:05,020
A question then, is this a stack?

797
00:34:05,020 --> 00:34:09,400
Well, it has the beginnings of
a stack where if you remember,

798
00:34:09,400 --> 00:34:12,790
a stack is this data structure
where the first thing we add

799
00:34:12,790 --> 00:34:14,920
becomes the last thing we get out.

800
00:34:14,920 --> 00:34:18,982
Or otherwise, the last thing we
add is the first thing we get out,

801
00:34:18,982 --> 00:34:19,940
which is the case here.

802
00:34:19,940 --> 00:34:23,170
So we're saying that the last
node, the last phrase we add,

803
00:34:23,170 --> 00:34:26,920
is going to be the first
node we see in our structure.

804
00:34:26,920 --> 00:34:29,710
But it also depends on how
we access this linked list.

805
00:34:29,710 --> 00:34:32,170
If we always access it
from the very first node,

806
00:34:32,170 --> 00:34:34,929
well, that could be kind
of similar to a stack.

807
00:34:34,929 --> 00:34:38,230
But if we always want to access it by
looking at the very last element first,

808
00:34:38,230 --> 00:34:40,150
well, that could be then a queue, right?

809
00:34:40,150 --> 00:34:42,820
So a queue in a stack kind of
depends on how you actually

810
00:34:42,820 --> 00:34:46,075
use them, and not so much what you
build kind of underneath the hood.

811
00:34:46,075 --> 00:34:49,875


812
00:34:49,875 --> 00:34:50,375
All right.

813
00:34:50,375 --> 00:34:53,760


814
00:34:53,760 --> 00:34:55,920
So let's keep going here.

815
00:34:55,920 --> 00:34:58,880
And I would say that this
code is mostly correct.

816
00:34:58,880 --> 00:35:02,420
But if I go down to my
terminal and I now do this,

817
00:35:02,420 --> 00:35:04,760
I do make list again just to compile it.

818
00:35:04,760 --> 00:35:07,045
And then I say, let's
try to memory check this

819
00:35:07,045 --> 00:35:08,670
and see if there are any memory errors.

820
00:35:08,670 --> 00:35:11,330
Well, I could do valgrind./list.

821
00:35:11,330 --> 00:35:16,310
And let me type in, let's say,
once it loads, I can type in Hi!

822
00:35:16,310 --> 00:35:17,630
Right?

823
00:35:17,630 --> 00:35:19,190
And now that seems OK.

824
00:35:19,190 --> 00:35:20,930
I can type in Hello!

825
00:35:20,930 --> 00:35:25,250
And my program ends, but it seems
I've definitely lost some bytes.

826
00:35:25,250 --> 00:35:27,290
I've definitely lost 16 bytes.

827
00:35:27,290 --> 00:35:30,110
And I've indirectly lost 16 bytes.

828
00:35:30,110 --> 00:35:31,680
So what happened here?

829
00:35:31,680 --> 00:35:33,680
What did I fail to do in my code?

830
00:35:33,680 --> 00:35:36,128


831
00:35:36,128 --> 00:35:37,420
Yeah, I didn't free the memory.

832
00:35:37,420 --> 00:35:42,570
So if I look back at my program, if I
do this and I say, OK, well, unload.

833
00:35:42,570 --> 00:35:45,240
I trust that unload
will work for me, right?

834
00:35:45,240 --> 00:35:47,550
But if I actually look down
here, it's still a to do.

835
00:35:47,550 --> 00:35:50,290
I didn't actually free any nodes here.

836
00:35:50,290 --> 00:35:53,400
And if I look above, well,
I used malloc several times.

837
00:35:53,400 --> 00:35:54,420
I used it twice.

838
00:35:54,420 --> 00:35:57,430
And I didn't call free
any of those times.

839
00:35:57,430 --> 00:36:00,280
So every time we call malloc, we
want to make sure we call free,

840
00:36:00,280 --> 00:36:02,613
and that's what unload is
going to do for us down below.

841
00:36:02,613 --> 00:36:06,090
We're making our own function that
takes in some pointer to a list

842
00:36:06,090 --> 00:36:09,120
and should go through and
free every node in that list.

843
00:36:09,120 --> 00:36:10,995
We can give back that
memory to the computer.

844
00:36:10,995 --> 00:36:13,370
So let's visualize, if we're
actually writing it in code.

845
00:36:13,370 --> 00:36:15,570
If we go back to our
slides here, we could

846
00:36:15,570 --> 00:36:19,200
think about actually trying
to free this linked list.

847
00:36:19,200 --> 00:36:21,610
And maybe your initial
thought is, let's do this.

848
00:36:21,610 --> 00:36:23,027
Let's actually just free the list.

849
00:36:23,027 --> 00:36:23,970
Let's say this.

850
00:36:23,970 --> 00:36:25,740
But why would this be bad?

851
00:36:25,740 --> 00:36:30,300
If we say free the
list, what goes wrong?

852
00:36:30,300 --> 00:36:32,370
Here's the original picture.

853
00:36:32,370 --> 00:36:34,110
Here's freeing the list.

854
00:36:34,110 --> 00:36:35,640
What happened here?

855
00:36:35,640 --> 00:36:40,010


856
00:36:40,010 --> 00:36:42,140
Yes, I'm saying we lost the other nodes.

857
00:36:42,140 --> 00:36:47,410
So notice how in this picture, if you
go back, well, the very first node

858
00:36:47,410 --> 00:36:49,570
is still pointing to that next node.

859
00:36:49,570 --> 00:36:52,510
And the only reason we know
where that next node is located

860
00:36:52,510 --> 00:36:55,690
is because it's in the next
portion of our very first node.

861
00:36:55,690 --> 00:37:00,940
But if we free that, if we get rid of
it, well, where is that second node?

862
00:37:00,940 --> 00:37:01,890
We don't know.

863
00:37:01,890 --> 00:37:03,807
So if that's going to
be a problem for us,

864
00:37:03,807 --> 00:37:06,390
what we don't want to actually
just free the head of the list,

865
00:37:06,390 --> 00:37:10,230
we want to make sure we're going
through and freeing nodes step by step

866
00:37:10,230 --> 00:37:13,840
and keeping track of where
the rest of our nodes are.

867
00:37:13,840 --> 00:37:16,735
So instead, one way we could
approach this is as follows.

868
00:37:16,735 --> 00:37:19,110
We don't want to do this, just
free the head of the list.

869
00:37:19,110 --> 00:37:22,110
What we could do, instead, is
say, let's make a new pointer,

870
00:37:22,110 --> 00:37:23,527
let's add that to the equation.

871
00:37:23,527 --> 00:37:24,110
Let's do this.

872
00:37:24,110 --> 00:37:27,450
So let's say that-- let's make a
new pointer, just called maybe ptr,

873
00:37:27,450 --> 00:37:28,570
for example.

874
00:37:28,570 --> 00:37:33,550
And that will point to whatever the
head of our list is, the next node.

875
00:37:33,550 --> 00:37:37,290
So now, a pointer is pointing
to that second node in our list.

876
00:37:37,290 --> 00:37:38,992
And now, we could safely free list.

877
00:37:38,992 --> 00:37:39,700
We could do this.

878
00:37:39,700 --> 00:37:42,480
We could say, OK, free
up list, and that's gone.

879
00:37:42,480 --> 00:37:49,390
But what is still pointing to the
rest of our list now, which variable?

880
00:37:49,390 --> 00:37:52,140
Maybe in the chat?

881
00:37:52,140 --> 00:37:52,640
Yeah.

882
00:37:52,640 --> 00:37:55,080
So ptr is pointing to that
second node in our list.

883
00:37:55,080 --> 00:37:57,170
So we still have this in our memory.

884
00:37:57,170 --> 00:38:00,250
We know where all the
rest of our nodes are.

885
00:38:00,250 --> 00:38:00,750
OK.

886
00:38:00,750 --> 00:38:01,580
So that's all good.

887
00:38:01,580 --> 00:38:04,855
But now, we want to actually
get back to the head of our list

888
00:38:04,855 --> 00:38:07,230
and make sure that we're
actually kind of moving forward.

889
00:38:07,230 --> 00:38:10,910
So what we'll do is say that
list gets the value for ptr.

890
00:38:10,910 --> 00:38:14,030
So we're going to update what our
list looks like and say that we're

891
00:38:14,030 --> 00:38:16,410
going to move this over here.

892
00:38:16,410 --> 00:38:19,550
So now, the head of our list points
to whatever pointer points to.

893
00:38:19,550 --> 00:38:21,470
And we could, perhaps,
do this in a loop.

894
00:38:21,470 --> 00:38:25,940
We could say, OK, in this case, well,
pointer will get list arrow next again.

895
00:38:25,940 --> 00:38:29,950
And what would pointer be now, though?

896
00:38:29,950 --> 00:38:33,250
If pointer is list arrow next,
what is pointer going to be?

897
00:38:33,250 --> 00:38:36,490


898
00:38:36,490 --> 00:38:38,950
It will be the second node of
the node after the first one.

899
00:38:38,950 --> 00:38:40,970
But what's the actual value right now?

900
00:38:40,970 --> 00:38:41,470
Yeah.

901
00:38:41,470 --> 00:38:45,460
It'll be null, or [INAUDIBLE]
because notice how right now list

902
00:38:45,460 --> 00:38:48,220
is pointing to this
very, now, the first node

903
00:38:48,220 --> 00:38:50,900
in our list, whose next value is null.

904
00:38:50,900 --> 00:38:54,670
So if pointer equals list arrow
next, well, pointer will be null.

905
00:38:54,670 --> 00:38:56,870
OK, that seems OK for now.

906
00:38:56,870 --> 00:39:01,840
But if we wanted to kind of keep going,
let's say we free list, all right.

907
00:39:01,840 --> 00:39:04,150
And then we say, well,
list gets pointer.

908
00:39:04,150 --> 00:39:05,125
What is list now?

909
00:39:05,125 --> 00:39:07,830


910
00:39:07,830 --> 00:39:11,705
There's nothing left, so list
would also be null, right?

911
00:39:11,705 --> 00:39:13,830
So what we've done, we've
kind of gone full circle.

912
00:39:13,830 --> 00:39:17,820
We've made-- we've gone from the very
first node to our very last node,

913
00:39:17,820 --> 00:39:20,880
and we've ended when
list equals null, when

914
00:39:20,880 --> 00:39:25,020
there's nothing left to free, back at,
for example, the same place we started.

915
00:39:25,020 --> 00:39:28,590
If we go back to our code, we could
see that the very place we began

916
00:39:28,590 --> 00:39:32,040
was with list equaling
no, no nodes in our list.

917
00:39:32,040 --> 00:39:38,760
And now, if we unload them, we also want
to end in that very same place, right?

918
00:39:38,760 --> 00:39:39,703
OK.

919
00:39:39,703 --> 00:39:42,870
So let's try implementing this in code
now so we can get rid of those memory

920
00:39:42,870 --> 00:39:43,370
errors.

921
00:39:43,370 --> 00:39:47,070
If we go down below here, we're going
to free all of our allocated nodes.

922
00:39:47,070 --> 00:39:49,570
And we know we want some
kind of loop to do this.

923
00:39:49,570 --> 00:39:51,598
But let's hold off on that for now.

924
00:39:51,598 --> 00:39:54,390
We know, first, we want to make
this new pointer to our next nodes.

925
00:39:54,390 --> 00:39:55,015
We can do this.

926
00:39:55,015 --> 00:40:00,180
Node star pointer equals whatever
list is pointing to and getting

927
00:40:00,180 --> 00:40:02,400
that next value there.

928
00:40:02,400 --> 00:40:06,230
Then we could safely free list because
pointers pointing the rest of our list.

929
00:40:06,230 --> 00:40:09,800
And, finally, we could
say, list updates to 0.2,

930
00:40:09,800 --> 00:40:12,610
what pointer is now pointing to.

931
00:40:12,610 --> 00:40:15,080
But how long do we want to do this?

932
00:40:15,080 --> 00:40:18,170
How do we know when to stop?

933
00:40:18,170 --> 00:40:21,120
Maybe in the chat?

934
00:40:21,120 --> 00:40:25,221
This seems good for one node,
but want to keep doing this.

935
00:40:25,221 --> 00:40:28,190
Yeah, we want to keep doing
it until our list is null.

936
00:40:28,190 --> 00:40:30,680
And we don't really know how
many times we want to iterate,

937
00:40:30,680 --> 00:40:32,630
so a while loop seems
like a good choice.

938
00:40:32,630 --> 00:40:35,880
A while loop we can do something
until something else is true.

939
00:40:35,880 --> 00:40:40,190
So we could say, while
list doesn't equal null.

940
00:40:40,190 --> 00:40:43,340
While we still have
nodes, essentially, let's

941
00:40:43,340 --> 00:40:47,960
go ahead and make sure that we
have a pointer to the next node.

942
00:40:47,960 --> 00:40:50,240
Let's free the current
node, and then let's

943
00:40:50,240 --> 00:40:53,280
update the current node to point
to whatever it was previously,

944
00:40:53,280 --> 00:40:54,590
the next one.

945
00:40:54,590 --> 00:40:57,503
So we could do something like this.

946
00:40:57,503 --> 00:41:00,170
And now, if we run our code, if
we go back to our terminal here,

947
00:41:00,170 --> 00:41:02,550
we'll make list.

948
00:41:02,550 --> 00:41:06,040
We'll run valgrind on ./list.

949
00:41:06,040 --> 00:41:12,990
And what do we get if we type in,
let's say, Hi! here, and then Hello?

950
00:41:12,990 --> 00:41:15,930
We see, successfully, all
heap blocks were freed.

951
00:41:15,930 --> 00:41:19,460
No leaks are possible down below here.

952
00:41:19,460 --> 00:41:23,880
So what questions are there on
this, if you look at our code

953
00:41:23,880 --> 00:41:25,290
for unloading here?

954
00:41:25,290 --> 00:41:30,420


955
00:41:30,420 --> 00:41:32,640
Yeah, a good question,
is nil the same as no?

956
00:41:32,640 --> 00:41:36,210
So if we look at our terminal
and we actually run this again,

957
00:41:36,210 --> 00:41:40,800
call it a make list, I'll
run ./list, and I'll say Hi!

958
00:41:40,800 --> 00:41:44,280
And notice how this very first
node has the next pointer of nil.

959
00:41:44,280 --> 00:41:46,690
Nil is simply the terminal
representation of null.

960
00:41:46,690 --> 00:41:48,795
So nil means null in the terminal.

961
00:41:48,795 --> 00:41:51,640


962
00:41:51,640 --> 00:41:53,260
All right.

963
00:41:53,260 --> 00:41:57,140
And now, I'm seeing another question,
is this a recursive function?

964
00:41:57,140 --> 00:42:00,370
So this still is probably
not a recursive function,

965
00:42:00,370 --> 00:42:04,720
because a recursive function would kind
of call itself like this function might

966
00:42:04,720 --> 00:42:07,520
then call unload somewhere in here.

967
00:42:07,520 --> 00:42:10,135
So it's not quite recursive,
but you are correct in noticing

968
00:42:10,135 --> 00:42:12,010
that this is a good
application for recursion

969
00:42:12,010 --> 00:42:16,220
because we're doing the very same
thing for every node as we go.

970
00:42:16,220 --> 00:42:18,670
So you could certainly
write this recursively.

971
00:42:18,670 --> 00:42:21,430
Here, we're going to
iteratively using a loop.

972
00:42:21,430 --> 00:42:25,500


973
00:42:25,500 --> 00:42:26,520
OK.

974
00:42:26,520 --> 00:42:27,630
Other questions on this?

975
00:42:27,630 --> 00:42:35,430


976
00:42:35,430 --> 00:42:35,930
OK.

977
00:42:35,930 --> 00:42:38,550
So let's keep going on.

978
00:42:38,550 --> 00:42:39,950
And we've seen linked lists now.

979
00:42:39,950 --> 00:42:41,810
We've seen these really
two core operations

980
00:42:41,810 --> 00:42:44,960
for linked lists, like, how
do we add some data to them,

981
00:42:44,960 --> 00:42:47,360
and how do we unload them?

982
00:42:47,360 --> 00:42:50,318
What you might ask, though,
is how can we speed this up?

983
00:42:50,318 --> 00:42:51,860
And that's where hash tables come in.

984
00:42:51,860 --> 00:42:55,037
So hash tables are simply
an array of linked lists.

985
00:42:55,037 --> 00:42:57,370
We take linked lists, we break
them into smaller pieces.

986
00:42:57,370 --> 00:42:59,120
And so let's get a
visual for that and see

987
00:42:59,120 --> 00:43:00,990
if we can make this run even faster.

988
00:43:00,990 --> 00:43:06,260
So if we go back to our slides here
and we see this idea of a linked list,

989
00:43:06,260 --> 00:43:07,370
we have maybe, Hey!

990
00:43:07,370 --> 00:43:08,060
Hello!

991
00:43:08,060 --> 00:43:08,840
Lo there!

992
00:43:08,840 --> 00:43:12,560
Different kind of activation
words for our voice assistant.

993
00:43:12,560 --> 00:43:15,890
Well, what we could do is try
to make sure we can access

994
00:43:15,890 --> 00:43:18,470
certain parts of this linked
list faster to make sure

995
00:43:18,470 --> 00:43:20,870
we can actually search
this linked list faster.

996
00:43:20,870 --> 00:43:22,580
And we could for that use a hash table.

997
00:43:22,580 --> 00:43:25,890
So recall from earlier that a
hash table looks a bit like this.

998
00:43:25,890 --> 00:43:28,310
We have some array on
the left-hand side.

999
00:43:28,310 --> 00:43:31,370
And we have different, what
we call, buckets in here,

1000
00:43:31,370 --> 00:43:35,180
where a bucket might, in this
case, be a letter of the alphabet.

1001
00:43:35,180 --> 00:43:38,750
And that letter of the alphabet is
maybe the first character in a word.

1002
00:43:38,750 --> 00:43:43,850
You could have other buckets, other ways
of assigning these phrases to buckets.

1003
00:43:43,850 --> 00:43:46,400
But for now, let's focus
on this alphabetical hash

1004
00:43:46,400 --> 00:43:48,740
function here and
alphabetical hash table.

1005
00:43:48,740 --> 00:43:52,640
So with this, we have different
letters for every hash bucket.

1006
00:43:52,640 --> 00:44:00,560
And how does C know where in
the array H, I, J, K and L are?

1007
00:44:00,560 --> 00:44:03,090
If we wanted to add some nodes
to a certain bucket, well,

1008
00:44:03,090 --> 00:44:05,380
how do we figure out
where that bucket is?

1009
00:44:05,380 --> 00:44:07,540
Anyone in the chat want
to chime in on that?

1010
00:44:07,540 --> 00:44:11,670


1011
00:44:11,670 --> 00:44:12,170
Yes.

1012
00:44:12,170 --> 00:44:13,400
So it seems to be that they're ordered.

1013
00:44:13,400 --> 00:44:15,483
They're certainly in the
right alphabetical order.

1014
00:44:15,483 --> 00:44:17,470
That's helpful.

1015
00:44:17,470 --> 00:44:21,790
Other things, too, for how we know
where H is located, or L is located?

1016
00:44:21,790 --> 00:44:24,950


1017
00:44:24,950 --> 00:44:28,350
Yes, we could use maybe
ASCII numbers and so on.

1018
00:44:28,350 --> 00:44:30,560
And we're on the right track here.

1019
00:44:30,560 --> 00:44:34,160
But because if you look at the
left-hand side, this is just an array,

1020
00:44:34,160 --> 00:44:38,420
and we've kind of ourselves as
humans assigned those array buckets,

1021
00:44:38,420 --> 00:44:41,030
those array segments, some letter.

1022
00:44:41,030 --> 00:44:44,000
But we have to assign them some
number, really, some index.

1023
00:44:44,000 --> 00:44:48,930
And so you might recall from very early
on, all arrays have these indices,

1024
00:44:48,930 --> 00:44:49,430
right?

1025
00:44:49,430 --> 00:44:52,340
So maybe zero to however
big your array is.

1026
00:44:52,340 --> 00:44:56,540
And now we have, in this case,
maybe H corresponds to the 7 index

1027
00:44:56,540 --> 00:45:00,410
in our table, or I corresponds
to the 8 index, and so on.

1028
00:45:00,410 --> 00:45:04,550
And so in those indexes, we can
actually add in our linked list.

1029
00:45:04,550 --> 00:45:09,110
And so here, if we notice
that H corresponds to 7,

1030
00:45:09,110 --> 00:45:12,500
well, H is the eighth
letter of the alphabet.

1031
00:45:12,500 --> 00:45:15,020
But if we were counting from
zero, it would be the seventh.

1032
00:45:15,020 --> 00:45:19,100
So in this case, H corresponds to zero
as that seventh letter of the alphabet

1033
00:45:19,100 --> 00:45:22,010
counting from zero here.

1034
00:45:22,010 --> 00:45:26,340
That's how we know how we actually
add in these linked list nodes.

1035
00:45:26,340 --> 00:45:31,060
Now, if I were to give you some new
phrase, not Hey! or Hello! or Lo there!

1036
00:45:31,060 --> 00:45:35,290
How would you figure out
where to put it in this case?

1037
00:45:35,290 --> 00:45:37,920
Maybe in the chat?

1038
00:45:37,920 --> 00:45:42,500
If I gave you some new
phrase, what could you

1039
00:45:42,500 --> 00:45:44,300
do to figure out where to put it?

1040
00:45:44,300 --> 00:45:49,560


1041
00:45:49,560 --> 00:45:50,060
Yeah.

1042
00:45:50,060 --> 00:45:51,730
So I'm seeing maybe at the start.

1043
00:45:51,730 --> 00:45:54,140
I don't really know until
we see the phrase, right?

1044
00:45:54,140 --> 00:45:57,450
And that's where this idea
of a hash function comes in.

1045
00:45:57,450 --> 00:45:59,750
So in order to figure
out where Hey! and Hello!

1046
00:45:59,750 --> 00:46:04,670
and Lo there! go, we need to have some
kind of function that takes our phrase

1047
00:46:04,670 --> 00:46:07,320
and gives us back some number
that tells us where to put it.

1048
00:46:07,320 --> 00:46:09,200
So here we have, maybe, the phrase Hey!

1049
00:46:09,200 --> 00:46:11,150
And our hash function
should ideally give us

1050
00:46:11,150 --> 00:46:14,910
some number that tells us
where to put that given phrase.

1051
00:46:14,910 --> 00:46:17,960
So if I gave you not Hey!
or Hi! or Hello there,

1052
00:46:17,960 --> 00:46:20,360
maybe I gave you the name Carter.

1053
00:46:20,360 --> 00:46:23,630
Well, that should give us some
new number that then corresponds

1054
00:46:23,630 --> 00:46:27,140
to an index in our array, or
we could start this new linked

1055
00:46:27,140 --> 00:46:30,420
list that might have all
the C words, for example.

1056
00:46:30,420 --> 00:46:33,140
So let's actually take a look
at this kind of in code now.

1057
00:46:33,140 --> 00:46:38,990
If we go back to our Visual Studio code
over here, I'll open up this new file,

1058
00:46:38,990 --> 00:46:40,250
this one called table.

1059
00:46:40,250 --> 00:46:42,590
So I'll code table.c.

1060
00:46:42,590 --> 00:46:44,310
And it's already a bit done for me.

1061
00:46:44,310 --> 00:46:46,680
But notice how we have
a few different things.

1062
00:46:46,680 --> 00:46:50,540
We have the same node structure from
before for making our linked lists.

1063
00:46:50,540 --> 00:46:54,498
And now, we have this new
array of pointers to node.

1064
00:46:54,498 --> 00:46:55,790
And we're calling this a table.

1065
00:46:55,790 --> 00:46:59,570
And notice how we have
26 buckets, 26 places

1066
00:46:59,570 --> 00:47:01,370
in which to have a linked list now.

1067
00:47:01,370 --> 00:47:04,070
And we can figure out
where these new phrases

1068
00:47:04,070 --> 00:47:09,330
should go by running this function
called hash that we'll see down below.

1069
00:47:09,330 --> 00:47:10,650
So let's scroll down.

1070
00:47:10,650 --> 00:47:13,100
We see that, here, we have
this function called hash.

1071
00:47:13,100 --> 00:47:15,350
And hash is right now
just returning zero.

1072
00:47:15,350 --> 00:47:18,290
Because if I give it a new phrase,
I haven't really finished this yet.

1073
00:47:18,290 --> 00:47:20,660
I just want to return zero
for every phrase I get.

1074
00:47:20,660 --> 00:47:24,590
And ideally, if I run this
program, I should type in a phrase

1075
00:47:24,590 --> 00:47:29,330
and see the index that I should place
this phrase inside of my hash table,

1076
00:47:29,330 --> 00:47:33,920
and then print out, OK, this phrase
hash is to whatever number I have.

1077
00:47:33,920 --> 00:47:35,010
So let's see an example.

1078
00:47:35,010 --> 00:47:38,820
So if I go to my terminal now,
I could make this full screen

1079
00:47:38,820 --> 00:47:42,120
and I could simply type, make a table.

1080
00:47:42,120 --> 00:47:46,200
And now, I'll run ./table,
and I'll type in Hi!

1081
00:47:46,200 --> 00:47:48,330
And I get zero, which is maybe OK.

1082
00:47:48,330 --> 00:47:50,400
But let's try now--

1083
00:47:50,400 --> 00:47:54,830
let's try Lo there! as we
saw before, and we get zero.

1084
00:47:54,830 --> 00:47:57,200
So why would this be bad?

1085
00:47:57,200 --> 00:48:01,700
Besides it being unfinished, why
would we not want Hi! and Lo there!

1086
00:48:01,700 --> 00:48:03,380
to kind of hash to that same value?

1087
00:48:03,380 --> 00:48:07,830


1088
00:48:07,830 --> 00:48:09,090
Any ideas in the chat?

1089
00:48:09,090 --> 00:48:16,180


1090
00:48:16,180 --> 00:48:16,680
Yeah.

1091
00:48:16,680 --> 00:48:20,683
So I'm seeing that this is kind
of similar to a linked list,

1092
00:48:20,683 --> 00:48:21,600
if we were to do this.

1093
00:48:21,600 --> 00:48:23,220
Like, if we have--

1094
00:48:23,220 --> 00:48:24,840
back to our visual here.

1095
00:48:24,840 --> 00:48:30,527
If we have some array of linked
lists, but every word hashes to zero,

1096
00:48:30,527 --> 00:48:32,985
well, we have all this other
space to put our linked lists,

1097
00:48:32,985 --> 00:48:35,935
but we're just putting everything
in the zero bucket, which it makes--

1098
00:48:35,935 --> 00:48:39,060
turns us back into this picture, which
is just a single linked list, right?

1099
00:48:39,060 --> 00:48:42,098
The goal of this hash function is
to have many shorter linked lists,

1100
00:48:42,098 --> 00:48:43,890
and that's the goal of
trying to put things

1101
00:48:43,890 --> 00:48:46,210
in different buckets,
different indexes here.

1102
00:48:46,210 --> 00:48:49,650
So ideally, we don't want to have
everything hash to the same value,

1103
00:48:49,650 --> 00:48:52,620
we want to have different hash
values for different phrases

1104
00:48:52,620 --> 00:48:54,630
that we might input.

1105
00:48:54,630 --> 00:48:57,390
So one way to do this, as
we're seeing right now,

1106
00:48:57,390 --> 00:49:03,120
is trying to make sure we have different
linked lists that depend on the very

1107
00:49:03,120 --> 00:49:05,165
first letter of the phrase we add.

1108
00:49:05,165 --> 00:49:08,290
And let's go back to our hash function
and figure out how we could do that.

1109
00:49:08,290 --> 00:49:10,570
So let's exit our program here.

1110
00:49:10,570 --> 00:49:12,310
And let's go ahead and go back here.

1111
00:49:12,310 --> 00:49:18,780
And let's try to implement hash so that
we get a value 0-25 for a given phrase.

1112
00:49:18,780 --> 00:49:19,780
So here, let's try this.

1113
00:49:19,780 --> 00:49:24,030
Let's actually try
returning not just zero,

1114
00:49:24,030 --> 00:49:29,010
but let's say, depending on the
very first letter of our phrase,

1115
00:49:29,010 --> 00:49:31,790
we return that and see how that goes.

1116
00:49:31,790 --> 00:49:34,720
So here we have returning
phrase, bracket zero.

1117
00:49:34,720 --> 00:49:36,310
And we'll go back to our terminal now.

1118
00:49:36,310 --> 00:49:44,680
We'll say make table, and then
./table, and let's try Hello!

1119
00:49:44,680 --> 00:49:46,540
And we see 72.

1120
00:49:46,540 --> 00:49:48,430
Is that-- I mean, it's better.

1121
00:49:48,430 --> 00:49:49,970
Right now, I have actual number.

1122
00:49:49,970 --> 00:49:54,510
And if I type in Carter, well, that's
a different number, which is good.

1123
00:49:54,510 --> 00:49:57,460
And Lo there! perhaps, is
also a different number.

1124
00:49:57,460 --> 00:49:59,610
So we're off to a good start.

1125
00:49:59,610 --> 00:50:01,710
But what's wrong with this?

1126
00:50:01,710 --> 00:50:06,240
If I get 72, and 67, and
76, I go back to my code

1127
00:50:06,240 --> 00:50:09,570
and I look at my table, what's wrong?

1128
00:50:09,570 --> 00:50:13,800


1129
00:50:13,800 --> 00:50:14,300
Yeah.

1130
00:50:14,300 --> 00:50:16,500
So I'm seeing I only have 26 buckets.

1131
00:50:16,500 --> 00:50:21,080
So if I wanted to, let's say, get
a phrase and put it in my bucket

1132
00:50:21,080 --> 00:50:25,160
somewhere, well, I can't really do
that because maybe Hello! hashes to,

1133
00:50:25,160 --> 00:50:26,540
what was it if we look at it?

1134
00:50:26,540 --> 00:50:27,650
72.

1135
00:50:27,650 --> 00:50:30,840
Well, that's not a
valid index in my table.

1136
00:50:30,840 --> 00:50:33,830
So I need to actually make
this shorter a little bit.

1137
00:50:33,830 --> 00:50:37,520
And one way to do that is
maybe just subtract A from it.

1138
00:50:37,520 --> 00:50:38,970
So I could do something like this.

1139
00:50:38,970 --> 00:50:41,540
And what this will do
is, maybe if you recall

1140
00:50:41,540 --> 00:50:47,290
how A maps to 65, well, if I get A
as the very first character, well,

1141
00:50:47,290 --> 00:50:49,850
what is 65?

1142
00:50:49,850 --> 00:50:52,100
A minus A is zero.

1143
00:50:52,100 --> 00:50:54,770
If I get B, B is 66 in ASCII.

1144
00:50:54,770 --> 00:50:57,560
So B minus A is 1.

1145
00:50:57,560 --> 00:51:02,000
And if I get all the way down to Z,
well, Z minus a is going to be 25.

1146
00:51:02,000 --> 00:51:04,740
And if I wrap back around to A,
well, that's just zero again.

1147
00:51:04,740 --> 00:51:07,580
So what I've done now is I've
made sure that every index I get

1148
00:51:07,580 --> 00:51:11,940
is going to be inside the kind
of length of my table here.

1149
00:51:11,940 --> 00:51:15,500
I won't get any indexes that
are out of bounds for this list.

1150
00:51:15,500 --> 00:51:21,330
And now, if I kind of go back here,
we run this code, I'll do make table.

1151
00:51:21,330 --> 00:51:26,580
I'll run ./table, and I'll do Hello!
now, and I get 7, which is better.

1152
00:51:26,580 --> 00:51:28,470
Now, I'll do maybe Lo there!

1153
00:51:28,470 --> 00:51:29,490
and I get 11.

1154
00:51:29,490 --> 00:51:33,670
And now, I'll type in just Z to
see how that works, and I get 25.

1155
00:51:33,670 --> 00:51:35,550
So that seems to be
pretty good for myself.

1156
00:51:35,550 --> 00:51:39,520
But what still am I missing here?

1157
00:51:39,520 --> 00:51:42,340
What could go wrong?

1158
00:51:42,340 --> 00:51:43,060
Any ideas?

1159
00:51:43,060 --> 00:51:49,220


1160
00:51:49,220 --> 00:51:49,720
Yes.

1161
00:51:49,720 --> 00:51:52,240
So lowercased letters--
right now, I'm working

1162
00:51:52,240 --> 00:51:53,830
with phrases that are all capital.

1163
00:51:53,830 --> 00:51:56,538
But let's say someone wants to
add a phrase that is in lowercase.

1164
00:51:56,538 --> 00:52:00,230
So I could actually open a
terminal again, make table.

1165
00:52:00,230 --> 00:52:04,740
I'll do ./table, and I'll do
maybe hello! in lowercase.

1166
00:52:04,740 --> 00:52:07,920
And I get 39, which
isn't quite what I want.

1167
00:52:07,920 --> 00:52:09,800
So what could I do instead?

1168
00:52:09,800 --> 00:52:13,538
Maybe any ideas in the chat?

1169
00:52:13,538 --> 00:52:14,455
What can I do instead?

1170
00:52:14,455 --> 00:52:19,360


1171
00:52:19,360 --> 00:52:21,920
I could, perhaps, maybe
standardize on capital letters.

1172
00:52:21,920 --> 00:52:25,930
So I could say, I want to make whatever
this first letter is uppercase,

1173
00:52:25,930 --> 00:52:29,890
and then subtract that capital A. So if
I go back to my terminal, I could say,

1174
00:52:29,890 --> 00:52:32,520
make table.

1175
00:52:32,520 --> 00:52:34,050
I could say ./table.

1176
00:52:34,050 --> 00:52:38,310
And now I want for it to be the case
that lowercase hello and capital

1177
00:52:38,310 --> 00:52:40,060
Hello! hash that same value.

1178
00:52:40,060 --> 00:52:45,430
So I'll type hello here, and capital H
Hello! here, and I get 7 in both cases.

1179
00:52:45,430 --> 00:52:48,550
So that seems to work for me now.

1180
00:52:48,550 --> 00:52:51,270
All right.

1181
00:52:51,270 --> 00:52:56,240
Any other-- if we go back
to our terminal here,

1182
00:52:56,240 --> 00:52:59,540
that will tell us where
to put individual phrases.

1183
00:52:59,540 --> 00:53:04,190
So questions on this
hash function so far?

1184
00:53:04,190 --> 00:53:12,760


1185
00:53:12,760 --> 00:53:13,260
All right.

1186
00:53:13,260 --> 00:53:15,990


1187
00:53:15,990 --> 00:53:18,520
So not seeing too many questions here.

1188
00:53:18,520 --> 00:53:19,820
Why don't we do this?

1189
00:53:19,820 --> 00:53:24,490
So let's go ahead and think about
what makes for a good hash function?

1190
00:53:24,490 --> 00:53:29,490
So if we think about this as
our very first hash function,

1191
00:53:29,490 --> 00:53:33,418
this idea of putting nodes
into individual buckets,

1192
00:53:33,418 --> 00:53:35,710
well, it's not going to be
quite what we want it to be.

1193
00:53:35,710 --> 00:53:37,627
And if we actually go
through down here and we

1194
00:53:37,627 --> 00:53:39,420
think about how could
we try to make sure

1195
00:53:39,420 --> 00:53:42,180
we actually better hash function
than this, what we could do

1196
00:53:42,180 --> 00:53:45,640
is actually make sure we have a good
distribution of values across it.

1197
00:53:45,640 --> 00:53:48,510
So if you think about
our buckets here, we

1198
00:53:48,510 --> 00:53:51,930
want to try to have-- to make sure
that every node in our linked list

1199
00:53:51,930 --> 00:53:54,510
is kind of going to go
in every bucket here.

1200
00:53:54,510 --> 00:53:58,050
Like, we actually have a lot of I words,
a lot of J words, a lot of K words.

1201
00:53:58,050 --> 00:54:01,110
And here, the way we hash this
isn't going to quite do that for us.

1202
00:54:01,110 --> 00:54:04,350
We're not going to actually have any
words that start with I, or J, or K,

1203
00:54:04,350 --> 00:54:09,130
so we probably think of a new
hash function to use in this case.

1204
00:54:09,130 --> 00:54:13,183
And if we go back over here, we
can think of some good points

1205
00:54:13,183 --> 00:54:14,100
for our hash function.

1206
00:54:14,100 --> 00:54:18,000
We can think of, OK, we always try to
get the same value for a given input.

1207
00:54:18,000 --> 00:54:20,880
We always want to produce an even
distribution across the buckets,

1208
00:54:20,880 --> 00:54:23,970
and we always want to make sure that
we use all the buckets that we're

1209
00:54:23,970 --> 00:54:26,300
given in this case.

1210
00:54:26,300 --> 00:54:28,488
So these are some ideas
for a good hash function.

1211
00:54:28,488 --> 00:54:30,280
Here, what we'll actually
do is let you all

1212
00:54:30,280 --> 00:54:34,927
go off and write your own hash functions
to go off and build your own overall.

1213
00:54:34,927 --> 00:54:36,760
So if we think of some
other ideas here, you

1214
00:54:36,760 --> 00:54:40,150
could use the length of the phrases,
you could make sure you maybe

1215
00:54:40,150 --> 00:54:42,170
add up their values, and so on.

1216
00:54:42,170 --> 00:54:44,830
But what we could do is
go ahead and have you all

1217
00:54:44,830 --> 00:54:47,920
do that for the rest of this week now.

1218
00:54:47,920 --> 00:54:51,940
And I think that will probably
conclude us for this section.

1219
00:54:51,940 --> 00:54:53,690
It is so good to see you all here today.

1220
00:54:53,690 --> 00:54:54,898
Thank you all for joining us.

1221
00:54:54,898 --> 00:54:57,660
We will see you next week.

1222
00:54:57,660 --> 00:54:59,000