WEBVTT
X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000

00:00:02.862 --> 00:00:03.910
CARTER ZENKE: OK.

00:00:03.910 --> 00:00:04.940
Hello, everyone.

00:00:04.940 --> 00:00:07.982
It is so good to see you
here for our Week 5 Section.

00:00:07.982 --> 00:00:09.940
If you don't remember,
my name is Carter Zenke.

00:00:09.940 --> 00:00:11.740
I'm the course's
preceptor here on campus,

00:00:11.740 --> 00:00:13.902
and it's so delightful
to see you all over Zoom.

00:00:13.902 --> 00:00:15.860
I thought what we'd do
just to kick things off,

00:00:15.860 --> 00:00:18.880
assuming you can hear me this
time, is to on the count of three,

00:00:18.880 --> 00:00:22.150
we'll all unmute at the same
time and say a big hello.

00:00:22.150 --> 00:00:25.440
So on the count of three, we'll
do one, two, three, hello.

00:00:25.440 --> 00:00:26.830
AUDIENCE: Hello.

00:00:26.830 --> 00:00:28.195
AUDIENCE: Hello.

00:00:28.195 --> 00:00:29.455
AUDIENCE: Hello.

00:00:29.455 --> 00:00:31.640
AUDIENCE: Hello.

00:00:31.640 --> 00:00:32.800
CARTER ZENKE: All right.

00:00:32.800 --> 00:00:33.220
AUDIENCE: Hello.

00:00:33.220 --> 00:00:34.585
CARTER ZENKE: It is so wonderful--

00:00:34.585 --> 00:00:35.252
AUDIENCE: Hello.

00:00:35.252 --> 00:00:36.481
AUDIENCE: Hello.

00:00:36.481 --> 00:00:37.690
AUDIENCE: Hello.

00:00:37.690 --> 00:00:38.710
CARTER ZENKE: Hello.

00:00:38.710 --> 00:00:40.533
It is so wonderful to see you all here.

00:00:40.533 --> 00:00:41.200
AUDIENCE: Hello.

00:00:41.200 --> 00:00:42.968
AUDIENCE: Hello.

00:00:42.968 --> 00:00:46.280
CARTER ZENKE: And we'll
go ahead and mute.

00:00:46.280 --> 00:00:48.760
AUDIENCE: Hello.

00:00:48.760 --> 00:00:52.000
CARTER ZENKE: OK.

00:00:52.000 --> 00:00:52.500
All right.

00:00:52.500 --> 00:00:55.208
So if you'd like to participate,
feel free to do so via the chat.

00:00:55.208 --> 00:00:58.405
You're welcome to type in the chat
to ask questions via the chat.

00:00:58.405 --> 00:01:00.780
Again, the goal for these
sections is really to make sure

00:01:00.780 --> 00:01:03.090
that you're able to ask the
questions you want to ask

00:01:03.090 --> 00:01:06.250
and for us to dive into
this learning together.

00:01:06.250 --> 00:01:08.460
So today, focused on data structures.

00:01:08.460 --> 00:01:12.540
And this week, you build your own data
structure with a speller problem set.

00:01:12.540 --> 00:01:16.170
And I want to focus on this big
idea behind data structures.

00:01:16.170 --> 00:01:17.087
Why would we use them?

00:01:17.087 --> 00:01:18.962
What are the trade-offs
between them, and how

00:01:18.962 --> 00:01:20.800
to actually use them in the real world.

00:01:20.800 --> 00:01:23.680
So to that end, we'll have three
questions during our section today.

00:01:23.680 --> 00:01:27.360
The first will be this idea of the
trade-off between data structures.

00:01:27.360 --> 00:01:30.180
When should we use one data
structure versus another?

00:01:30.180 --> 00:01:32.790
The second question will be
about these linked lists.

00:01:32.790 --> 00:01:36.660
We saw these in lecture, but how
can we build our own linked lists?

00:01:36.660 --> 00:01:37.920
What can we do with them?

00:01:37.920 --> 00:01:42.480
And how can we better figure out how
to actually use them in our code?

00:01:42.480 --> 00:01:44.670
And then, finally, our
third question will

00:01:44.670 --> 00:01:48.230
be this idea of how do we build our
own hash table where hash tables rely

00:01:48.230 --> 00:01:50.730
on a lot of fundamentals that
linked lists can give us here,

00:01:50.730 --> 00:01:52.688
but can run maybe a little
bit faster, as we'll

00:01:52.688 --> 00:01:55.900
see in the problem set for this week.

00:01:55.900 --> 00:01:59.220
So to really dive into things here and
to make things a little more concrete,

00:01:59.220 --> 00:02:01.500
we saw all kinds of data structures.

00:02:01.500 --> 00:02:04.780
But often, when you're actually
programming in the real world,

00:02:04.780 --> 00:02:08.220
you might choose one or another,
depending on the scenario, what

00:02:08.220 --> 00:02:09.730
you're trying to build something.

00:02:09.730 --> 00:02:12.480
And in this case, what I'd like
to do is show you all one scenario

00:02:12.480 --> 00:02:13.690
we can think about here.

00:02:13.690 --> 00:02:17.220
So imagine that you are working
for some software company

00:02:17.220 --> 00:02:21.190
and they are making this digital
assistant that runs on a mobile device.

00:02:21.190 --> 00:02:23.610
Maybe it's something like,
hey, Siri, or hey, Google,

00:02:23.610 --> 00:02:25.680
or something that
responds to people's voice

00:02:25.680 --> 00:02:27.630
over time on their personal device.

00:02:27.630 --> 00:02:31.230
You can do something for that person
as they activate this assistant.

00:02:31.230 --> 00:02:35.430
Now, you've seen, or you've heard, that
people who are using this assistant

00:02:35.430 --> 00:02:38.640
aren't often able to wake
it up at the right moment.

00:02:38.640 --> 00:02:42.053
Maybe it sometimes doesn't work
if I were to say maybe, hey, Siri,

00:02:42.053 --> 00:02:42.720
or, hey, Google.

00:02:42.720 --> 00:02:44.830
It doesn't quite work as I want it to.

00:02:44.830 --> 00:02:47.953
And so this is sort of a problem
that we're running into here.

00:02:47.953 --> 00:02:50.370
But what we ultimately want
to do is figure out, OK, well,

00:02:50.370 --> 00:02:51.690
how could we fix this?

00:02:51.690 --> 00:02:54.750
And maybe the first solution
is to try to improve this

00:02:54.750 --> 00:02:58.200
by adding more words that could
wake up this voice assistant.

00:02:58.200 --> 00:03:02.280
We want to store more data, more
words, that could activate this voice

00:03:02.280 --> 00:03:03.330
assistant for ourselves.

00:03:03.330 --> 00:03:06.372
And the question that you might run
into as a software engineer is, well,

00:03:06.372 --> 00:03:08.820
how could we best store
all of these phrases

00:03:08.820 --> 00:03:11.282
we want our voice
assistant to wake up to?

00:03:11.282 --> 00:03:12.990
What kind of data
structure would you use

00:03:12.990 --> 00:03:17.190
to actually store these phrases that
might activate our voice assistant?

00:03:17.190 --> 00:03:21.090
And we saw a few tools in our toolkit,
a few structures in our toolkit.

00:03:21.090 --> 00:03:25.590
And one of these was the linked list,
one of these was the try, one of these

00:03:25.590 --> 00:03:27.370
was the tree, and so on.

00:03:27.370 --> 00:03:29.260
So there's lots of ways we can do this.

00:03:29.260 --> 00:03:32.550
But let's think first about the
different priorities we might

00:03:32.550 --> 00:03:34.980
have for this data structure, right?

00:03:34.980 --> 00:03:36.822
If we want to actually make this.

00:03:36.822 --> 00:03:38.530
Well, what do we want
to prioritize here?

00:03:38.530 --> 00:03:39.698
So let's think about this.

00:03:39.698 --> 00:03:41.490
If we to add lots of
words, we could really

00:03:41.490 --> 00:03:43.650
think of trying a few
different approaches here.

00:03:43.650 --> 00:03:46.023
We could think of, well, a
few different aspects, like,

00:03:46.023 --> 00:03:48.690
if our data structure is going
to do a certain number of things,

00:03:48.690 --> 00:03:51.300
we might care about different aspects
of how data structures are formed.

00:03:51.300 --> 00:03:53.400
Like, do we care about
deleting things quickly?

00:03:53.400 --> 00:03:55.170
Do we care about
inserting things quickly?

00:03:55.170 --> 00:03:57.128
Do we care about searching
for phrases quickly?

00:03:57.128 --> 00:03:59.520
Like, for example, if I were
to have this voice assistant

00:03:59.520 --> 00:04:03.990
and I'm trying to do something with it,
do I care about deleting words quickly,

00:04:03.990 --> 00:04:07.230
or do I care about
inserting new words quickly,

00:04:07.230 --> 00:04:10.830
or do I care about maybe if I were
to speak something, if I can search

00:04:10.830 --> 00:04:12.270
and find that word really quickly?

00:04:12.270 --> 00:04:14.220
So often, when we're
choosing data structure,

00:04:14.220 --> 00:04:15.928
we care about these
kind of three things.

00:04:15.928 --> 00:04:18.820
How fast does it perform
in these three areas?

00:04:18.820 --> 00:04:21.328
And so here, we run into
our first question, right?

00:04:21.328 --> 00:04:23.370
If we want to figure out
which way to prioritize,

00:04:23.370 --> 00:04:26.320
well, we could maybe say that
we care about search the most.

00:04:26.320 --> 00:04:28.920
We want to be able to search and
find a phrase really quickly.

00:04:28.920 --> 00:04:32.190
Maybe if the user speaks some word,
we want to be able to figure out,

00:04:32.190 --> 00:04:35.640
is that word in the wake word list.

00:04:35.640 --> 00:04:37.590
Or maybe we really care
about the user being

00:04:37.590 --> 00:04:41.010
able to quickly add a new wake word,
like, I could say not just, hey,

00:04:41.010 --> 00:04:42.450
but hello, or hi.

00:04:42.450 --> 00:04:45.450
And I want to be able to program this
data structure to sort of respond

00:04:45.450 --> 00:04:47.200
to that quickly.

00:04:47.200 --> 00:04:51.180
Now, what priorities might you have
if you were faced with a scenario?

00:04:51.180 --> 00:04:53.130
If you want to chime
in in the chat, which

00:04:53.130 --> 00:04:56.490
would you prioritize here between
a fast insertion, a fast search,

00:04:56.490 --> 00:04:59.380
a fast deletion, if
you're trying to do this?

00:04:59.380 --> 00:05:02.390
Feel free to chime in in the chat.

00:05:02.390 --> 00:05:02.890
OK.

00:05:02.890 --> 00:05:09.880
I'm seeing one person for search, OK,
lots of searching, some insertion, too.

00:05:09.880 --> 00:05:10.380
Yeah.

00:05:10.380 --> 00:05:13.380
And so these are things we could
probably reasonably disagree on, right?

00:05:13.380 --> 00:05:16.078
If I want to prioritize
searching for some word,

00:05:16.078 --> 00:05:19.120
being able to speak it and have my
assistant recognize it really quickly,

00:05:19.120 --> 00:05:22.037
well, that seems reasonable where
we can make search our top priority.

00:05:22.037 --> 00:05:24.400
Or maybe it's also
reasonable to say, well,

00:05:24.400 --> 00:05:26.723
I want people to make their
own custom wake words.

00:05:26.723 --> 00:05:29.140
And so inserting something is
really important to me, too.

00:05:29.140 --> 00:05:30.820
So it depends on what
you're trying to build here.

00:05:30.820 --> 00:05:33.600
And so let's take a look at some
of these tools in our tool kit.

00:05:33.600 --> 00:05:37.020
So the first one we said before
was this idea of the linked list.

00:05:37.020 --> 00:05:39.437
And here's the linked list in full here.

00:05:39.437 --> 00:05:41.520
And the idea behind the
linked list was that we're

00:05:41.520 --> 00:05:45.352
able to have these nodes where
each node has a certain phrase

00:05:45.352 --> 00:05:48.060
inside of it-- at least in this
case, trying to store phrases now

00:05:48.060 --> 00:05:49.020
in our linked list.

00:05:49.020 --> 00:05:51.760
And we have these nodes kind
of follow each other in memory.

00:05:51.760 --> 00:05:53.590
They're no longer back-to-back,
like, in an array,

00:05:53.590 --> 00:05:54.780
but they sort of build
on top of each other

00:05:54.780 --> 00:05:57.360
and sort of follow each other
throughout memory, being woven

00:05:57.360 --> 00:05:59.940
in to our bits and bytes, thereof.

00:05:59.940 --> 00:06:05.850
So if we were to use a linked list
here, which of these three priorities

00:06:05.850 --> 00:06:08.130
would be kind of
well-served by this linked

00:06:08.130 --> 00:06:12.983
list, like, what would a link list be
good at in this case, do you think?

00:06:12.983 --> 00:06:14.275
Feel free to chime in the chat.

00:06:17.910 --> 00:06:18.410
OK.

00:06:18.410 --> 00:06:19.850
So I'm seeing some insertion.

00:06:19.850 --> 00:06:21.330
And, yeah, it's correct.

00:06:21.330 --> 00:06:23.720
So linked would be really
fast to insert something into.

00:06:23.720 --> 00:06:27.500
In fact, we saw in lecture that all
it takes to insert into a linked list

00:06:27.500 --> 00:06:30.588
is to simply make a new node
and put it at the very front.

00:06:30.588 --> 00:06:32.630
And that's, basically, a
constant time operation.

00:06:32.630 --> 00:06:34.100
It takes a fixed number of steps.

00:06:34.100 --> 00:06:36.290
So inserting into
linked list is very fast

00:06:36.290 --> 00:06:39.447
and could be a good priority
for our insertion here.

00:06:39.447 --> 00:06:41.030
But, again, what's the trade-off here?

00:06:41.030 --> 00:06:42.738
If we were to go with
a linked list, what

00:06:42.738 --> 00:06:46.130
would be slower because we
used the linked list now?

00:06:46.130 --> 00:06:49.130
Maybe insertion is fast,
but what's not fast?

00:06:49.130 --> 00:06:51.507
Well, deletion is not very fast, right?

00:06:51.507 --> 00:06:53.840
But at least in this scenario,
let's say we don't really

00:06:53.840 --> 00:06:57.290
want to care about deleting wake words,
what's the other ones we can use,

00:06:57.290 --> 00:07:00.435
a little bit slow for us?

00:07:00.435 --> 00:07:00.935
Yeah.

00:07:00.935 --> 00:07:02.102
So I'm seeing search, right?

00:07:02.102 --> 00:07:04.100
Search would be pretty
slow because, again, we

00:07:04.100 --> 00:07:06.410
have to sort of start
at the very first node.

00:07:06.410 --> 00:07:09.442
Like, hey, and ask, OK, is
this what the user said?

00:07:09.442 --> 00:07:11.150
No, it's not, let's
go onto the next one.

00:07:11.150 --> 00:07:12.380
Oh, is this what the user said?

00:07:12.380 --> 00:07:14.088
Oh, it's not, let's
go onto the next one.

00:07:14.088 --> 00:07:16.778
So we can only do linear search,
basically, with a linked list.

00:07:16.778 --> 00:07:19.070
And this is a bit of
simplification for how we actually

00:07:19.070 --> 00:07:20.330
do kind of voice recognition.

00:07:20.330 --> 00:07:22.205
But you get the idea,
nonetheless, that we're

00:07:22.205 --> 00:07:25.040
trying to find and go through phrases
that the user could have said.

00:07:25.040 --> 00:07:27.840
In a linked list, we have to go
through every single node that we have,

00:07:27.840 --> 00:07:29.632
which, as I see in the
chat is indeed going

00:07:29.632 --> 00:07:32.220
to be in big O of n
going through n nodes

00:07:32.220 --> 00:07:35.440
to find a certain element inside of it.

00:07:35.440 --> 00:07:36.960
OK so we have linked lists here.

00:07:36.960 --> 00:07:39.040
But let's actually think
of a new data structure.

00:07:39.040 --> 00:07:41.802
What if we use something
like the hash table?

00:07:41.802 --> 00:07:43.510
And the hash table
looks a bit like this.

00:07:43.510 --> 00:07:46.020
I'm going to go to the
full screen mode here where

00:07:46.020 --> 00:07:47.760
it's a bit like a linked list.

00:07:47.760 --> 00:07:49.620
But what we've done
is sort of categorize

00:07:49.620 --> 00:07:52.242
our linked list based on
the very first character.

00:07:52.242 --> 00:07:54.450
And not all hash tables use
the very first character,

00:07:54.450 --> 00:07:58.260
but this one, for example, does,
and it is helpful in some ways.

00:07:58.260 --> 00:08:02.540
Like what would now be faster
because we're using the hash table?

00:08:02.540 --> 00:08:05.368
What would be faster now?

00:08:05.368 --> 00:08:06.660
Feel free to write in the chat.

00:08:08.940 --> 00:08:09.440
OK.

00:08:09.440 --> 00:08:11.810
So I'm seeing some search,
like, search is faster.

00:08:11.810 --> 00:08:15.770
And why is search faster, if you
want to write in the chat quickly?

00:08:15.770 --> 00:08:18.425
Why would search be faster
with this hash table now?

00:08:24.120 --> 00:08:26.790
So I'm seeing some of these
ideas of indexing, right?

00:08:26.790 --> 00:08:27.660
And so-- right.

00:08:27.660 --> 00:08:31.770
If we look at the side of our hash table
in this array on this left-hand side

00:08:31.770 --> 00:08:36.659
here, we've kind of split our linked
list into several indexes, like, H, I,

00:08:36.659 --> 00:08:38.010
J, K and L.

00:08:38.010 --> 00:08:41.400
And in the computer's memory, these
are represented with, actually, like,

00:08:41.400 --> 00:08:44.100
H, I, J, K and L. They're
assigned some number.

00:08:44.100 --> 00:08:46.770
And we have a hash function
that finds the number for each

00:08:46.770 --> 00:08:48.400
of these values on the left-hand side.

00:08:48.400 --> 00:08:51.430
But the idea here is that we're
able to use indexing to figure out,

00:08:51.430 --> 00:08:54.180
OK, if I want to find the
words beginning with H,

00:08:54.180 --> 00:08:57.878
I know right where to go, and
I can only search those words.

00:08:57.878 --> 00:09:01.170
Or if I want to find the words beginning
with L, I can search in that location,

00:09:01.170 --> 00:09:03.630
just find the words beginning
with L. So indexing here

00:09:03.630 --> 00:09:05.422
is helping us breaking
down our linked list

00:09:05.422 --> 00:09:08.170
into lots of smaller linked lists, too.

00:09:08.170 --> 00:09:12.070
So it seems if we were to
insert into this hash table,

00:09:12.070 --> 00:09:13.380
it's also pretty quick, right?

00:09:13.380 --> 00:09:16.770
It's both faster searching
and faster insertion.

00:09:16.770 --> 00:09:20.590
So why would we just not always use
a hash table over a linked list?

00:09:20.590 --> 00:09:22.590
Like what's the problem
with the hash table now?

00:09:22.590 --> 00:09:23.730
What's the trade-off here?

00:09:26.060 --> 00:09:26.560
Yeah.

00:09:26.560 --> 00:09:28.320
So I'm seeing this idea of memory.

00:09:28.320 --> 00:09:31.570
Like this hash table is going to take
up much more memory than our linked list

00:09:31.570 --> 00:09:35.100
assuming, maybe, we have some buckets,
some indexes that aren't really filled

00:09:35.100 --> 00:09:36.770
with these phrases.

00:09:36.770 --> 00:09:38.520
But it's also going
to take us more memory

00:09:38.520 --> 00:09:40.582
to have all the different
indexes possible here

00:09:40.582 --> 00:09:43.540
that we might want to actually have
in order to make our lookup really,

00:09:43.540 --> 00:09:44.370
really fast.

00:09:44.370 --> 00:09:47.640
So it's a trade-off
here is memory, as well.

00:09:47.640 --> 00:09:49.850
And now, it's got one other
structure to consider,

00:09:49.850 --> 00:09:52.010
which would be this idea of the trie.

00:09:52.010 --> 00:09:54.240
And here's some brief visual of a trie.

00:09:54.240 --> 00:09:57.980
Would you like to point out some
words you see inside of this trie,

00:09:57.980 --> 00:09:59.253
reading from left to right?

00:09:59.253 --> 00:10:00.170
What words do you see?

00:10:00.170 --> 00:10:02.960
You see hello, you see hey.

00:10:02.960 --> 00:10:04.400
Any other ones?

00:10:04.400 --> 00:10:07.910
We see help, certainly.

00:10:07.910 --> 00:10:09.980
Yeah, there's a few
other ones in here, too.

00:10:09.980 --> 00:10:15.260
So this trie worked by trying of storing
a node for every letter in our words.

00:10:15.260 --> 00:10:17.390
And this is very fast for lookup, right?

00:10:17.390 --> 00:10:19.910
If you want to figure out
if hello is in this trie,

00:10:19.910 --> 00:10:24.980
well, all we have to do is see, OK,
I see H, I see E, I see L, I see L,

00:10:24.980 --> 00:10:25.640
and I see O.

00:10:25.640 --> 00:10:27.327
Well, hello is in this structure, right?

00:10:27.327 --> 00:10:28.410
We know that it's in Here.

00:10:28.410 --> 00:10:30.320
So look up is very fast.

00:10:30.320 --> 00:10:32.840
And similarly, insertion
is also really quickly.

00:10:32.840 --> 00:10:36.860
If I wanted to add Hi to this trie,
well, I could simply go to the H node

00:10:36.860 --> 00:10:39.410
and say, OK, well, that
covers the first letter.

00:10:39.410 --> 00:10:42.380
Now, I'll add I, maybe somewhere
up above, and that's it.

00:10:42.380 --> 00:10:43.280
I've inserted Hi.

00:10:43.280 --> 00:10:46.658
So insertion and search are
also really fast in our trie.

00:10:46.658 --> 00:10:48.200
But again, what's the trade-off here?

00:10:48.200 --> 00:10:49.700
Why wouldn't we always use the trie?

00:10:53.850 --> 00:10:54.350
Yeah.

00:10:54.350 --> 00:10:55.910
So I'm seeing this idea of memory again.

00:10:55.910 --> 00:10:57.980
Like, this is going to
take up a lot more memory

00:10:57.980 --> 00:11:00.770
to store all the combinations
of all the letters

00:11:00.770 --> 00:11:03.860
that could make up all the words
we might want to have as our wake

00:11:03.860 --> 00:11:05.810
word for our assistant, right?

00:11:05.810 --> 00:11:08.062
And even beyond that, beyond
the computer's memory,

00:11:08.062 --> 00:11:09.770
if you think about
the time it would take

00:11:09.770 --> 00:11:12.620
to make a trie, a more
complex data structure,

00:11:12.620 --> 00:11:15.170
well, that's real human time
that you have to work on this,

00:11:15.170 --> 00:11:18.120
and might actually delay your
shipment of some new feature, right?

00:11:18.120 --> 00:11:18.620
And

00:11:18.620 --> 00:11:21.770
So it's going to matter, not just how
much time it takes for the computer,

00:11:21.770 --> 00:11:24.290
but also how much time it takes
you to make the structure,

00:11:24.290 --> 00:11:28.670
if you wanted to actually make
some feature for your own program,

00:11:28.670 --> 00:11:31.020
or your own company in the end.

00:11:31.020 --> 00:11:33.440
So questions on these structures so far?

00:11:33.440 --> 00:11:35.720
Happy to take some in the
chat, if you have any.

00:11:42.530 --> 00:11:43.835
What questions do you have?

00:11:48.370 --> 00:11:50.300
A question about, does
AI use data structures?

00:11:50.300 --> 00:11:50.800
Yes.

00:11:50.800 --> 00:11:53.980
So a lot of how AI
can work so quickly is

00:11:53.980 --> 00:11:57.410
because we choose to organize data
in these really efficient ways.

00:11:57.410 --> 00:12:00.310
And so by getting creative with
the ways we actually make our data

00:12:00.310 --> 00:12:03.190
structures beyond even just
linked lists and tries and trees,

00:12:03.190 --> 00:12:06.247
we can actually make
AI a lot more quickly--

00:12:06.247 --> 00:12:08.080
run a lot more quickly
because it has access

00:12:08.080 --> 00:12:10.450
to this data structure
laid out in such a way

00:12:10.450 --> 00:12:12.380
that it can run quickly in the end.

00:12:12.380 --> 00:12:16.175
So, yes, it makes use of a lot of
different data structures, too.

00:12:16.175 --> 00:12:18.550
And this question about memory
being so cheap these days.

00:12:18.550 --> 00:12:19.010
It is true.

00:12:19.010 --> 00:12:20.302
So memory is very cheap, right?

00:12:20.302 --> 00:12:23.200
If we wanted to make
the trie here, maybe we

00:12:23.200 --> 00:12:27.280
don't care how much memory it takes
up because memory is so cheap.

00:12:27.280 --> 00:12:30.130
But it also takes us time
to allocate that memory,

00:12:30.130 --> 00:12:31.840
to put things in that memory.

00:12:31.840 --> 00:12:34.030
And so, perhaps, it's
not always the case

00:12:34.030 --> 00:12:37.240
that because we have so much memory, we
should just be kind of loose with it.

00:12:37.240 --> 00:12:38.440
We really need to take
care that we're not

00:12:38.440 --> 00:12:41.050
using too much memory when we
could, theoretically, maybe

00:12:41.050 --> 00:12:43.760
use less, if we wanted to.

00:12:43.760 --> 00:12:44.800
Nice.

00:12:44.800 --> 00:12:45.680
OK.

00:12:45.680 --> 00:12:48.410
So this is some of the ideas
of trade-offs here, choosing

00:12:48.410 --> 00:12:51.480
one or the other, often going
to cost us one thing or another.

00:12:51.480 --> 00:12:53.550
But, really, behind
all of these structures

00:12:53.550 --> 00:12:58.100
is the idea of a node, where, if we
look back at our structures here,

00:12:58.100 --> 00:13:01.670
we had maybe a trie, we had
a link, we had a hash table,

00:13:01.670 --> 00:13:02.810
and we had a linked list.

00:13:02.810 --> 00:13:07.520
And it was composing all of these
was this idea of some kind of node,

00:13:07.520 --> 00:13:09.110
some place we store data.

00:13:09.110 --> 00:13:12.690
And along with that, a pointer
to some other pieces of data.

00:13:12.690 --> 00:13:16.100
So we saw last week, in
Week 4, that pointers are

00:13:16.100 --> 00:13:18.380
helpful for pointing certain locations.

00:13:18.380 --> 00:13:20.900
Well, now, we're seeing the
application of those pointers.

00:13:20.900 --> 00:13:23.025
Why would we even want
pointers in the first place?

00:13:23.025 --> 00:13:25.860
Because we can build these much
more complex data structures.

00:13:25.860 --> 00:13:29.797
So let's think about building these
very first building block for these data

00:13:29.797 --> 00:13:31.130
sets, which is called this node.

00:13:31.130 --> 00:13:33.130
And in our code that we'll
see in just a minute,

00:13:33.130 --> 00:13:36.318
we have this new idea of
creating this node in code.

00:13:36.318 --> 00:13:38.360
And we've seen this kind
of syntax before, right?

00:13:38.360 --> 00:13:41.480
Here is the code we would use to
make a structure called a node.

00:13:41.480 --> 00:13:44.212
And we're seeing that
same syntax for structs,

00:13:44.212 --> 00:13:45.920
like, we created a
struct for candidates,

00:13:45.920 --> 00:13:47.630
we created a struct for people earlier.

00:13:47.630 --> 00:13:50.660
Now, we're using that same
syntax to create a node.

00:13:50.660 --> 00:13:52.080
And what do you notice here?

00:13:52.080 --> 00:13:53.870
So first, we're going
to look at the top.

00:13:53.870 --> 00:13:56.160
And C reads this top to bottom.

00:13:56.160 --> 00:13:58.260
So as it reads this code,
it'll do the following.

00:13:58.260 --> 00:14:00.890
It'll say, OK, here I'm
seeing we're going to define

00:14:00.890 --> 00:14:03.590
a new structure called a node.

00:14:03.590 --> 00:14:05.330
That seems OK so far.

00:14:05.330 --> 00:14:08.000
And then inside of the
structure called node,

00:14:08.000 --> 00:14:10.520
we're going to ensure we have
the following pieces of data.

00:14:10.520 --> 00:14:11.990
We're going to make
sure we have a phrase.

00:14:11.990 --> 00:14:14.780
If we look at this, we're going
to have space for a phrase that

00:14:14.780 --> 00:14:16.260
is going to be a string.

00:14:16.260 --> 00:14:21.320
And then we also have some space
for a pointer to a node called next.

00:14:21.320 --> 00:14:23.660
And, again, we can store,
like, Hi or bye in here.

00:14:23.660 --> 00:14:27.800
But we also have some new space
for next, a pointer to a node.

00:14:27.800 --> 00:14:31.970
And that could have a real address
in it, like, 0x123, or 0x456.

00:14:31.970 --> 00:14:35.750
But it has to point to
some kind of node, right?

00:14:35.750 --> 00:14:39.290
So now inside this node, we
have a phrase and a next portion

00:14:39.290 --> 00:14:42.560
currently empty, but we could
put this data inside of it.

00:14:42.560 --> 00:14:45.593
And then once we finish reading this,
we're going to say, all of this,

00:14:45.593 --> 00:14:48.260
everything we've done so far, the
phrase, the next portions that

00:14:48.260 --> 00:14:51.350
are-- all of that to be wrapped
up into this thing called a node.

00:14:51.350 --> 00:14:54.230
So we've created kind of here
a template for what a node is.

00:14:54.230 --> 00:14:57.650
A node has a phrase in it
and some next pointer in it.

00:14:57.650 --> 00:14:59.955
And we're going to call
all of that a node.

00:14:59.955 --> 00:15:01.080
Now, you could change this.

00:15:01.080 --> 00:15:02.913
You can make sure it
has an integer, you can

00:15:02.913 --> 00:15:04.940
make sure it has some
character inside of it.

00:15:04.940 --> 00:15:08.270
You could even store more attributes,
if you'd like, inside a single node.

00:15:08.270 --> 00:15:11.570
But for now, I'll focus on phrases,
trying to store these activation

00:15:11.570 --> 00:15:13.790
words for our voice assistant for now.

00:15:13.790 --> 00:15:15.860
Now, questions on this structure so far?

00:15:19.080 --> 00:15:23.310
Yes, I'm seeing a question, is the next
location inside the computer's memory?

00:15:23.310 --> 00:15:25.410
Is it actually in the byte below?

00:15:25.410 --> 00:15:27.960
Like here in our picture
here, we see that the phrase

00:15:27.960 --> 00:15:31.770
portion is kind of first, and
then comes the next portion.

00:15:31.770 --> 00:15:34.277
Are these two really next
to each other in memory?

00:15:34.277 --> 00:15:36.360
And the answer for that
is, generally, like, yeah.

00:15:36.360 --> 00:15:38.170
So these are going to be very
close to-- they're in memory.

00:15:38.170 --> 00:15:40.337
They're going to be kind
of right next to each other

00:15:40.337 --> 00:15:43.290
as we create this node throughout.

00:15:43.290 --> 00:15:46.240
And a question about, well,
could we have a before field,

00:15:46.240 --> 00:15:47.580
like, a pointer to a before?

00:15:47.580 --> 00:15:48.580
And we absolutely could.

00:15:48.580 --> 00:15:50.740
So if we wanted to modify
this, we could simply

00:15:50.740 --> 00:15:53.620
add a new attribute called
before, and we could theoretically

00:15:53.620 --> 00:15:55.570
point to linked lists that are--

00:15:55.570 --> 00:15:58.172
to nodes that are behind us,
as well, in our current node.

00:15:58.172 --> 00:16:00.880
And that allows us to make more
even more complex data structures

00:16:00.880 --> 00:16:01.750
overall, too.

00:16:05.790 --> 00:16:08.860
Now, a question here, we see a
struct node if we go back here.

00:16:08.860 --> 00:16:10.980
So struct node, star, next.

00:16:10.980 --> 00:16:13.972
Is that a pointer to the struct,
or just a simple pointer?

00:16:13.972 --> 00:16:15.930
And the answer here is
a pointer to the struct.

00:16:15.930 --> 00:16:18.660
So here, we're seeing that
this is a node pointer.

00:16:18.660 --> 00:16:20.800
And a node, technically, is a structure.

00:16:20.800 --> 00:16:23.010
So we're going to have a
pointer to that node that

00:16:23.010 --> 00:16:26.190
is itself a structure combined
with a phrase and another pointer

00:16:26.190 --> 00:16:27.340
to another node.

00:16:27.340 --> 00:16:30.860
So good questions there.

00:16:30.860 --> 00:16:32.780
All right.

00:16:32.780 --> 00:16:34.790
Any other questions on
this structure here?

00:16:41.410 --> 00:16:42.820
OK.

00:16:42.820 --> 00:16:46.750
So that all seems pretty good so
far, and kind of a simple template

00:16:46.750 --> 00:16:47.470
for a node.

00:16:47.470 --> 00:16:50.050
But keep in mind that just
this code doesn't actually

00:16:50.050 --> 00:16:51.150
create the node for us.

00:16:51.150 --> 00:16:53.650
It gives us a template for the
node, but we haven't actually

00:16:53.650 --> 00:16:55.037
created it in code yet.

00:16:55.037 --> 00:16:57.370
So let's actually go ahead
and see how we might do that.

00:16:57.370 --> 00:17:00.610
If you were to go ahead and create
our own linked list, let's say.

00:17:00.610 --> 00:17:04.000
Like, let's say, out of all these three
structures, we chose a linked list now.

00:17:04.000 --> 00:17:06.250
We could start by making
a single node and have

00:17:06.250 --> 00:17:08.900
that be at the head of our linked list.

00:17:08.900 --> 00:17:11.440
So to create a linked
list, we'll often do this.

00:17:11.440 --> 00:17:14.470
We'll often create a
new pointer to a node

00:17:14.470 --> 00:17:18.220
and first set it equal to null
because it's pointing to nothing.

00:17:18.220 --> 00:17:20.560
There are no nodes in our linked list.

00:17:20.560 --> 00:17:23.500
But once we've done that, well, we
probably want to add at least one

00:17:23.500 --> 00:17:24.290
node, right?

00:17:24.290 --> 00:17:25.540
So let's go ahead and do that.

00:17:25.540 --> 00:17:27.790
Let's make a new pointer,
this one called n,

00:17:27.790 --> 00:17:30.500
and make sure we have some
space in which to store it.

00:17:30.500 --> 00:17:32.200
So let's read this from right to left.

00:17:32.200 --> 00:17:33.670
Let's call malloc first.

00:17:33.670 --> 00:17:37.160
And malloc will give us some space
in memory to actually put this node.

00:17:37.160 --> 00:17:39.250
So you can see at the
visual right below we

00:17:39.250 --> 00:17:41.860
have some new space
our program could use

00:17:41.860 --> 00:17:46.090
to actually store this node inside of
and add a phrase, or some next pointer,

00:17:46.090 --> 00:17:47.140
right?

00:17:47.140 --> 00:17:49.480
But now if we go all
the way to the left,

00:17:49.480 --> 00:17:53.920
we're going to see this pointer called
n will then point to that space.

00:17:53.920 --> 00:17:55.870
And notice how n is a type node star.

00:17:55.870 --> 00:18:00.260
It's pointing to some
space for a node, right?

00:18:00.260 --> 00:18:00.860
OK.

00:18:00.860 --> 00:18:02.783
So that makes our first node.

00:18:02.783 --> 00:18:05.450
But what we probably want to do
is actually add some data to it.

00:18:05.450 --> 00:18:07.050
Currently, it's empty, right?

00:18:07.050 --> 00:18:07.920
So what can we do?

00:18:07.920 --> 00:18:10.640
Well, we can actually use this
new syntax, this arrow syntax,

00:18:10.640 --> 00:18:12.780
to actually add in some data.

00:18:12.780 --> 00:18:18.050
So down below, we could say, n
arrow phrase gets the value high.

00:18:18.050 --> 00:18:19.050
And what does that mean?

00:18:19.050 --> 00:18:21.200
So if you look at our
visual here, we could see--

00:18:21.200 --> 00:18:23.960
OK, we see n, and we see an arrow.

00:18:23.960 --> 00:18:27.470
And now, we're going to go to the phrase
portion of this new piece of memory

00:18:27.470 --> 00:18:31.290
we've allocated and make sure
we put the phrase Hi in there.

00:18:31.290 --> 00:18:34.400
So n arrow phrase is saying,
start with that pointer

00:18:34.400 --> 00:18:36.950
and follow it to the
structure that you have,

00:18:36.950 --> 00:18:39.650
go to that phrase attribute,
that phrase portion,

00:18:39.650 --> 00:18:43.160
and make sure that says
Hi in the middle, right?

00:18:43.160 --> 00:18:45.320
And that's all fine and
good, but now, we also

00:18:45.320 --> 00:18:47.012
have to include the next portion.

00:18:47.012 --> 00:18:49.470
Like, currently it's empty,
but we want to have some value.

00:18:49.470 --> 00:18:52.430
And if this is our first
node, what's the logical value

00:18:52.430 --> 00:18:56.360
for this next attribute for our node?

00:18:56.360 --> 00:18:58.010
This is our very first node.

00:18:58.010 --> 00:18:59.120
What could we set it to?

00:19:01.680 --> 00:19:02.180
All right.

00:19:02.180 --> 00:19:03.472
So I'm seeing null in the chat.

00:19:03.472 --> 00:19:08.240
And the correct version is the
N-U-L-L, not N-U-L. N-U-L is nul,

00:19:08.240 --> 00:19:10.490
and the concept of character
is like a nul character.

00:19:10.490 --> 00:19:13.460
But N-U-L-L is null in
the context of an address.

00:19:13.460 --> 00:19:16.530
So this is an empty
address, zero address here.

00:19:16.530 --> 00:19:20.630
So if we say n arrow next gets the value
null, we'd see something like this.

00:19:20.630 --> 00:19:25.020
Here, we have our new node n, it's
phrase is Hi, its next pointer is null.

00:19:25.020 --> 00:19:28.160
It's the very first node in our list.

00:19:28.160 --> 00:19:34.130
So any questions on this so far,
actually, making up our first node?

00:19:34.130 --> 00:19:37.280
To your question on the arrow
syntax, so again, the arrow syntax

00:19:37.280 --> 00:19:40.680
is kind of best visualized by
actually reading from left to right.

00:19:40.680 --> 00:19:42.740
So if we look at the very
first thing, we see n.

00:19:42.740 --> 00:19:44.430
OK. n is a pointer.

00:19:44.430 --> 00:19:48.380
Now, the arrow says, let's follow
that pointer to our structure, right,

00:19:48.380 --> 00:19:49.980
because n is pointing to a structure.

00:19:49.980 --> 00:19:52.550
And let's go to the next
field, that next attribute,

00:19:52.550 --> 00:19:55.710
and set it equal to some value
we put on the right-hand side.

00:19:55.710 --> 00:20:00.050
So here we said, start with n, follow
n's pointer to our node structure,

00:20:00.050 --> 00:20:02.900
and go to that next attribute
and set that equal to null,

00:20:02.900 --> 00:20:05.950
as we see down below here.

00:20:05.950 --> 00:20:06.940
OK.

00:20:06.940 --> 00:20:09.400
So there's one more
step, though, which is

00:20:09.400 --> 00:20:11.757
our list is still looking pretty empty.

00:20:11.757 --> 00:20:14.090
We have our node, but our
list is still empty over here.

00:20:14.090 --> 00:20:17.500
So what could we do to actually
add this node to our linked list?

00:20:21.610 --> 00:20:28.445
What kind of code do we need here, or
where do we want list to point now?

00:20:32.908 --> 00:20:35.450
Feel free to type in the chat
if you think you might have it.

00:20:39.280 --> 00:20:41.620
So I'm seeing two
competing opinions here.

00:20:41.620 --> 00:20:44.910
So one opinion is that we
should say list equals n,

00:20:44.910 --> 00:20:47.010
which would be list gets the value n.

00:20:47.010 --> 00:20:51.040
And I'm seeing n equals list,
or n gets the value list.

00:20:51.040 --> 00:20:53.220
Which one do we think might work here?

00:20:53.220 --> 00:20:54.120
Let's try one.

00:20:54.120 --> 00:20:57.120
So let's say, first, that
list would get the value n.

00:20:57.120 --> 00:20:59.680
So list equals n, and let's try that.

00:20:59.680 --> 00:21:01.840
So here we have that up
here and let's execute it.

00:21:01.840 --> 00:21:06.720
And now we'll see, OK, we'll list
an n point to the same thing.

00:21:06.720 --> 00:21:08.940
Because n is a pointer
and list is a pointer.

00:21:08.940 --> 00:21:12.160
If we set list equal to n, though,
both point to the very same thing,

00:21:12.160 --> 00:21:13.703
which is our very first node.

00:21:13.703 --> 00:21:15.870
What would happen if we did
it the other way around?

00:21:15.870 --> 00:21:21.510
If we said n equals list, instead of
list equals n, where would n point to?

00:21:25.920 --> 00:21:27.877
Any ideas in the chat?

00:21:27.877 --> 00:21:28.960
So I'm seeing null, right?

00:21:28.960 --> 00:21:32.050
If list is currently null, it's
pointing to really nothing in general,

00:21:32.050 --> 00:21:34.480
well, if we say n
equals list, we're going

00:21:34.480 --> 00:21:38.680
to make sure that end points to null
and we'll have lost this node over here

00:21:38.680 --> 00:21:40.060
that we just had, right?

00:21:40.060 --> 00:21:41.948
So we probably don't want to do that.

00:21:41.948 --> 00:21:43.990
So this will be our last
step in adding our node.

00:21:43.990 --> 00:21:47.230
And now we have our
very first node inside

00:21:47.230 --> 00:21:50.070
of our linked list, which is great.

00:21:50.070 --> 00:21:52.575
Questions on this here,
too, before we keep going?

00:21:59.170 --> 00:22:01.660
OK.

00:22:01.660 --> 00:22:04.820
Not seeing too many here, so
why don't we keep on going.

00:22:04.820 --> 00:22:08.410
And if we want to kick off the very
first node, this is how we would do it.

00:22:08.410 --> 00:22:11.770
But let's say we wanted to actually
keep adding more and more nodes.

00:22:11.770 --> 00:22:14.500
We could start thinking about
inserting nodes to our linked list.

00:22:14.500 --> 00:22:16.910
And there are a few ways
we could do even that.

00:22:16.910 --> 00:22:19.517
So if we think back to our
picture as it is right now,

00:22:19.517 --> 00:22:21.850
let's say we wanted to create
this new space for a node.

00:22:21.850 --> 00:22:24.850
We could say, OK, I'm going to
have this new node called n,

00:22:24.850 --> 00:22:27.370
and I'm going to use malloc
to give me some space for it.

00:22:27.370 --> 00:22:28.610
So same thing we did before.

00:22:28.610 --> 00:22:32.200
We're going to say, get some
space, make sure n points to it.

00:22:32.200 --> 00:22:37.360
And now, how could we best
actually add this node to our list?

00:22:37.360 --> 00:22:40.345
Would we add it at the end maybe, or
the beginning, which do you think?

00:22:43.843 --> 00:22:46.760
Could we add n at the beginning or
the end, and which would be faster?

00:22:50.337 --> 00:22:51.920
So I'm seeing some competing opinions.

00:22:51.920 --> 00:22:54.900
I'm seeing end and I'm seeing beginning.

00:22:54.900 --> 00:22:57.270
Now, if you think about
a longer linked list,

00:22:57.270 --> 00:23:01.922
let's say we had not one node
but really like 50 nodes.

00:23:01.922 --> 00:23:03.380
Would we want to add it at the end?

00:23:03.380 --> 00:23:05.713
And if so, how many steps
would it take to add this node

00:23:05.713 --> 00:23:07.770
at the end of that linked list?

00:23:07.770 --> 00:23:10.470
If we have 50 nodes,
how many steps would it

00:23:10.470 --> 00:23:12.180
take to add this node at the end?

00:23:12.180 --> 00:23:13.185
It would take 50, right?

00:23:13.185 --> 00:23:15.810
And that reason is that you have
to actually find the last node

00:23:15.810 --> 00:23:16.780
and then add it there.

00:23:16.780 --> 00:23:20.190
So to find that last node, we have
to go through all 50 of our nodes.

00:23:20.190 --> 00:23:22.440
So if we want the very quick
insertion we talked about

00:23:22.440 --> 00:23:24.398
before, we have to add
a node at the beginning,

00:23:24.398 --> 00:23:26.310
right, make sure that
this new node points

00:23:26.310 --> 00:23:29.920
to the current first node in our list.

00:23:29.920 --> 00:23:31.237
So let's actually do that here.

00:23:31.237 --> 00:23:33.820
We have some space for a node,
we have a pointer to that node.

00:23:33.820 --> 00:23:35.590
But let's see how we might add it in.

00:23:35.590 --> 00:23:38.550
So we probably first give
this node a phrase, like, Hey.

00:23:38.550 --> 00:23:41.610
We'll say, OK, n arrow
phrase gets the value Hey.

00:23:41.610 --> 00:23:47.940
And now, we want to say, hum, where
should we point n's next pointer to?

00:23:47.940 --> 00:23:53.590
If we look at this picture here, where
should we point n's next pointer?

00:23:53.590 --> 00:23:56.270
In the chat, if you will.

00:23:56.270 --> 00:23:57.110
Yeah, to list.

00:23:57.110 --> 00:23:59.840
So we want to make sure
that this new node called

00:23:59.840 --> 00:24:03.690
n points to the very first node in our
list, that being called, well, just

00:24:03.690 --> 00:24:04.190
list.

00:24:04.190 --> 00:24:07.950
So here we'll say, OK, n arrow
next gets the value list.

00:24:07.950 --> 00:24:11.150
So now we're saying that n
is a node, it has the phrase,

00:24:11.150 --> 00:24:14.630
Hey, and its next arrow
is pointing to this list

00:24:14.630 --> 00:24:17.760
node, the very head of our list here.

00:24:17.760 --> 00:24:18.260
All right.

00:24:18.260 --> 00:24:21.770
But now, you'll notice that our
linked list is not quite in the state

00:24:21.770 --> 00:24:24.660
we want it to be in, like, list is
still pointing to our second node.

00:24:24.660 --> 00:24:26.368
So what's the next
step here, if you want

00:24:26.368 --> 00:24:30.550
to make sure our list now
includes this very first node?

00:24:30.550 --> 00:24:32.000
What should we do?

00:24:32.000 --> 00:24:34.400
We should point list to n,
as I'm seeing in the chat.

00:24:34.400 --> 00:24:37.232
So let's make sure we have
list pointing to n now.

00:24:37.232 --> 00:24:37.940
So we'll do this.

00:24:37.940 --> 00:24:40.450
We'll say, list equals n, and
we'll run this and we'll see,

00:24:40.450 --> 00:24:45.460
OK, list gets the value n, meaning
list now points to this new first node.

00:24:45.460 --> 00:24:49.150
And now we have our own linked
list of more than one node, right?

00:24:49.150 --> 00:24:50.440
We went from one to two.

00:24:50.440 --> 00:24:53.148
And now, I could even do the same
thing, three, or four, or five,

00:24:53.148 --> 00:24:55.900
even 500 times, if we wanted to.

00:24:55.900 --> 00:24:58.440
So questions on this,
inserting this new node here?

00:25:06.850 --> 00:25:08.990
OK.

00:25:08.990 --> 00:25:11.090
So not seeing too many here.

00:25:11.090 --> 00:25:13.200
Let's actually try to
translate this bit to code.

00:25:13.200 --> 00:25:15.075
So this is all fine and
good with our visual,

00:25:15.075 --> 00:25:16.770
but let's actually try it out in code.

00:25:16.770 --> 00:25:18.950
I'll go over to my Visual
Studio code over here.

00:25:18.950 --> 00:25:22.790
And here, I have my terminal,
a file already called list.c.

00:25:22.790 --> 00:25:25.097
So I'll code up list.c.

00:25:25.097 --> 00:25:27.180
And if we take a look at
this file, what we'll see

00:25:27.180 --> 00:25:29.960
is a very basic kind
of linked list file.

00:25:29.960 --> 00:25:32.000
Here at the top, I have my headers.

00:25:32.000 --> 00:25:34.785
And down below, I have my
structure called a node.

00:25:34.785 --> 00:25:36.660
And this is the same as
we saw before, right?

00:25:36.660 --> 00:25:39.950
We have a new structure called a
node that's going to hold a phrase,

00:25:39.950 --> 00:25:41.510
and it's going to have a
pointer to the next node.

00:25:41.510 --> 00:25:44.802
And we're going to call all of this node
throughout our code using this typedef

00:25:44.802 --> 00:25:46.280
syntax up here.

00:25:46.280 --> 00:25:49.610
Now, down below, we have some
functions to actually implement.

00:25:49.610 --> 00:25:51.920
One is called unload,
that we'll see in a bit,

00:25:51.920 --> 00:25:54.170
to actually make sure we're
not using too much memory,

00:25:54.170 --> 00:25:55.820
we're actually freeing it as we go.

00:25:55.820 --> 00:25:59.690
And the other down here in main is
actually add items or add phrases

00:25:59.690 --> 00:26:01.400
to our linked list.

00:26:01.400 --> 00:26:03.710
And as we do, we have this
function down here called

00:26:03.710 --> 00:26:06.230
visualize that I wrote
myself and should actually

00:26:06.230 --> 00:26:08.480
help you see what's happening
inside of our link list.

00:26:08.480 --> 00:26:10.897
Don't worry about all this
kind of scary syntax down here.

00:26:10.897 --> 00:26:13.350
We'll see what happens as we
add nodes to our linked list.

00:26:13.350 --> 00:26:16.450
So currently, if we wanted to add
some phrases to our linked list,

00:26:16.450 --> 00:26:19.200
well, we could just kind of run
this program and see what happens.

00:26:19.200 --> 00:26:20.210
Well, first, I could--

00:26:20.210 --> 00:26:21.890
let me bring up my terminal here.

00:26:21.890 --> 00:26:27.715
And I could say, I want
to make list like this.

00:26:27.715 --> 00:26:29.590
And let's go ahead and
make this full screen.

00:26:29.590 --> 00:26:31.780
I'll say ./list to run it.

00:26:31.780 --> 00:26:33.270
And now, what phrase should we add?

00:26:33.270 --> 00:26:35.270
Maybe we add Hi, exclamation point.

00:26:35.270 --> 00:26:36.390
So I'll hit Enter.

00:26:36.390 --> 00:26:39.480
And it doesn't show anything, right?

00:26:39.480 --> 00:26:40.740
Nothing's here yet.

00:26:40.740 --> 00:26:41.910
What if I did Hello!

00:26:41.910 --> 00:26:43.110
I could say Hello!

00:26:43.110 --> 00:26:44.800
And nothing's there.

00:26:44.800 --> 00:26:45.300
OK.

00:26:45.300 --> 00:26:47.550
But why is nothing there yet?

00:26:47.550 --> 00:26:49.905
In the chat, why is nothing here?

00:26:53.540 --> 00:26:54.750
Yeah, we didn't add anything.

00:26:54.750 --> 00:26:58.040
So if you go back to our
program here, we still

00:26:58.040 --> 00:27:01.820
have a to do, to add phrases
to a new node in this list.

00:27:01.820 --> 00:27:05.210
So here, if we look back,
very familiar syntax,

00:27:05.210 --> 00:27:07.190
we're creating our
very first linked list.

00:27:07.190 --> 00:27:09.200
And initially, this list is null.

00:27:09.200 --> 00:27:10.170
There's nothing in it.

00:27:10.170 --> 00:27:12.140
So that's why, when we
ran this program before,

00:27:12.140 --> 00:27:14.990
we didn't see anything
inside of our linked list.

00:27:14.990 --> 00:27:19.310
But now, what we're going to
do is iterate list size times

00:27:19.310 --> 00:27:23.450
to ask the user for some phrase
and add it to our linked list

00:27:23.450 --> 00:27:25.760
and visualize that addition.

00:27:25.760 --> 00:27:27.560
So how big is list size?

00:27:27.560 --> 00:27:28.940
Well, list size is 2.

00:27:28.940 --> 00:27:30.530
We'll iterate two times.

00:27:30.530 --> 00:27:34.110
And each time, we'll go ahead
and add some new phrase, right?

00:27:34.110 --> 00:27:37.340
So let's think first-- how are
we going to make this new node?

00:27:37.340 --> 00:27:40.020
We first probably want to
actually create some space for it.

00:27:40.020 --> 00:27:43.300
So how could we do that here?

00:27:43.300 --> 00:27:45.460
We're trying to add a new node now.

00:27:45.460 --> 00:27:47.230
So we're going to need malloc, OK.

00:27:47.230 --> 00:27:49.120
So we're going to need
malloc in some form.

00:27:49.120 --> 00:27:52.510
And how much space should
we ask for, do we know?

00:27:56.000 --> 00:27:58.750
So we want to create
some space for a node.

00:27:58.750 --> 00:28:00.685
And maybe a node is--

00:28:00.685 --> 00:28:03.130
I don't know-- I could guess
and say it's five bytes.

00:28:03.130 --> 00:28:04.780
But is that right?

00:28:04.780 --> 00:28:05.495
Or maybe it's 10.

00:28:05.495 --> 00:28:06.370
I mean, I don't know.

00:28:06.370 --> 00:28:07.700
It could be anything.

00:28:07.700 --> 00:28:10.360
We don't really quite
know, and so that's

00:28:10.360 --> 00:28:13.390
why we want to use this
function called sizeof.

00:28:13.390 --> 00:28:16.690
And we can pass sizeof a
certain type, like, node, right?

00:28:16.690 --> 00:28:19.530
I don't know how big a
node is, but sizeof does.

00:28:19.530 --> 00:28:25.162
It'll tell malloc how much memory to
create to have this node in place.

00:28:25.162 --> 00:28:27.370
But now that I've asked for
memory, I should probably

00:28:27.370 --> 00:28:29.650
keep track where that memory is.

00:28:29.650 --> 00:28:34.360
So I need some variable
to store that location

00:28:34.360 --> 00:28:37.260
memory, which might be called what?

00:28:37.260 --> 00:28:41.020
What could we store here?

00:28:41.020 --> 00:28:44.470
Maybe we could have a node pointer.

00:28:44.470 --> 00:28:45.490
Let's call it n.

00:28:45.490 --> 00:28:49.840
So we could say, OK, this is
going to point to some variable--

00:28:49.840 --> 00:28:52.630
point to some new piece
of memory called n.

00:28:52.630 --> 00:28:55.360
And we're going to make sure
it points to a node, right?

00:28:55.360 --> 00:28:59.060
This is going to be some new
chunk of memory where n points to.

00:28:59.060 --> 00:28:59.560
All right.

00:28:59.560 --> 00:29:02.780
So we do that kind of first bit, and
now we have some space for our node.

00:29:02.780 --> 00:29:04.480
But what do we need now?

00:29:04.480 --> 00:29:07.478
We need to make sure
that we add in a phrase.

00:29:07.478 --> 00:29:08.395
So how can we do that?

00:29:12.323 --> 00:29:14.740
We would probably use our arrow
syntax like we saw before.

00:29:14.740 --> 00:29:17.700
So we could say, n arrow phrase.

00:29:17.700 --> 00:29:20.370
And let's store the phrase
that we have right now inside.

00:29:20.370 --> 00:29:23.500
And similarly for the next
portion, we could do this.

00:29:23.500 --> 00:29:26.590
We could say n arrow
next gets the value.

00:29:26.590 --> 00:29:33.330
But what should it get, if
this is our maybe first node?

00:29:33.330 --> 00:29:36.760
I'm seeing a few different ideas here.

00:29:36.760 --> 00:29:38.520
So I'm seeing one for null.

00:29:38.520 --> 00:29:39.190
Let's try that.

00:29:39.190 --> 00:29:41.830
So we'll say, OK, n arrow next is null.

00:29:41.830 --> 00:29:42.810
It's our first node.

00:29:42.810 --> 00:29:45.330
And now what we want to
do is make sure that list

00:29:45.330 --> 00:29:48.910
points to that very first
node we just made up above.

00:29:48.910 --> 00:29:49.890
So let's try it.

00:29:49.890 --> 00:29:51.275
Let's go ahead and compile this.

00:29:51.275 --> 00:29:52.650
We'll open up our terminal again.

00:29:52.650 --> 00:29:55.160
We'll say, make list and--

00:29:55.160 --> 00:29:59.590
oops-- we forgot our
semicolon, so we'll do that.

00:29:59.590 --> 00:30:03.080
And we'll go back here
and we'll say, make list.

00:30:03.080 --> 00:30:05.030
And then we'll do ./list.

00:30:05.030 --> 00:30:07.960
And I'll say, OK, my first phrase is Hi!

00:30:07.960 --> 00:30:11.620
And that seems all right, like, I
have this new node at this location.

00:30:11.620 --> 00:30:13.180
It's phrase is Hi!

00:30:13.180 --> 00:30:14.900
And its next pointer is null.

00:30:14.900 --> 00:30:16.120
That seems OK.

00:30:16.120 --> 00:30:17.060
What if I did this?

00:30:17.060 --> 00:30:18.460
What if I said Hello!

00:30:18.460 --> 00:30:20.650
now.

00:30:20.650 --> 00:30:26.470
And I'm seeming to actually
be missing my very first node.

00:30:26.470 --> 00:30:27.730
It kind of went away.

00:30:27.730 --> 00:30:30.340
What happened here?

00:30:30.340 --> 00:30:31.930
Any ideas in the chat?

00:30:37.490 --> 00:30:37.990
Yeah.

00:30:37.990 --> 00:30:40.390
So I'm seeing we didn't
quite add this node.

00:30:40.390 --> 00:30:43.840
Because if we take a
look at our code, we're

00:30:43.840 --> 00:30:47.380
always setting the next pointer
for our new node to be null,

00:30:47.380 --> 00:30:48.440
and that means nothing.

00:30:48.440 --> 00:30:50.410
So whenever we add a new
node, well, we're just

00:30:50.410 --> 00:30:52.210
going to make sure that
the next pointer is null,

00:30:52.210 --> 00:30:53.110
like, it doesn't point to anything?

00:30:53.110 --> 00:30:54.068
That isn't quite right.

00:30:54.068 --> 00:30:58.570
So let's make sure we actually update
this to point to the head of our list.

00:30:58.570 --> 00:31:02.840
Our first node should point to whatever
is the first element in our list.

00:31:02.840 --> 00:31:06.850
So let's say this, n arrow
next gets the value list.

00:31:06.850 --> 00:31:09.130
And how will this work at the beginning?

00:31:09.130 --> 00:31:12.490
So notice how list is
null at the beginning.

00:31:12.490 --> 00:31:14.770
Then we'll go through
and add our first node.

00:31:14.770 --> 00:31:17.290
We'll say, OK, let's
get a phrase, and let's

00:31:17.290 --> 00:31:20.320
make sure we have some space
for this new node called n.

00:31:20.320 --> 00:31:23.860
Let's make sure this n has the
phrase that we've been typing in.

00:31:23.860 --> 00:31:28.060
And now, n arrow next is equal to
list which, in fact, is null, right?

00:31:28.060 --> 00:31:30.880
So our very first node
will still have null

00:31:30.880 --> 00:31:34.480
as its sort of beginning next pointer.

00:31:34.480 --> 00:31:36.640
And then we'll go ahead
and make sure list

00:31:36.640 --> 00:31:40.420
is updated to have the very
first node at the head.

00:31:40.420 --> 00:31:43.060
But then if we do it again,
if we go through again,

00:31:43.060 --> 00:31:46.120
we'll say that, OK, this list
pointer's next is actually pointing

00:31:46.120 --> 00:31:48.163
to the previous node we just created.

00:31:48.163 --> 00:31:50.830
So now I can actually have the
string together as we go through.

00:31:50.830 --> 00:31:52.038
And let's visualize this now.

00:31:52.038 --> 00:31:53.600
So let's go back to our terminal.

00:31:53.600 --> 00:31:58.110
Let's say, make list,
and let's do ./list.

00:31:58.110 --> 00:31:59.580
And we could say Hi!

00:31:59.580 --> 00:32:01.290
And then we could say Hello!

00:32:01.290 --> 00:32:05.730
And now we see our list being built.
So notice how our very first node here

00:32:05.730 --> 00:32:06.990
is at this location.

00:32:06.990 --> 00:32:08.520
It has the phrase Hello!

00:32:08.520 --> 00:32:11.860
And the next pointer is this, it's
pointing to some new location here.

00:32:11.860 --> 00:32:15.990
But notice how this is now
in our next node, 56f0.

00:32:15.990 --> 00:32:17.580
And the phrase here is Hi!

00:32:17.580 --> 00:32:21.170
And the next pointer is null here.

00:32:21.170 --> 00:32:22.590
OK.

00:32:22.590 --> 00:32:23.840
So that seemed to work for us.

00:32:23.840 --> 00:32:25.490
Let's go back to our code.

00:32:25.490 --> 00:32:27.785
What questions are there on this?

00:32:35.113 --> 00:32:35.780
A good question.

00:32:35.780 --> 00:32:37.660
So is this some kind of recursion?

00:32:37.660 --> 00:32:40.900
This isn't quite recursion
because, again, recursion

00:32:40.900 --> 00:32:43.640
involves having a function call itself.

00:32:43.640 --> 00:32:45.470
And here, we're really inside of a loop.

00:32:45.470 --> 00:32:47.230
So it isn't quite recursion.

00:32:47.230 --> 00:32:51.160
But you're onto an idea, which
is that every node that we add

00:32:51.160 --> 00:32:52.865
is going through that same process.

00:32:52.865 --> 00:32:55.240
So this really could potentially
be a recursive function,

00:32:55.240 --> 00:32:57.050
if you wanted it to be.

00:32:57.050 --> 00:33:00.010
But here, you're just noticing that
the same thing we're doing for one

00:33:00.010 --> 00:33:02.710
node, we're doing for another,
as we add it, as we go through.

00:33:04.977 --> 00:33:06.310
And that's a good question, too.

00:33:06.310 --> 00:33:09.010
Is the last node we
add becoming the first?

00:33:09.010 --> 00:33:10.100
That is also true.

00:33:10.100 --> 00:33:11.980
So here if we were to say, Hi!

00:33:11.980 --> 00:33:12.910
And then Hello!

00:33:12.910 --> 00:33:15.490
Well, Hello! becomes our
new first node, right?

00:33:15.490 --> 00:33:18.070
That's because we're trying to
insert new nodes at the front

00:33:18.070 --> 00:33:21.610
where the first node we
add becomes the first--

00:33:21.610 --> 00:33:25.510
sorry-- the last node we add becomes
the first node in our list, ultimately.

00:33:28.370 --> 00:33:30.340
OK.

00:33:30.340 --> 00:33:33.010
So there is one final
thing to do here, which

00:33:33.010 --> 00:33:35.320
is to make sure that if
we're asking for memory,

00:33:35.320 --> 00:33:38.630
we do want to make sure that this
memory is actually given to us.

00:33:38.630 --> 00:33:41.680
And before we go ahead and
use this pointer called n,

00:33:41.680 --> 00:33:44.540
we do want to make sure that we're
not getting back some null value.

00:33:44.540 --> 00:33:47.500
So if n is null, what
we want to do is simply

00:33:47.500 --> 00:33:51.820
print something like couldn't
allocate memory for node,

00:33:51.820 --> 00:33:54.670
or whatever you want to type
here, some descriptive error.

00:33:54.670 --> 00:33:59.560
And then we could return 1 to symbolize
that something went wrong here.

00:33:59.560 --> 00:34:00.060
OK.

00:34:02.890 --> 00:34:05.020
A question then, is this a stack?

00:34:05.020 --> 00:34:09.400
Well, it has the beginnings of
a stack where if you remember,

00:34:09.400 --> 00:34:12.790
a stack is this data structure
where the first thing we add

00:34:12.790 --> 00:34:14.920
becomes the last thing we get out.

00:34:14.920 --> 00:34:18.982
Or otherwise, the last thing we
add is the first thing we get out,

00:34:18.982 --> 00:34:19.940
which is the case here.

00:34:19.940 --> 00:34:23.170
So we're saying that the last
node, the last phrase we add,

00:34:23.170 --> 00:34:26.920
is going to be the first
node we see in our structure.

00:34:26.920 --> 00:34:29.710
But it also depends on how
we access this linked list.

00:34:29.710 --> 00:34:32.170
If we always access it
from the very first node,

00:34:32.170 --> 00:34:34.929
well, that could be kind
of similar to a stack.

00:34:34.929 --> 00:34:38.230
But if we always want to access it by
looking at the very last element first,

00:34:38.230 --> 00:34:40.150
well, that could be then a queue, right?

00:34:40.150 --> 00:34:42.820
So a queue in a stack kind of
depends on how you actually

00:34:42.820 --> 00:34:46.075
use them, and not so much what you
build kind of underneath the hood.

00:34:49.875 --> 00:34:50.375
All right.

00:34:53.760 --> 00:34:55.920
So let's keep going here.

00:34:55.920 --> 00:34:58.880
And I would say that this
code is mostly correct.

00:34:58.880 --> 00:35:02.420
But if I go down to my
terminal and I now do this,

00:35:02.420 --> 00:35:04.760
I do make list again just to compile it.

00:35:04.760 --> 00:35:07.045
And then I say, let's
try to memory check this

00:35:07.045 --> 00:35:08.670
and see if there are any memory errors.

00:35:08.670 --> 00:35:11.330
Well, I could do valgrind./list.

00:35:11.330 --> 00:35:16.310
And let me type in, let's say,
once it loads, I can type in Hi!

00:35:16.310 --> 00:35:17.630
Right?

00:35:17.630 --> 00:35:19.190
And now that seems OK.

00:35:19.190 --> 00:35:20.930
I can type in Hello!

00:35:20.930 --> 00:35:25.250
And my program ends, but it seems
I've definitely lost some bytes.

00:35:25.250 --> 00:35:27.290
I've definitely lost 16 bytes.

00:35:27.290 --> 00:35:30.110
And I've indirectly lost 16 bytes.

00:35:30.110 --> 00:35:31.680
So what happened here?

00:35:31.680 --> 00:35:33.680
What did I fail to do in my code?

00:35:36.128 --> 00:35:37.420
Yeah, I didn't free the memory.

00:35:37.420 --> 00:35:42.570
So if I look back at my program, if I
do this and I say, OK, well, unload.

00:35:42.570 --> 00:35:45.240
I trust that unload
will work for me, right?

00:35:45.240 --> 00:35:47.550
But if I actually look down
here, it's still a to do.

00:35:47.550 --> 00:35:50.290
I didn't actually free any nodes here.

00:35:50.290 --> 00:35:53.400
And if I look above, well,
I used malloc several times.

00:35:53.400 --> 00:35:54.420
I used it twice.

00:35:54.420 --> 00:35:57.430
And I didn't call free
any of those times.

00:35:57.430 --> 00:36:00.280
So every time we call malloc, we
want to make sure we call free,

00:36:00.280 --> 00:36:02.613
and that's what unload is
going to do for us down below.

00:36:02.613 --> 00:36:06.090
We're making our own function that
takes in some pointer to a list

00:36:06.090 --> 00:36:09.120
and should go through and
free every node in that list.

00:36:09.120 --> 00:36:10.995
We can give back that
memory to the computer.

00:36:10.995 --> 00:36:13.370
So let's visualize, if we're
actually writing it in code.

00:36:13.370 --> 00:36:15.570
If we go back to our
slides here, we could

00:36:15.570 --> 00:36:19.200
think about actually trying
to free this linked list.

00:36:19.200 --> 00:36:21.610
And maybe your initial
thought is, let's do this.

00:36:21.610 --> 00:36:23.027
Let's actually just free the list.

00:36:23.027 --> 00:36:23.970
Let's say this.

00:36:23.970 --> 00:36:25.740
But why would this be bad?

00:36:25.740 --> 00:36:30.300
If we say free the
list, what goes wrong?

00:36:30.300 --> 00:36:32.370
Here's the original picture.

00:36:32.370 --> 00:36:34.110
Here's freeing the list.

00:36:34.110 --> 00:36:35.640
What happened here?

00:36:40.010 --> 00:36:42.140
Yes, I'm saying we lost the other nodes.

00:36:42.140 --> 00:36:47.410
So notice how in this picture, if you
go back, well, the very first node

00:36:47.410 --> 00:36:49.570
is still pointing to that next node.

00:36:49.570 --> 00:36:52.510
And the only reason we know
where that next node is located

00:36:52.510 --> 00:36:55.690
is because it's in the next
portion of our very first node.

00:36:55.690 --> 00:37:00.940
But if we free that, if we get rid of
it, well, where is that second node?

00:37:00.940 --> 00:37:01.890
We don't know.

00:37:01.890 --> 00:37:03.807
So if that's going to
be a problem for us,

00:37:03.807 --> 00:37:06.390
what we don't want to actually
just free the head of the list,

00:37:06.390 --> 00:37:10.230
we want to make sure we're going
through and freeing nodes step by step

00:37:10.230 --> 00:37:13.840
and keeping track of where
the rest of our nodes are.

00:37:13.840 --> 00:37:16.735
So instead, one way we could
approach this is as follows.

00:37:16.735 --> 00:37:19.110
We don't want to do this, just
free the head of the list.

00:37:19.110 --> 00:37:22.110
What we could do, instead, is
say, let's make a new pointer,

00:37:22.110 --> 00:37:23.527
let's add that to the equation.

00:37:23.527 --> 00:37:24.110
Let's do this.

00:37:24.110 --> 00:37:27.450
So let's say that-- let's make a
new pointer, just called maybe ptr,

00:37:27.450 --> 00:37:28.570
for example.

00:37:28.570 --> 00:37:33.550
And that will point to whatever the
head of our list is, the next node.

00:37:33.550 --> 00:37:37.290
So now, a pointer is pointing
to that second node in our list.

00:37:37.290 --> 00:37:38.992
And now, we could safely free list.

00:37:38.992 --> 00:37:39.700
We could do this.

00:37:39.700 --> 00:37:42.480
We could say, OK, free
up list, and that's gone.

00:37:42.480 --> 00:37:49.390
But what is still pointing to the
rest of our list now, which variable?

00:37:49.390 --> 00:37:52.140
Maybe in the chat?

00:37:52.140 --> 00:37:52.640
Yeah.

00:37:52.640 --> 00:37:55.080
So ptr is pointing to that
second node in our list.

00:37:55.080 --> 00:37:57.170
So we still have this in our memory.

00:37:57.170 --> 00:38:00.250
We know where all the
rest of our nodes are.

00:38:00.250 --> 00:38:00.750
OK.

00:38:00.750 --> 00:38:01.580
So that's all good.

00:38:01.580 --> 00:38:04.855
But now, we want to actually
get back to the head of our list

00:38:04.855 --> 00:38:07.230
and make sure that we're
actually kind of moving forward.

00:38:07.230 --> 00:38:10.910
So what we'll do is say that
list gets the value for ptr.

00:38:10.910 --> 00:38:14.030
So we're going to update what our
list looks like and say that we're

00:38:14.030 --> 00:38:16.410
going to move this over here.

00:38:16.410 --> 00:38:19.550
So now, the head of our list points
to whatever pointer points to.

00:38:19.550 --> 00:38:21.470
And we could, perhaps,
do this in a loop.

00:38:21.470 --> 00:38:25.940
We could say, OK, in this case, well,
pointer will get list arrow next again.

00:38:25.940 --> 00:38:29.950
And what would pointer be now, though?

00:38:29.950 --> 00:38:33.250
If pointer is list arrow next,
what is pointer going to be?

00:38:36.490 --> 00:38:38.950
It will be the second node of
the node after the first one.

00:38:38.950 --> 00:38:40.970
But what's the actual value right now?

00:38:40.970 --> 00:38:41.470
Yeah.

00:38:41.470 --> 00:38:45.460
It'll be null, or [INAUDIBLE]
because notice how right now list

00:38:45.460 --> 00:38:48.220
is pointing to this
very, now, the first node

00:38:48.220 --> 00:38:50.900
in our list, whose next value is null.

00:38:50.900 --> 00:38:54.670
So if pointer equals list arrow
next, well, pointer will be null.

00:38:54.670 --> 00:38:56.870
OK, that seems OK for now.

00:38:56.870 --> 00:39:01.840
But if we wanted to kind of keep going,
let's say we free list, all right.

00:39:01.840 --> 00:39:04.150
And then we say, well,
list gets pointer.

00:39:04.150 --> 00:39:05.125
What is list now?

00:39:07.830 --> 00:39:11.705
There's nothing left, so list
would also be null, right?

00:39:11.705 --> 00:39:13.830
So what we've done, we've
kind of gone full circle.

00:39:13.830 --> 00:39:17.820
We've made-- we've gone from the very
first node to our very last node,

00:39:17.820 --> 00:39:20.880
and we've ended when
list equals null, when

00:39:20.880 --> 00:39:25.020
there's nothing left to free, back at,
for example, the same place we started.

00:39:25.020 --> 00:39:28.590
If we go back to our code, we could
see that the very place we began

00:39:28.590 --> 00:39:32.040
was with list equaling
no, no nodes in our list.

00:39:32.040 --> 00:39:38.760
And now, if we unload them, we also want
to end in that very same place, right?

00:39:38.760 --> 00:39:39.703
OK.

00:39:39.703 --> 00:39:42.870
So let's try implementing this in code
now so we can get rid of those memory

00:39:42.870 --> 00:39:43.370
errors.

00:39:43.370 --> 00:39:47.070
If we go down below here, we're going
to free all of our allocated nodes.

00:39:47.070 --> 00:39:49.570
And we know we want some
kind of loop to do this.

00:39:49.570 --> 00:39:51.598
But let's hold off on that for now.

00:39:51.598 --> 00:39:54.390
We know, first, we want to make
this new pointer to our next nodes.

00:39:54.390 --> 00:39:55.015
We can do this.

00:39:55.015 --> 00:40:00.180
Node star pointer equals whatever
list is pointing to and getting

00:40:00.180 --> 00:40:02.400
that next value there.

00:40:02.400 --> 00:40:06.230
Then we could safely free list because
pointers pointing the rest of our list.

00:40:06.230 --> 00:40:09.800
And, finally, we could
say, list updates to 0.2,

00:40:09.800 --> 00:40:12.610
what pointer is now pointing to.

00:40:12.610 --> 00:40:15.080
But how long do we want to do this?

00:40:15.080 --> 00:40:18.170
How do we know when to stop?

00:40:18.170 --> 00:40:21.120
Maybe in the chat?

00:40:21.120 --> 00:40:25.221
This seems good for one node,
but want to keep doing this.

00:40:25.221 --> 00:40:28.190
Yeah, we want to keep doing
it until our list is null.

00:40:28.190 --> 00:40:30.680
And we don't really know how
many times we want to iterate,

00:40:30.680 --> 00:40:32.630
so a while loop seems
like a good choice.

00:40:32.630 --> 00:40:35.880
A while loop we can do something
until something else is true.

00:40:35.880 --> 00:40:40.190
So we could say, while
list doesn't equal null.

00:40:40.190 --> 00:40:43.340
While we still have
nodes, essentially, let's

00:40:43.340 --> 00:40:47.960
go ahead and make sure that we
have a pointer to the next node.

00:40:47.960 --> 00:40:50.240
Let's free the current
node, and then let's

00:40:50.240 --> 00:40:53.280
update the current node to point
to whatever it was previously,

00:40:53.280 --> 00:40:54.590
the next one.

00:40:54.590 --> 00:40:57.503
So we could do something like this.

00:40:57.503 --> 00:41:00.170
And now, if we run our code, if
we go back to our terminal here,

00:41:00.170 --> 00:41:02.550
we'll make list.

00:41:02.550 --> 00:41:06.040
We'll run valgrind on ./list.

00:41:06.040 --> 00:41:12.990
And what do we get if we type in,
let's say, Hi! here, and then Hello?

00:41:12.990 --> 00:41:15.930
We see, successfully, all
heap blocks were freed.

00:41:15.930 --> 00:41:19.460
No leaks are possible down below here.

00:41:19.460 --> 00:41:23.880
So what questions are there on
this, if you look at our code

00:41:23.880 --> 00:41:25.290
for unloading here?

00:41:30.420 --> 00:41:32.640
Yeah, a good question,
is nil the same as no?

00:41:32.640 --> 00:41:36.210
So if we look at our terminal
and we actually run this again,

00:41:36.210 --> 00:41:40.800
call it a make list, I'll
run ./list, and I'll say Hi!

00:41:40.800 --> 00:41:44.280
And notice how this very first
node has the next pointer of nil.

00:41:44.280 --> 00:41:46.690
Nil is simply the terminal
representation of null.

00:41:46.690 --> 00:41:48.795
So nil means null in the terminal.

00:41:51.640 --> 00:41:53.260
All right.

00:41:53.260 --> 00:41:57.140
And now, I'm seeing another question,
is this a recursive function?

00:41:57.140 --> 00:42:00.370
So this still is probably
not a recursive function,

00:42:00.370 --> 00:42:04.720
because a recursive function would kind
of call itself like this function might

00:42:04.720 --> 00:42:07.520
then call unload somewhere in here.

00:42:07.520 --> 00:42:10.135
So it's not quite recursive,
but you are correct in noticing

00:42:10.135 --> 00:42:12.010
that this is a good
application for recursion

00:42:12.010 --> 00:42:16.220
because we're doing the very same
thing for every node as we go.

00:42:16.220 --> 00:42:18.670
So you could certainly
write this recursively.

00:42:18.670 --> 00:42:21.430
Here, we're going to
iteratively using a loop.

00:42:25.500 --> 00:42:26.520
OK.

00:42:26.520 --> 00:42:27.630
Other questions on this?

00:42:35.430 --> 00:42:35.930
OK.

00:42:35.930 --> 00:42:38.550
So let's keep going on.

00:42:38.550 --> 00:42:39.950
And we've seen linked lists now.

00:42:39.950 --> 00:42:41.810
We've seen these really
two core operations

00:42:41.810 --> 00:42:44.960
for linked lists, like, how
do we add some data to them,

00:42:44.960 --> 00:42:47.360
and how do we unload them?

00:42:47.360 --> 00:42:50.318
What you might ask, though,
is how can we speed this up?

00:42:50.318 --> 00:42:51.860
And that's where hash tables come in.

00:42:51.860 --> 00:42:55.037
So hash tables are simply
an array of linked lists.

00:42:55.037 --> 00:42:57.370
We take linked lists, we break
them into smaller pieces.

00:42:57.370 --> 00:42:59.120
And so let's get a
visual for that and see

00:42:59.120 --> 00:43:00.990
if we can make this run even faster.

00:43:00.990 --> 00:43:06.260
So if we go back to our slides here
and we see this idea of a linked list,

00:43:06.260 --> 00:43:07.370
we have maybe, Hey!

00:43:07.370 --> 00:43:08.060
Hello!

00:43:08.060 --> 00:43:08.840
Lo there!

00:43:08.840 --> 00:43:12.560
Different kind of activation
words for our voice assistant.

00:43:12.560 --> 00:43:15.890
Well, what we could do is try
to make sure we can access

00:43:15.890 --> 00:43:18.470
certain parts of this linked
list faster to make sure

00:43:18.470 --> 00:43:20.870
we can actually search
this linked list faster.

00:43:20.870 --> 00:43:22.580
And we could for that use a hash table.

00:43:22.580 --> 00:43:25.890
So recall from earlier that a
hash table looks a bit like this.

00:43:25.890 --> 00:43:28.310
We have some array on
the left-hand side.

00:43:28.310 --> 00:43:31.370
And we have different, what
we call, buckets in here,

00:43:31.370 --> 00:43:35.180
where a bucket might, in this
case, be a letter of the alphabet.

00:43:35.180 --> 00:43:38.750
And that letter of the alphabet is
maybe the first character in a word.

00:43:38.750 --> 00:43:43.850
You could have other buckets, other ways
of assigning these phrases to buckets.

00:43:43.850 --> 00:43:46.400
But for now, let's focus
on this alphabetical hash

00:43:46.400 --> 00:43:48.740
function here and
alphabetical hash table.

00:43:48.740 --> 00:43:52.640
So with this, we have different
letters for every hash bucket.

00:43:52.640 --> 00:44:00.560
And how does C know where in
the array H, I, J, K and L are?

00:44:00.560 --> 00:44:03.090
If we wanted to add some nodes
to a certain bucket, well,

00:44:03.090 --> 00:44:05.380
how do we figure out
where that bucket is?

00:44:05.380 --> 00:44:07.540
Anyone in the chat want
to chime in on that?

00:44:11.670 --> 00:44:12.170
Yes.

00:44:12.170 --> 00:44:13.400
So it seems to be that they're ordered.

00:44:13.400 --> 00:44:15.483
They're certainly in the
right alphabetical order.

00:44:15.483 --> 00:44:17.470
That's helpful.

00:44:17.470 --> 00:44:21.790
Other things, too, for how we know
where H is located, or L is located?

00:44:24.950 --> 00:44:28.350
Yes, we could use maybe
ASCII numbers and so on.

00:44:28.350 --> 00:44:30.560
And we're on the right track here.

00:44:30.560 --> 00:44:34.160
But because if you look at the
left-hand side, this is just an array,

00:44:34.160 --> 00:44:38.420
and we've kind of ourselves as
humans assigned those array buckets,

00:44:38.420 --> 00:44:41.030
those array segments, some letter.

00:44:41.030 --> 00:44:44.000
But we have to assign them some
number, really, some index.

00:44:44.000 --> 00:44:48.930
And so you might recall from very early
on, all arrays have these indices,

00:44:48.930 --> 00:44:49.430
right?

00:44:49.430 --> 00:44:52.340
So maybe zero to however
big your array is.

00:44:52.340 --> 00:44:56.540
And now we have, in this case,
maybe H corresponds to the 7 index

00:44:56.540 --> 00:45:00.410
in our table, or I corresponds
to the 8 index, and so on.

00:45:00.410 --> 00:45:04.550
And so in those indexes, we can
actually add in our linked list.

00:45:04.550 --> 00:45:09.110
And so here, if we notice
that H corresponds to 7,

00:45:09.110 --> 00:45:12.500
well, H is the eighth
letter of the alphabet.

00:45:12.500 --> 00:45:15.020
But if we were counting from
zero, it would be the seventh.

00:45:15.020 --> 00:45:19.100
So in this case, H corresponds to zero
as that seventh letter of the alphabet

00:45:19.100 --> 00:45:22.010
counting from zero here.

00:45:22.010 --> 00:45:26.340
That's how we know how we actually
add in these linked list nodes.

00:45:26.340 --> 00:45:31.060
Now, if I were to give you some new
phrase, not Hey! or Hello! or Lo there!

00:45:31.060 --> 00:45:35.290
How would you figure out
where to put it in this case?

00:45:35.290 --> 00:45:37.920
Maybe in the chat?

00:45:37.920 --> 00:45:42.500
If I gave you some new
phrase, what could you

00:45:42.500 --> 00:45:44.300
do to figure out where to put it?

00:45:49.560 --> 00:45:50.060
Yeah.

00:45:50.060 --> 00:45:51.730
So I'm seeing maybe at the start.

00:45:51.730 --> 00:45:54.140
I don't really know until
we see the phrase, right?

00:45:54.140 --> 00:45:57.450
And that's where this idea
of a hash function comes in.

00:45:57.450 --> 00:45:59.750
So in order to figure
out where Hey! and Hello!

00:45:59.750 --> 00:46:04.670
and Lo there! go, we need to have some
kind of function that takes our phrase

00:46:04.670 --> 00:46:07.320
and gives us back some number
that tells us where to put it.

00:46:07.320 --> 00:46:09.200
So here we have, maybe, the phrase Hey!

00:46:09.200 --> 00:46:11.150
And our hash function
should ideally give us

00:46:11.150 --> 00:46:14.910
some number that tells us
where to put that given phrase.

00:46:14.910 --> 00:46:17.960
So if I gave you not Hey!
or Hi! or Hello there,

00:46:17.960 --> 00:46:20.360
maybe I gave you the name Carter.

00:46:20.360 --> 00:46:23.630
Well, that should give us some
new number that then corresponds

00:46:23.630 --> 00:46:27.140
to an index in our array, or
we could start this new linked

00:46:27.140 --> 00:46:30.420
list that might have all
the C words, for example.

00:46:30.420 --> 00:46:33.140
So let's actually take a look
at this kind of in code now.

00:46:33.140 --> 00:46:38.990
If we go back to our Visual Studio code
over here, I'll open up this new file,

00:46:38.990 --> 00:46:40.250
this one called table.

00:46:40.250 --> 00:46:42.590
So I'll code table.c.

00:46:42.590 --> 00:46:44.310
And it's already a bit done for me.

00:46:44.310 --> 00:46:46.680
But notice how we have
a few different things.

00:46:46.680 --> 00:46:50.540
We have the same node structure from
before for making our linked lists.

00:46:50.540 --> 00:46:54.498
And now, we have this new
array of pointers to node.

00:46:54.498 --> 00:46:55.790
And we're calling this a table.

00:46:55.790 --> 00:46:59.570
And notice how we have
26 buckets, 26 places

00:46:59.570 --> 00:47:01.370
in which to have a linked list now.

00:47:01.370 --> 00:47:04.070
And we can figure out
where these new phrases

00:47:04.070 --> 00:47:09.330
should go by running this function
called hash that we'll see down below.

00:47:09.330 --> 00:47:10.650
So let's scroll down.

00:47:10.650 --> 00:47:13.100
We see that, here, we have
this function called hash.

00:47:13.100 --> 00:47:15.350
And hash is right now
just returning zero.

00:47:15.350 --> 00:47:18.290
Because if I give it a new phrase,
I haven't really finished this yet.

00:47:18.290 --> 00:47:20.660
I just want to return zero
for every phrase I get.

00:47:20.660 --> 00:47:24.590
And ideally, if I run this
program, I should type in a phrase

00:47:24.590 --> 00:47:29.330
and see the index that I should place
this phrase inside of my hash table,

00:47:29.330 --> 00:47:33.920
and then print out, OK, this phrase
hash is to whatever number I have.

00:47:33.920 --> 00:47:35.010
So let's see an example.

00:47:35.010 --> 00:47:38.820
So if I go to my terminal now,
I could make this full screen

00:47:38.820 --> 00:47:42.120
and I could simply type, make a table.

00:47:42.120 --> 00:47:46.200
And now, I'll run ./table,
and I'll type in Hi!

00:47:46.200 --> 00:47:48.330
And I get zero, which is maybe OK.

00:47:48.330 --> 00:47:50.400
But let's try now--

00:47:50.400 --> 00:47:54.830
let's try Lo there! as we
saw before, and we get zero.

00:47:54.830 --> 00:47:57.200
So why would this be bad?

00:47:57.200 --> 00:48:01.700
Besides it being unfinished, why
would we not want Hi! and Lo there!

00:48:01.700 --> 00:48:03.380
to kind of hash to that same value?

00:48:07.830 --> 00:48:09.090
Any ideas in the chat?

00:48:16.180 --> 00:48:16.680
Yeah.

00:48:16.680 --> 00:48:20.683
So I'm seeing that this is kind
of similar to a linked list,

00:48:20.683 --> 00:48:21.600
if we were to do this.

00:48:21.600 --> 00:48:23.220
Like, if we have--

00:48:23.220 --> 00:48:24.840
back to our visual here.

00:48:24.840 --> 00:48:30.527
If we have some array of linked
lists, but every word hashes to zero,

00:48:30.527 --> 00:48:32.985
well, we have all this other
space to put our linked lists,

00:48:32.985 --> 00:48:35.935
but we're just putting everything
in the zero bucket, which it makes--

00:48:35.935 --> 00:48:39.060
turns us back into this picture, which
is just a single linked list, right?

00:48:39.060 --> 00:48:42.098
The goal of this hash function is
to have many shorter linked lists,

00:48:42.098 --> 00:48:43.890
and that's the goal of
trying to put things

00:48:43.890 --> 00:48:46.210
in different buckets,
different indexes here.

00:48:46.210 --> 00:48:49.650
So ideally, we don't want to have
everything hash to the same value,

00:48:49.650 --> 00:48:52.620
we want to have different hash
values for different phrases

00:48:52.620 --> 00:48:54.630
that we might input.

00:48:54.630 --> 00:48:57.390
So one way to do this, as
we're seeing right now,

00:48:57.390 --> 00:49:03.120
is trying to make sure we have different
linked lists that depend on the very

00:49:03.120 --> 00:49:05.165
first letter of the phrase we add.

00:49:05.165 --> 00:49:08.290
And let's go back to our hash function
and figure out how we could do that.

00:49:08.290 --> 00:49:10.570
So let's exit our program here.

00:49:10.570 --> 00:49:12.310
And let's go ahead and go back here.

00:49:12.310 --> 00:49:18.780
And let's try to implement hash so that
we get a value 0-25 for a given phrase.

00:49:18.780 --> 00:49:19.780
So here, let's try this.

00:49:19.780 --> 00:49:24.030
Let's actually try
returning not just zero,

00:49:24.030 --> 00:49:29.010
but let's say, depending on the
very first letter of our phrase,

00:49:29.010 --> 00:49:31.790
we return that and see how that goes.

00:49:31.790 --> 00:49:34.720
So here we have returning
phrase, bracket zero.

00:49:34.720 --> 00:49:36.310
And we'll go back to our terminal now.

00:49:36.310 --> 00:49:44.680
We'll say make table, and then
./table, and let's try Hello!

00:49:44.680 --> 00:49:46.540
And we see 72.

00:49:46.540 --> 00:49:48.430
Is that-- I mean, it's better.

00:49:48.430 --> 00:49:49.970
Right now, I have actual number.

00:49:49.970 --> 00:49:54.510
And if I type in Carter, well, that's
a different number, which is good.

00:49:54.510 --> 00:49:57.460
And Lo there! perhaps, is
also a different number.

00:49:57.460 --> 00:49:59.610
So we're off to a good start.

00:49:59.610 --> 00:50:01.710
But what's wrong with this?

00:50:01.710 --> 00:50:06.240
If I get 72, and 67, and
76, I go back to my code

00:50:06.240 --> 00:50:09.570
and I look at my table, what's wrong?

00:50:13.800 --> 00:50:14.300
Yeah.

00:50:14.300 --> 00:50:16.500
So I'm seeing I only have 26 buckets.

00:50:16.500 --> 00:50:21.080
So if I wanted to, let's say, get
a phrase and put it in my bucket

00:50:21.080 --> 00:50:25.160
somewhere, well, I can't really do
that because maybe Hello! hashes to,

00:50:25.160 --> 00:50:26.540
what was it if we look at it?

00:50:26.540 --> 00:50:27.650
72.

00:50:27.650 --> 00:50:30.840
Well, that's not a
valid index in my table.

00:50:30.840 --> 00:50:33.830
So I need to actually make
this shorter a little bit.

00:50:33.830 --> 00:50:37.520
And one way to do that is
maybe just subtract A from it.

00:50:37.520 --> 00:50:38.970
So I could do something like this.

00:50:38.970 --> 00:50:41.540
And what this will do
is, maybe if you recall

00:50:41.540 --> 00:50:47.290
how A maps to 65, well, if I get A
as the very first character, well,

00:50:47.290 --> 00:50:49.850
what is 65?

00:50:49.850 --> 00:50:52.100
A minus A is zero.

00:50:52.100 --> 00:50:54.770
If I get B, B is 66 in ASCII.

00:50:54.770 --> 00:50:57.560
So B minus A is 1.

00:50:57.560 --> 00:51:02.000
And if I get all the way down to Z,
well, Z minus a is going to be 25.

00:51:02.000 --> 00:51:04.740
And if I wrap back around to A,
well, that's just zero again.

00:51:04.740 --> 00:51:07.580
So what I've done now is I've
made sure that every index I get

00:51:07.580 --> 00:51:11.940
is going to be inside the kind
of length of my table here.

00:51:11.940 --> 00:51:15.500
I won't get any indexes that
are out of bounds for this list.

00:51:15.500 --> 00:51:21.330
And now, if I kind of go back here,
we run this code, I'll do make table.

00:51:21.330 --> 00:51:26.580
I'll run ./table, and I'll do Hello!
now, and I get 7, which is better.

00:51:26.580 --> 00:51:28.470
Now, I'll do maybe Lo there!

00:51:28.470 --> 00:51:29.490
and I get 11.

00:51:29.490 --> 00:51:33.670
And now, I'll type in just Z to
see how that works, and I get 25.

00:51:33.670 --> 00:51:35.550
So that seems to be
pretty good for myself.

00:51:35.550 --> 00:51:39.520
But what still am I missing here?

00:51:39.520 --> 00:51:42.340
What could go wrong?

00:51:42.340 --> 00:51:43.060
Any ideas?

00:51:49.220 --> 00:51:49.720
Yes.

00:51:49.720 --> 00:51:52.240
So lowercased letters--
right now, I'm working

00:51:52.240 --> 00:51:53.830
with phrases that are all capital.

00:51:53.830 --> 00:51:56.538
But let's say someone wants to
add a phrase that is in lowercase.

00:51:56.538 --> 00:52:00.230
So I could actually open a
terminal again, make table.

00:52:00.230 --> 00:52:04.740
I'll do ./table, and I'll do
maybe hello! in lowercase.

00:52:04.740 --> 00:52:07.920
And I get 39, which
isn't quite what I want.

00:52:07.920 --> 00:52:09.800
So what could I do instead?

00:52:09.800 --> 00:52:13.538
Maybe any ideas in the chat?

00:52:13.538 --> 00:52:14.455
What can I do instead?

00:52:19.360 --> 00:52:21.920
I could, perhaps, maybe
standardize on capital letters.

00:52:21.920 --> 00:52:25.930
So I could say, I want to make whatever
this first letter is uppercase,

00:52:25.930 --> 00:52:29.890
and then subtract that capital A. So if
I go back to my terminal, I could say,

00:52:29.890 --> 00:52:32.520
make table.

00:52:32.520 --> 00:52:34.050
I could say ./table.

00:52:34.050 --> 00:52:38.310
And now I want for it to be the case
that lowercase hello and capital

00:52:38.310 --> 00:52:40.060
Hello! hash that same value.

00:52:40.060 --> 00:52:45.430
So I'll type hello here, and capital H
Hello! here, and I get 7 in both cases.

00:52:45.430 --> 00:52:48.550
So that seems to work for me now.

00:52:48.550 --> 00:52:51.270
All right.

00:52:51.270 --> 00:52:56.240
Any other-- if we go back
to our terminal here,

00:52:56.240 --> 00:52:59.540
that will tell us where
to put individual phrases.

00:52:59.540 --> 00:53:04.190
So questions on this
hash function so far?

00:53:12.760 --> 00:53:13.260
All right.

00:53:15.990 --> 00:53:18.520
So not seeing too many questions here.

00:53:18.520 --> 00:53:19.820
Why don't we do this?

00:53:19.820 --> 00:53:24.490
So let's go ahead and think about
what makes for a good hash function?

00:53:24.490 --> 00:53:29.490
So if we think about this as
our very first hash function,

00:53:29.490 --> 00:53:33.418
this idea of putting nodes
into individual buckets,

00:53:33.418 --> 00:53:35.710
well, it's not going to be
quite what we want it to be.

00:53:35.710 --> 00:53:37.627
And if we actually go
through down here and we

00:53:37.627 --> 00:53:39.420
think about how could
we try to make sure

00:53:39.420 --> 00:53:42.180
we actually better hash function
than this, what we could do

00:53:42.180 --> 00:53:45.640
is actually make sure we have a good
distribution of values across it.

00:53:45.640 --> 00:53:48.510
So if you think about
our buckets here, we

00:53:48.510 --> 00:53:51.930
want to try to have-- to make sure
that every node in our linked list

00:53:51.930 --> 00:53:54.510
is kind of going to go
in every bucket here.

00:53:54.510 --> 00:53:58.050
Like, we actually have a lot of I words,
a lot of J words, a lot of K words.

00:53:58.050 --> 00:54:01.110
And here, the way we hash this
isn't going to quite do that for us.

00:54:01.110 --> 00:54:04.350
We're not going to actually have any
words that start with I, or J, or K,

00:54:04.350 --> 00:54:09.130
so we probably think of a new
hash function to use in this case.

00:54:09.130 --> 00:54:13.183
And if we go back over here, we
can think of some good points

00:54:13.183 --> 00:54:14.100
for our hash function.

00:54:14.100 --> 00:54:18.000
We can think of, OK, we always try to
get the same value for a given input.

00:54:18.000 --> 00:54:20.880
We always want to produce an even
distribution across the buckets,

00:54:20.880 --> 00:54:23.970
and we always want to make sure that
we use all the buckets that we're

00:54:23.970 --> 00:54:26.300
given in this case.

00:54:26.300 --> 00:54:28.488
So these are some ideas
for a good hash function.

00:54:28.488 --> 00:54:30.280
Here, what we'll actually
do is let you all

00:54:30.280 --> 00:54:34.927
go off and write your own hash functions
to go off and build your own overall.

00:54:34.927 --> 00:54:36.760
So if we think of some
other ideas here, you

00:54:36.760 --> 00:54:40.150
could use the length of the phrases,
you could make sure you maybe

00:54:40.150 --> 00:54:42.170
add up their values, and so on.

00:54:42.170 --> 00:54:44.830
But what we could do is
go ahead and have you all

00:54:44.830 --> 00:54:47.920
do that for the rest of this week now.

00:54:47.920 --> 00:54:51.940
And I think that will probably
conclude us for this section.

00:54:51.940 --> 00:54:53.690
It is so good to see you all here today.

00:54:53.690 --> 00:54:54.898
Thank you all for joining us.

00:54:54.898 --> 00:54:57.660
We will see you next week.