WEBVTT
X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:900000

00:00:00.000 --> 00:00:03.486
[MUSIC PLAYING]

00:01:07.345 --> 00:01:10.960
TOM CRUISE: I'm going
to show you some magic.

00:01:10.960 --> 00:01:12.250
It's the real thing.

00:01:12.250 --> 00:01:14.420
[LAUGHTER]

00:01:14.420 --> 00:01:24.340
I mean, it's all the real thing.

00:01:24.340 --> 00:01:26.270
[LAUGHTER]

00:01:26.270 --> 00:01:27.410
DAVID J. MALAN: All right.

00:01:27.410 --> 00:01:30.950
This is CS50, Harvard
University's Introduction

00:01:30.950 --> 00:01:33.140
to the Intellectual
Enterprises of Computer Science

00:01:33.140 --> 00:01:34.430
and the Art of Programming.

00:01:34.430 --> 00:01:37.760
My name is David Malan, and this
is our family-friendly introduction

00:01:37.760 --> 00:01:41.780
to artificial intelligence or AI, which
seems to be everywhere these days.

00:01:41.780 --> 00:01:45.140
But first, a word on these
rubber ducks, which your students

00:01:45.140 --> 00:01:46.487
might have had for some time.

00:01:46.487 --> 00:01:49.320
Within the world of computer science,
and programming in particular,

00:01:49.320 --> 00:01:52.145
there's this notion of rubber
duck debugging or rubber ducking--

00:01:52.145 --> 00:01:57.080
--whereby in the absence of a colleague,
a friend, a family member, a teaching

00:01:57.080 --> 00:02:00.120
fellow who might be able to answer
your questions about your code,

00:02:00.120 --> 00:02:02.210
especially when it's
not working, ideally you

00:02:02.210 --> 00:02:04.940
might have at least a rubber
duck or really any inanimate

00:02:04.940 --> 00:02:07.550
object on your desk with whom to talk.

00:02:07.550 --> 00:02:11.243
And the idea is, that in expressing your
logic, talking through your problems,

00:02:11.243 --> 00:02:13.160
even though the duck
doesn't actually respond,

00:02:13.160 --> 00:02:16.250
invariably, you hear eventually
the illogic in your thoughts

00:02:16.250 --> 00:02:18.110
and the proverbial light bulb goes off.

00:02:18.110 --> 00:02:20.900
Now, for students online
for some time, CS50

00:02:20.900 --> 00:02:23.370
has had a digital
version thereof, whereby

00:02:23.370 --> 00:02:25.945
in the programming environment
that CS50 students use,

00:02:25.945 --> 00:02:29.070
for the past several years, if they
don't have a rubber duck on their desk,

00:02:29.070 --> 00:02:30.790
they can pull up this interface here.

00:02:30.790 --> 00:02:32.850
And if they begin a
conversation like, I'm

00:02:32.850 --> 00:02:35.850
hoping you can help me solve
some problem, up until recently,

00:02:35.850 --> 00:02:39.640
CS50's virtual rubber duck
would simply quack once, twice,

00:02:39.640 --> 00:02:41.010
or three times in total.

00:02:41.010 --> 00:02:43.380
But we have anecdotal
evidence that alone

00:02:43.380 --> 00:02:47.010
was enough to get students to realize
what it is they were doing wrong.

00:02:47.010 --> 00:02:51.090
But of course, more recently has
this duck and so many other ducks,

00:02:51.090 --> 00:02:53.340
so to speak, around the
world, come to life really.

00:02:53.340 --> 00:02:56.310
And your students have been
using artificial intelligence

00:02:56.310 --> 00:03:00.090
in some form within CS50 as
a virtual teaching assistant.

00:03:00.090 --> 00:03:02.130
And what we'll do today,
is reveal not only

00:03:02.130 --> 00:03:05.370
how we've been using and
leveraging AI within CS50,

00:03:05.370 --> 00:03:10.530
but also how AI itself works, and to
prepare you better for the years ahead.

00:03:10.530 --> 00:03:14.910
So last year around this time,
like DALL-E 2 and image generation

00:03:14.910 --> 00:03:15.870
were all of the rage.

00:03:15.870 --> 00:03:18.600
You might have played with this, whereby
you can type in some keywords and boom,

00:03:18.600 --> 00:03:20.640
you have a dynamically generated image.

00:03:20.640 --> 00:03:24.240
Similar tools are like Midjourney,
which gives you even more realistic 3D

00:03:24.240 --> 00:03:24.960
imagery.

00:03:24.960 --> 00:03:27.840
And within that world
of image generation,

00:03:27.840 --> 00:03:32.370
there were nonetheless some tells,
like an observant viewer could tell

00:03:32.370 --> 00:03:34.768
that this was probably generated by AI.

00:03:34.768 --> 00:03:36.810
And in fact, a few months
ago, The New York Times

00:03:36.810 --> 00:03:38.470
took a look at some of these tools.

00:03:38.470 --> 00:03:41.550
And so, for instance, here
is a sequence of images

00:03:41.550 --> 00:03:44.350
that at least at left, isn't
all that implausible that this

00:03:44.350 --> 00:03:45.600
might be an actual photograph.

00:03:45.600 --> 00:03:48.000
But in fact, all three of
these are AI-generated.

00:03:48.000 --> 00:03:50.910
And for some time, there
was a certain tell.

00:03:50.910 --> 00:03:54.600
Like AI up until recently, really
wasn't really good at the finer details,

00:03:54.600 --> 00:03:57.120
like the fingers are not quite right.

00:03:57.120 --> 00:03:58.950
And so you could have that sort of hint.

00:03:58.950 --> 00:04:01.470
But I dare say, AI is getting
even better and better,

00:04:01.470 --> 00:04:04.420
such that it's getting harder to
discern these kinds of things.

00:04:04.420 --> 00:04:06.930
So if you haven't already, go
ahead and take out your phone

00:04:06.930 --> 00:04:08.190
if you have one with you.

00:04:08.190 --> 00:04:11.680
And if you'd like to partake,
scan this barcode here,

00:04:11.680 --> 00:04:13.830
which will lead you to a URL.

00:04:13.830 --> 00:04:17.339
And on your screen, you'll have an
opportunity in a moment to buzz in.

00:04:17.339 --> 00:04:20.310
If my colleague, Rongxin, wouldn't
mind joining me up here on stage.

00:04:20.310 --> 00:04:22.560
We'll ask you a sequence of
questions and see just how

00:04:22.560 --> 00:04:25.480
prepared you are for
this coming world of AI.

00:04:25.480 --> 00:04:27.823
So for instance, once
you've got this here,

00:04:27.823 --> 00:04:29.490
code scanned, if you don't, that's fine.

00:04:29.490 --> 00:04:32.880
You can play along at home or
alongside the person next to you.

00:04:32.880 --> 00:04:34.920
Here are two images.

00:04:34.920 --> 00:04:38.400
And my question for you is,
which of these two images, left

00:04:38.400 --> 00:04:42.610
or right, was generated by AI?

00:04:42.610 --> 00:04:49.740
Which of these two was
generated by AI, left or right?

00:04:49.740 --> 00:04:51.780
And I think Rongxin, we
can flip over and see

00:04:51.780 --> 00:04:53.970
as the responses start to come in.

00:04:53.970 --> 00:04:58.740
So far, we're about 20% saying
left, 70 plus percent saying right.

00:04:58.740 --> 00:05:02.272
3%, 4%, comfortably admitting
unsure, and that's fine.

00:05:02.272 --> 00:05:04.230
Let's wait for a few more
responses to come in,

00:05:04.230 --> 00:05:06.837
though I think the
right-hand folks have it.

00:05:06.837 --> 00:05:09.420
And let's go ahead and flip back
and see what the solution is.

00:05:09.420 --> 00:05:14.020
In this case, it was, in fact, the
right-hand side that was AI-generated.

00:05:14.020 --> 00:05:15.127
So, that's great.

00:05:15.127 --> 00:05:17.460
I'm not sure what it means
that we figured this one out,

00:05:17.460 --> 00:05:19.350
but let's try one more here.

00:05:19.350 --> 00:05:22.558
So let me propose that we
consider now these two images.

00:05:22.558 --> 00:05:23.350
It's the same code.

00:05:23.350 --> 00:05:25.680
So if you still have your phone
up, you don't need to scan again.

00:05:25.680 --> 00:05:27.250
It's going to be the same URL here.

00:05:27.250 --> 00:05:28.650
But just in case you closed it.

00:05:28.650 --> 00:05:30.990
Let's take a look now
at these two images.

00:05:30.990 --> 00:05:35.040
Which of these, left or
right, was AI-generated?

00:05:35.040 --> 00:05:38.802
Left or right this time?

00:05:38.802 --> 00:05:41.010
Rongxin, should we take a
look at how it's coming in?

00:05:41.010 --> 00:05:42.570
Oh, it's a little closer this time.

00:05:42.570 --> 00:05:44.540
Left or right?

00:05:44.540 --> 00:05:46.830
Right's losing a little
ground, maybe as people

00:05:46.830 --> 00:05:48.930
are changing their answers to left.

00:05:48.930 --> 00:05:52.510
More people are unsure this time,
which is somewhat revealing.

00:05:52.510 --> 00:05:54.790
Let's give folks another second or two.

00:05:54.790 --> 00:05:57.200
And Rongxin, should we flip back?

00:05:57.200 --> 00:06:00.760
The answer is, actually a trick
question, since they were both AI.

00:06:00.760 --> 00:06:04.120
So most of you, most of
you were, in fact, right.

00:06:04.120 --> 00:06:08.150
But if you take a glance at this,
is getting really, really good.

00:06:08.150 --> 00:06:13.220
And so this is just a taste of the
images that we might see down the line.

00:06:13.220 --> 00:06:16.930
And in fact, that video
with which we began,

00:06:16.930 --> 00:06:20.440
Tom Cruise, as you might have
gleaned, was not, in fact, Tom Cruise.

00:06:20.440 --> 00:06:22.810
That was an example of
a deepfake, a video that

00:06:22.810 --> 00:06:26.500
was synthesized, whereby a different
human was acting out those motions,

00:06:26.500 --> 00:06:31.660
saying those words, but software,
artificial intelligence-inspired

00:06:31.660 --> 00:06:35.380
software was mutating the actual
image and faking this video.

00:06:35.380 --> 00:06:38.950
So it's all fun and games for now as
we tinker with these kinds of examples,

00:06:38.950 --> 00:06:43.000
but suffice it to say, as we've begun
to discuss in classes like this already,

00:06:43.000 --> 00:06:46.240
disinformation is only going to become
more challenging in a world where

00:06:46.240 --> 00:06:47.920
it's not just text, but it's imagery.

00:06:47.920 --> 00:06:49.452
And all the more, soon video.

00:06:49.452 --> 00:06:51.910
But for today, we'll focus
really on the fundamentals, what

00:06:51.910 --> 00:06:56.230
it is that's enabling technologies like
these, and even more familiarly, text

00:06:56.230 --> 00:06:57.970
generation, which is all the rage.

00:06:57.970 --> 00:07:01.240
And in fact, it seems just a few months
ago, probably everyone in this room

00:07:01.240 --> 00:07:04.030
started to hear about
tools like ChatGPT.

00:07:04.030 --> 00:07:06.800
So we thought we'd do one
final exercise here as a group.

00:07:06.800 --> 00:07:08.800
And this was another piece
in The New York Times

00:07:08.800 --> 00:07:11.590
where they asked the audience,
"Did a fourth grader write this?

00:07:11.590 --> 00:07:12.850
Or the new chatbot?"

00:07:12.850 --> 00:07:15.640
So another opportunity to
assess your discerning skills.

00:07:15.640 --> 00:07:16.450
So same URL.

00:07:16.450 --> 00:07:19.840
So if you still have your phone
open and that same interface open,

00:07:19.840 --> 00:07:21.470
you're in the right place.

00:07:21.470 --> 00:07:25.480
And here, we'll take a final
stab at two essays of sorts.

00:07:25.480 --> 00:07:30.020
Which of these essays was written by AI?

00:07:30.020 --> 00:07:32.260
Essay 1 or Essay 2?

00:07:32.260 --> 00:07:34.450
And as folks buzz in,
I'll read the first.

00:07:34.450 --> 00:07:35.020
Essay 1.

00:07:35.020 --> 00:07:37.870
I like to bring a yummy sandwich
and a cold juice box for lunch.

00:07:37.870 --> 00:07:41.860
Sometimes I'll even pack a tasty piece
of fruit or a bag of crunchy chips.

00:07:41.860 --> 00:07:46.090
As we eat, we chat and laugh and catch
up on each other's day, dot, dot, dot.

00:07:46.090 --> 00:07:46.690
Essay 2.

00:07:46.690 --> 00:07:49.243
My mother packs me a sandwich,
a drink, fruit, and a treat.

00:07:49.243 --> 00:07:51.910
When I get in the lunchroom, I
find an empty table and sit there

00:07:51.910 --> 00:07:52.930
and I eat my lunch.

00:07:52.930 --> 00:07:54.820
My friends come and sit down with me.

00:07:54.820 --> 00:07:55.790
Dot, dot, dot.

00:07:55.790 --> 00:07:57.550
Rongxin, should we see what folks think?

00:07:57.550 --> 00:08:03.040
It looks like most of you think
that Essay 1 was generated by AI.

00:08:03.040 --> 00:08:09.010
And in fact, if we flip back to the
answer here, it was, in fact, Essay 1.

00:08:09.010 --> 00:08:13.060
So it's great that we now already
have seemingly this discerning eye,

00:08:13.060 --> 00:08:15.880
but let me perhaps
deflate that enthusiasm

00:08:15.880 --> 00:08:20.120
by saying it's only going to get
harder to discern one from the other.

00:08:20.120 --> 00:08:23.680
And we're really now on the bleeding
edge of what's soon to be possible.

00:08:23.680 --> 00:08:25.990
But most everyone in this
room has probably by now

00:08:25.990 --> 00:08:31.450
seen, tried, certainly heard of ChatGPT,
which is all about textual generation.

00:08:31.450 --> 00:08:34.210
Within CS50 and within
academia more generally,

00:08:34.210 --> 00:08:37.690
have we been thinking about, talking
about, how whether to use or not

00:08:37.690 --> 00:08:39.023
use these kinds of technologies.

00:08:39.023 --> 00:08:42.148
And if the students in the room haven't
told the family members in the room

00:08:42.148 --> 00:08:45.010
already, this here is an excerpt
from CS50's own syllabus this year,

00:08:45.010 --> 00:08:48.730
whereby we have deemed tools like
ChatGPT in their current form,

00:08:48.730 --> 00:08:49.808
just too helpful.

00:08:49.808 --> 00:08:51.850
Sort of like an overzealous
friend who in school,

00:08:51.850 --> 00:08:55.520
who just wants to give you all of the
answers instead of leading you to them.

00:08:55.520 --> 00:09:00.760
And so we simply prohibit by
policy using AI-based software,

00:09:00.760 --> 00:09:05.200
such as ChatGPT, third-party tools like
GitHub Copilot, Bing Chat, and others

00:09:05.200 --> 00:09:08.920
that suggests or completes answers
to questions or lines of code.

00:09:08.920 --> 00:09:13.510
But it would seem reactionary to
take away what technology surely has

00:09:13.510 --> 00:09:15.400
some potential upsides for education.

00:09:15.400 --> 00:09:18.460
And so within CS50 this semester,
as well as this past summer,

00:09:18.460 --> 00:09:22.300
have we allowed students to use
CS50's own AI-based software, which

00:09:22.300 --> 00:09:24.490
are in effect, as we'll
discuss, built on top

00:09:24.490 --> 00:09:27.700
of these third-party
tools, ChatGPT from OpenAI,

00:09:27.700 --> 00:09:29.440
companies like Microsoft and beyond.

00:09:29.440 --> 00:09:33.820
And in fact, what students can now
use, is this brought to life CS50 duck,

00:09:33.820 --> 00:09:37.270
or DDB, Duck Debugger,
within a website of our own,

00:09:37.270 --> 00:09:41.230
CS50 AI, and another that your
students know known as cs50.dev.

00:09:41.230 --> 00:09:43.210
So students are using
it, but in a way where

00:09:43.210 --> 00:09:46.120
we have tempered the enthusiasm
of what might otherwise

00:09:46.120 --> 00:09:48.370
be an overly helpful
duck to model it more

00:09:48.370 --> 00:09:50.480
akin to a good teacher,
a good teaching fellow,

00:09:50.480 --> 00:09:54.140
who might guide you to the answers,
but not simply hand them outright.

00:09:54.140 --> 00:09:57.170
So what does that actually mean, and
in what form does this duck come?

00:09:57.170 --> 00:09:59.960
Well, architecturally, for those of
you with engineering backgrounds that

00:09:59.960 --> 00:10:02.293
might be curious as to how
this is actually implemented,

00:10:02.293 --> 00:10:06.260
if a student here in the class has
a question, virtually in this case,

00:10:06.260 --> 00:10:10.820
they somehow ask these questions of
this central web application, cs50.ai.

00:10:10.820 --> 00:10:13.760
But we, in turn, have
built much of our own logic

00:10:13.760 --> 00:10:18.050
on top of third-party services, known
as APIs, application programming

00:10:18.050 --> 00:10:20.780
interfaces, features that
other companies provide

00:10:20.780 --> 00:10:22.530
that people like us can use.

00:10:22.530 --> 00:10:25.250
So as they are doing really
a lot of the heavy lifting,

00:10:25.250 --> 00:10:27.380
the so-called large
language models are there.

00:10:27.380 --> 00:10:30.350
But we, too, have information
that is not in these models yet.

00:10:30.350 --> 00:10:32.720
For instance, the words
that came out of my mouth

00:10:32.720 --> 00:10:36.500
just last week when we had a lecture
on some other topic, not to mention all

00:10:36.500 --> 00:10:39.270
of the past lectures and homework
assignments from this year.

00:10:39.270 --> 00:10:41.510
So we have our own
vector database locally

00:10:41.510 --> 00:10:44.570
via which we can search for
more recent information,

00:10:44.570 --> 00:10:47.900
and then hand some of that information
into these models, which you might

00:10:47.900 --> 00:10:51.870
recall, at least for OpenAI,
is cut off as of 2021 as

00:10:51.870 --> 00:10:54.240
of now, to make the
information even more current.

00:10:54.240 --> 00:10:56.590
So architecturally,
that's sort of the flow.

00:10:56.590 --> 00:10:58.980
But for now, I thought I'd
share at a higher level what

00:10:58.980 --> 00:11:01.440
it is your students are
already familiar with,

00:11:01.440 --> 00:11:04.230
and what will soon be more broadly
available to our own students

00:11:04.230 --> 00:11:05.650
online as well.

00:11:05.650 --> 00:11:08.190
So what we focused on
is, what's generally

00:11:08.190 --> 00:11:11.820
now known as prompt engineering,
which isn't really a technical phrase,

00:11:11.820 --> 00:11:14.500
because it's not so much engineering
in the traditional sense.

00:11:14.500 --> 00:11:16.650
It really is just English,
what we are largely

00:11:16.650 --> 00:11:20.520
doing when it comes to
giving the AI the personality

00:11:20.520 --> 00:11:22.800
of a good teacher or a good duck.

00:11:22.800 --> 00:11:26.460
So what we're doing, is giving it what's
known as a system prompt nowadays,

00:11:26.460 --> 00:11:31.020
whereby we write some English sentences,
send those English sentences to OpenAI

00:11:31.020 --> 00:11:34.560
or Microsoft, that sort of
teaches it how to behave.

00:11:34.560 --> 00:11:36.930
Not just using its own
knowledge out of the box,

00:11:36.930 --> 00:11:40.290
but coercing it to behave a little
more educationally constructively.

00:11:40.290 --> 00:11:42.720
And so for instance, a
representative snippet

00:11:42.720 --> 00:11:44.622
of English that we
provide to these services

00:11:44.622 --> 00:11:46.080
looks a little something like this.

00:11:46.080 --> 00:11:50.600
Quote, unquote, "You are a friendly and
supportive teaching assistant for CS50.

00:11:50.600 --> 00:11:52.520
You are also a rubber duck.

00:11:52.520 --> 00:11:57.080
You answer student questions only about
CS50 and the field of computer science,

00:11:57.080 --> 00:11:59.900
do not answer questions
about unrelated topics.

00:11:59.900 --> 00:12:02.060
Do not provide full
answers to problem sets,

00:12:02.060 --> 00:12:04.130
as this would violate academic honesty.

00:12:04.130 --> 00:12:07.610
And so in essence, and you can
do this manually with ChatGPT,

00:12:07.610 --> 00:12:09.990
you can tell it or ask it how to behave.

00:12:09.990 --> 00:12:11.910
We, essentially, are
doing this automatically,

00:12:11.910 --> 00:12:14.240
so that it doesn't just
hand answers out of the box

00:12:14.240 --> 00:12:16.310
and knows a little
something more about us.

00:12:16.310 --> 00:12:19.310
There's also in this world of AI
right now the notion of a user

00:12:19.310 --> 00:12:21.380
prompt versus that system prompt.

00:12:21.380 --> 00:12:25.060
And the user prompt, in our case, is
essentially the student's own question.

00:12:25.060 --> 00:12:29.630
I have a question about x, or I have
a problem with my code here in y,

00:12:29.630 --> 00:12:32.720
so we pass to those same
APIs, students' own questions

00:12:32.720 --> 00:12:34.670
as part of this so-called user prompt.

00:12:34.670 --> 00:12:37.490
Just so you're familiar now with
some of the vernacular of late.

00:12:37.490 --> 00:12:39.200
Now, the programming
environment that students

00:12:39.200 --> 00:12:41.575
have been using this whole
year is known as Visual Studio

00:12:41.575 --> 00:12:45.260
Code, a popular open source,
free product, that most--

00:12:45.260 --> 00:12:47.450
so many engineers around
the world now use.

00:12:47.450 --> 00:12:50.580
But we've instrumented it to be
a little more course-specific

00:12:50.580 --> 00:12:55.830
with some course-specific features that
make learning within this environment

00:12:55.830 --> 00:12:57.900
all the easier.

00:12:57.900 --> 00:12:59.220
It lives at cs50.dev.

00:12:59.220 --> 00:13:02.370
And as students in this
room know, that as of now,

00:13:02.370 --> 00:13:04.650
the virtual duck lives
within this environment

00:13:04.650 --> 00:13:07.540
and can do things like explain
highlighted lines of code.

00:13:07.540 --> 00:13:10.560
So here, for instance, is a screenshot
of this programming environment.

00:13:10.560 --> 00:13:14.550
Here is some arcane looking code in
a language called C, that we've just

00:13:14.550 --> 00:13:16.082
left behind us in the class.

00:13:16.082 --> 00:13:19.290
And suppose that you don't understand
what one or more of these lines of code

00:13:19.290 --> 00:13:19.790
do.

00:13:19.790 --> 00:13:23.580
Students can now highlight those lines,
right-click or Control click on it,

00:13:23.580 --> 00:13:26.440
select explain highlighted
code, and voila,

00:13:26.440 --> 00:13:32.040
they see a ChatGPT-like explanation of
that very code within a second or so,

00:13:32.040 --> 00:13:35.100
that no human has typed out, but
that's been dynamically generated

00:13:35.100 --> 00:13:36.660
based on this code.

00:13:36.660 --> 00:13:39.450
Other things that the duck
can now do for students

00:13:39.450 --> 00:13:42.960
is advise students on how to improve
their code style, the aesthetics,

00:13:42.960 --> 00:13:44.260
the formatting thereof.

00:13:44.260 --> 00:13:47.280
And so for instance, here is
similar code in a language called C.

00:13:47.280 --> 00:13:48.990
And I'll stipulate that it's very messy.

00:13:48.990 --> 00:13:51.840
Everything is left-aligned
instead of nicely indented,

00:13:51.840 --> 00:13:53.490
so it looks a little more structured.

00:13:53.490 --> 00:13:54.870
Students can now click a button.

00:13:54.870 --> 00:13:56.820
They'll see at the
right-hand side in green

00:13:56.820 --> 00:13:58.650
how their code should ideally look.

00:13:58.650 --> 00:14:01.470
And if they're not quite sure
what those changes are or why,

00:14:01.470 --> 00:14:03.150
they can click on, explain changes.

00:14:03.150 --> 00:14:06.180
And similarly, the duck
advises them on how and why

00:14:06.180 --> 00:14:08.970
to turn their not great
code into greater code,

00:14:08.970 --> 00:14:11.250
from left to right respectively.

00:14:11.250 --> 00:14:15.450
More compellingly and more generalizable
beyond CS50 and beyond computer

00:14:15.450 --> 00:14:19.080
science, is AI's ability to
answer most of the questions

00:14:19.080 --> 00:14:20.820
that students might now ask online.

00:14:20.820 --> 00:14:24.540
And we've been doing asynchronous Q&A
for years via various mobile or web

00:14:24.540 --> 00:14:25.710
applications and the like.

00:14:25.710 --> 00:14:28.680
But to date, it has been
humans, myself included,

00:14:28.680 --> 00:14:30.780
responding to all of those questions.

00:14:30.780 --> 00:14:34.650
Now the duck has an opportunity to chime
in, generally within three seconds,

00:14:34.650 --> 00:14:37.260
because we've integrated
it into an online Q&A tool

00:14:37.260 --> 00:14:40.960
that students in CS50 and elsewhere
across Harvard have long used.

00:14:40.960 --> 00:14:44.370
So here's an anonymized screenshot
of a question from an actual student,

00:14:44.370 --> 00:14:47.370
but written here as John
Harvard, who asked this summer,

00:14:47.370 --> 00:14:50.150
in the summer version of
CS50, what is flask exactly?

00:14:50.150 --> 00:14:51.920
So fairly definitional question.

00:14:51.920 --> 00:14:55.250
And here is what the duck spit
out, thanks to that architecture

00:14:55.250 --> 00:14:56.510
I described before.

00:14:56.510 --> 00:14:59.210
I'll stipulate that this is
correct, but it is mostly

00:14:59.210 --> 00:15:02.820
a definition, akin to what Google or
Bing could already give you last year.

00:15:02.820 --> 00:15:04.940
But here's a more nuanced
question, for instance,

00:15:04.940 --> 00:15:06.800
from another anonymized student.

00:15:06.800 --> 00:15:10.160
In this question here, the
student's including an error message

00:15:10.160 --> 00:15:11.000
that they're seeing.

00:15:11.000 --> 00:15:12.650
They're asking about that.

00:15:12.650 --> 00:15:15.890
And they're asking a little more
broadly and qualitatively, is there

00:15:15.890 --> 00:15:19.640
a more efficient way to write this
code, a question that really is best

00:15:19.640 --> 00:15:21.620
answered based on experience.

00:15:21.620 --> 00:15:25.130
Here, I'll stipulate that the duck
responded with this answer, which

00:15:25.130 --> 00:15:26.480
is actually pretty darn good.

00:15:26.480 --> 00:15:29.630
Not only responding in English,
but with some sample starter code

00:15:29.630 --> 00:15:31.430
that would make sense in this context.

00:15:31.430 --> 00:15:34.580
And at the bottom it's worth noting,
because none of this technology

00:15:34.580 --> 00:15:37.850
is perfect just yet, it's still
indeed very bleeding edge,

00:15:37.850 --> 00:15:41.960
and so what we have chosen to do within
CS50 is include disclaimers, like this.

00:15:41.960 --> 00:15:44.090
I am an experimental bot, quack.

00:15:44.090 --> 00:15:46.820
Do not assume that my reply is
accurate unless you see that it's

00:15:46.820 --> 00:15:50.040
been endorsed by humans, quack.

00:15:50.040 --> 00:15:53.160
And in fact, at top right, the
mechanism we've been using in this tool

00:15:53.160 --> 00:15:54.510
is usually within minutes.

00:15:54.510 --> 00:15:57.690
A human, whether it's a teaching
fellow, a course assistant, or myself,

00:15:57.690 --> 00:16:00.990
will click on a button like this
to signal to our human students

00:16:00.990 --> 00:16:05.130
that yes, like the duck is spot on here,
or we have an opportunity, as always,

00:16:05.130 --> 00:16:07.020
to chime in with our own responses.

00:16:07.020 --> 00:16:09.770
Frankly, that disclaimer, that
button, will soon I do think

00:16:09.770 --> 00:16:11.770
go away, as the software
gets better and better.

00:16:11.770 --> 00:16:14.367
But for now, that's how
we're modulating exactly

00:16:14.367 --> 00:16:16.200
what students' expectations
might be when it

00:16:16.200 --> 00:16:19.395
comes to correctness or incorrectness.

00:16:19.395 --> 00:16:22.020
It's common too in programming,
to see a lot of error messages,

00:16:22.020 --> 00:16:24.210
certainly when you're
learning first-hand.

00:16:24.210 --> 00:16:26.820
A lot of these error messages
are arcane, confusing,

00:16:26.820 --> 00:16:29.310
certainly to students, versus
the people who wrote them.

00:16:29.310 --> 00:16:31.170
Soon students will see a box like this.

00:16:31.170 --> 00:16:34.050
Whenever one of their
terminal window programs errs,

00:16:34.050 --> 00:16:39.120
they'll be assisted too with
English-like, TF-like support when

00:16:39.120 --> 00:16:42.212
it comes to explaining what it is
that went wrong with that command.

00:16:42.212 --> 00:16:43.920
And ultimately, what
this is really doing

00:16:43.920 --> 00:16:45.900
for students in our
own experience already,

00:16:45.900 --> 00:16:49.830
is providing them really with
virtual office hours and 24/7,

00:16:49.830 --> 00:16:52.560
which is actually quite compelling
in a university environment,

00:16:52.560 --> 00:16:55.110
where students' schedules
are already tightly packed,

00:16:55.110 --> 00:16:58.270
be it with academics, their
curriculars, athletics, and the like--

00:16:58.270 --> 00:17:00.180
--and they might have
enough time to dive

00:17:00.180 --> 00:17:03.510
into a homework assignment, maybe eight
hours even, for something sizable.

00:17:03.510 --> 00:17:06.390
But if they hit that wall
a couple of hours in, yeah,

00:17:06.390 --> 00:17:10.020
they can go to office hours or they can
ask a question asynchronously online,

00:17:10.020 --> 00:17:13.020
but it's really not optimal
in the moment support

00:17:13.020 --> 00:17:15.150
that we can now provide
all the more effectively

00:17:15.150 --> 00:17:17.170
we hope, through software, as well.

00:17:17.170 --> 00:17:18.089
So if you're curious.

00:17:18.089 --> 00:17:20.797
Even if you're not a technophile
yourself, anyone on the internet

00:17:20.797 --> 00:17:24.000
can go to cs50.ai and experiment
with this user interface.

00:17:24.000 --> 00:17:29.940
This one here actually resembles ChatGPT
itself, but it's specific to CS50.

00:17:29.940 --> 00:17:31.980
And here again is just a
sequence of screenshots

00:17:31.980 --> 00:17:33.930
that I'll stipulate
for today's purposes,

00:17:33.930 --> 00:17:37.920
are pretty darn good in akin to what I
myself or a teaching fellow would reply

00:17:37.920 --> 00:17:41.100
to and answer to a student's
question, in this case,

00:17:41.100 --> 00:17:42.930
about their particular code.

00:17:42.930 --> 00:17:45.240
And ultimately, it's
really aspirational.

00:17:45.240 --> 00:17:49.320
The goal here ultimately is to really
approximate a one-to-one teacher

00:17:49.320 --> 00:17:52.950
to student ratio, which despite all
of the resources we within CS50,

00:17:52.950 --> 00:17:56.070
we within Harvard and
places like Yale have,

00:17:56.070 --> 00:17:58.650
we certainly have never
had enough resources

00:17:58.650 --> 00:18:00.690
to approximate what
might really be ideal,

00:18:00.690 --> 00:18:04.050
which is more of an apprenticeship
model, a mentorship, whereby it's just

00:18:04.050 --> 00:18:06.145
you and that teacher working one-to-one.

00:18:06.145 --> 00:18:09.270
Now we still have humans, and the goal
is not to reduce that human support,

00:18:09.270 --> 00:18:14.220
but to focus it all the more consciously
on the students who would benefit most

00:18:14.220 --> 00:18:17.100
from some impersonal one-to-one
support versus students

00:18:17.100 --> 00:18:21.433
who would happily take it at any hour
of the day more digitally via online.

00:18:21.433 --> 00:18:23.850
And in fact, we're still in
the process of evaluating just

00:18:23.850 --> 00:18:25.560
how well or not well all of this works.

00:18:25.560 --> 00:18:28.800
But based on our summer experiment
alone with about 70 students

00:18:28.800 --> 00:18:31.770
a few months back, one student
wrote us at term's end it--

00:18:31.770 --> 00:18:33.660
--"felt like having a personal tutor.

00:18:33.660 --> 00:18:37.830
I love how AI bots will answer questions
without ego and without judgment.

00:18:37.830 --> 00:18:40.260
Generally entertaining even
the stupidest of questions

00:18:40.260 --> 00:18:42.690
without treating them
like they're stupid.

00:18:42.690 --> 00:18:47.550
It has an, as one could expect,"
ironically, "an inhuman level

00:18:47.550 --> 00:18:48.450
of patience."

00:18:48.450 --> 00:18:51.870
And so I thought that's telling
as to how even one student is

00:18:51.870 --> 00:18:54.490
perceiving these new possibilities.

00:18:54.490 --> 00:18:56.610
So let's consider now
more academically what

00:18:56.610 --> 00:18:58.920
it is that's enabling those
kinds of tools, not just

00:18:58.920 --> 00:19:02.370
within CS50, within computer science,
but really, the world more generally.

00:19:02.370 --> 00:19:04.078
What the whole world's
been talking about

00:19:04.078 --> 00:19:06.270
is generative artificial intelligence.

00:19:06.270 --> 00:19:09.630
AI that can generate images,
generate text, and sort of

00:19:09.630 --> 00:19:12.820
mimic the behavior of
what we think of as human.

00:19:12.820 --> 00:19:14.240
So what does that really mean?

00:19:14.240 --> 00:19:15.990
Well, let's start
really at the beginning.

00:19:15.990 --> 00:19:19.170
Artificial intelligence is
actually a technique, a technology,

00:19:19.170 --> 00:19:21.510
a subject that's actually
been with us for some time,

00:19:21.510 --> 00:19:26.460
but it really was the introduction of
this very user-friendly interface known

00:19:26.460 --> 00:19:28.230
as ChatGPT.

00:19:28.230 --> 00:19:31.440
And some of the more recent academic
work over really just the past five

00:19:31.440 --> 00:19:35.010
or six years, that really allowed
us to take a massive leap forward

00:19:35.010 --> 00:19:38.520
it would seem technologically, as
to what these things can now do.

00:19:38.520 --> 00:19:40.330
So what is artificial intelligence?

00:19:40.330 --> 00:19:43.410
It's been with us for some time,
and it's honestly, so omnipresent,

00:19:43.410 --> 00:19:45.690
that we take it for granted nowadays.

00:19:45.690 --> 00:19:48.330
Gmail, Outlook, have gotten
really good at spam detection.

00:19:48.330 --> 00:19:50.020
If you haven't checked your
spam folder in a while,

00:19:50.020 --> 00:19:52.000
that's testament to
just how good they seem

00:19:52.000 --> 00:19:54.758
to be at getting it out of your inbox.

00:19:54.758 --> 00:19:57.050
Handwriting recognition has
been with us for some time.

00:19:57.050 --> 00:19:59.380
I dare say, it, too, is only
getting better and better

00:19:59.380 --> 00:20:02.920
the more the software is able to
adapt to different handwriting

00:20:02.920 --> 00:20:04.270
styles, such as this.

00:20:04.270 --> 00:20:06.940
Recommendation histories
and the like, whether you're

00:20:06.940 --> 00:20:09.190
using Netflix or any
other service, have gotten

00:20:09.190 --> 00:20:12.580
better and better at recommending
things you might like based on things

00:20:12.580 --> 00:20:14.920
you have liked, and
maybe based on things

00:20:14.920 --> 00:20:18.190
other people who like the same
thing as you might have liked.

00:20:18.190 --> 00:20:20.560
And suffice it to say,
there's no one at Netflix

00:20:20.560 --> 00:20:22.780
akin to the old VHS
stores of yesteryear,

00:20:22.780 --> 00:20:26.590
who are recommending to you
specifically what movie you might like.

00:20:26.590 --> 00:20:31.330
And there's no code, no algorithm that
says, if they like x, then recommend y,

00:20:31.330 --> 00:20:34.762
else recommend z, because there's just
too many movies, too many people, too

00:20:34.762 --> 00:20:36.220
many different tastes in the world.

00:20:36.220 --> 00:20:40.000
So AI is increasingly sort of looking
for patterns that might not even

00:20:40.000 --> 00:20:42.700
be obvious to us humans,
and dynamically figuring out

00:20:42.700 --> 00:20:46.750
what might be good for me, for
you or you, or anyone else.

00:20:46.750 --> 00:20:50.402
Siri, Google Assistant, Alexa, any
of these voice recognition tools

00:20:50.402 --> 00:20:51.610
that are answering questions.

00:20:51.610 --> 00:20:54.918
That, too, suffice it to
say, is all powered by AI.

00:20:54.918 --> 00:20:58.210
But let's start with something a little
simpler than any of those applications.

00:20:58.210 --> 00:21:01.522
And this is one of the first arcade
games from yesteryear known as Pong.

00:21:01.522 --> 00:21:02.980
And it's sort of like table tennis.

00:21:02.980 --> 00:21:05.440
And the person on the left can
move their paddle up and down.

00:21:05.440 --> 00:21:07.000
Person on the right can do the same.

00:21:07.000 --> 00:21:09.970
And the goal is to get the
ball past the other person,

00:21:09.970 --> 00:21:13.960
or conversely, make sure it hits
your paddle and bounces back.

00:21:13.960 --> 00:21:17.440
Well, somewhat simpler than this
insofar as it can be one player,

00:21:17.440 --> 00:21:19.275
is another Atari game
from yesteryear known

00:21:19.275 --> 00:21:21.400
as Breakout, whereby you're
essentially just trying

00:21:21.400 --> 00:21:24.460
to bang the ball against the
bricks to get more and more points

00:21:24.460 --> 00:21:26.320
and get rid of all of those bricks.

00:21:26.320 --> 00:21:28.960
But all of us in this room
probably have a human instinct

00:21:28.960 --> 00:21:32.800
for how to win this game, or
at least how to play this game.

00:21:32.800 --> 00:21:36.430
For instance, if the ball
pictured here back in the '80s

00:21:36.430 --> 00:21:41.530
as a single red dot just left the
paddle, pictured here as a red line,

00:21:41.530 --> 00:21:43.990
where is the ball
presumably going to go next?

00:21:43.990 --> 00:21:47.410
And in turn, which direction
should I slide my paddle?

00:21:47.410 --> 00:21:49.900
To the left or to the right?

00:21:49.900 --> 00:21:51.630
So presumably, to the left.

00:21:51.630 --> 00:21:54.690
And we all have an eye for what seemed
to be the digital physics of that.

00:21:54.690 --> 00:21:57.540
And indeed, that would then
be an algorithm, sort of step

00:21:57.540 --> 00:21:59.890
by step instructions for
solving some problem.

00:21:59.890 --> 00:22:03.120
So how can we now translate that human
intuition to what we describe more

00:22:03.120 --> 00:22:04.780
as artificial intelligence?

00:22:04.780 --> 00:22:07.290
Not nearly as sophisticated
as those other applications,

00:22:07.290 --> 00:22:09.000
but we'll indeed,
start with some basics.

00:22:09.000 --> 00:22:12.960
You might know from economics or
strategic thinking or computer science,

00:22:12.960 --> 00:22:15.640
this idea of a decision tree
that allows you to decide,

00:22:15.640 --> 00:22:19.060
should I go this way or this way
when it comes to making a decision.

00:22:19.060 --> 00:22:22.440
So let's consider how we could draw
a picture to represent even something

00:22:22.440 --> 00:22:24.180
simplistic like Breakout.

00:22:24.180 --> 00:22:28.290
Well, if the ball is left of the paddle,
is a question or a Boolean expression

00:22:28.290 --> 00:22:29.940
I might ask myself in code.

00:22:29.940 --> 00:22:34.500
If yes, then I should move my paddle
left, as most everyone just said.

00:22:34.500 --> 00:22:37.960
Else, if the ball is not left
of paddle, what do I want to do?

00:22:37.960 --> 00:22:39.537
Well, I want to ask a question.

00:22:39.537 --> 00:22:41.370
I don't want to just
instinctively go right.

00:22:41.370 --> 00:22:44.010
I want to check, is the ball
to the right of the paddle,

00:22:44.010 --> 00:22:47.730
and if yes, well, then yes, go
ahead and move the paddle right.

00:22:47.730 --> 00:22:50.180
But there is a third
situation, which is--

00:22:50.180 --> 00:22:51.163
AUDIENCE: [INAUDIBLE]

00:22:51.163 --> 00:22:52.080
DAVID J. MALAN: Right.

00:22:52.080 --> 00:22:53.920
Like, don't move, it's
coming right at you.

00:22:53.920 --> 00:22:55.260
So that would be the
third scenario here.

00:22:55.260 --> 00:22:58.140
No, it's not to the right or to the
left, so just don't move the paddle.

00:22:58.140 --> 00:23:00.660
You got lucky, and it's coming,
for instance, straight down.

00:23:00.660 --> 00:23:04.170
So Breakout is fairly straightforward
when it comes to an algorithm.

00:23:04.170 --> 00:23:07.200
And we can actually translate this
as any CS50 student now could,

00:23:07.200 --> 00:23:11.400
to code or pseudocode, sort of
English-like code that's independent

00:23:11.400 --> 00:23:15.280
of Java, C, C++ and all of the
programming languages of today.

00:23:15.280 --> 00:23:17.940
So in English pseudocode,
while a game is

00:23:17.940 --> 00:23:22.230
ongoing, if the ball is left of
paddle, I should move paddle left.

00:23:22.230 --> 00:23:26.460
Else if ball is right of the paddle,
it should say paddle, that's a bug,

00:23:26.460 --> 00:23:29.520
not intended today, move paddle right.

00:23:29.520 --> 00:23:31.710
Else, don't move the paddle.

00:23:31.710 --> 00:23:35.910
So that, too, represents a
translation of this intuition to code

00:23:35.910 --> 00:23:37.200
that's very deterministic.

00:23:37.200 --> 00:23:40.830
You can anticipate all possible
scenarios captured in code.

00:23:40.830 --> 00:23:43.890
And frankly, this should be the
most boring game of Breakout,

00:23:43.890 --> 00:23:47.250
because the paddle should just
perfectly play this game, assuming

00:23:47.250 --> 00:23:49.770
there's no variables or
randomness when it comes to speed

00:23:49.770 --> 00:23:53.590
or angles or the like, which real
world games certainly try to introduce.

00:23:53.590 --> 00:23:55.570
But let's consider another
game from yesteryear

00:23:55.570 --> 00:23:58.570
that you might play with your kids
today or you did yourself growing up.

00:23:58.570 --> 00:23:59.590
Here's tic-tac-toe.

00:23:59.590 --> 00:24:02.860
And for those unfamiliar, the
goal is to get three O's in a row

00:24:02.860 --> 00:24:07.180
or three X's in a row, vertically,
horizontally, or diagonally.

00:24:07.180 --> 00:24:09.970
So suppose it's now X's turn.

00:24:09.970 --> 00:24:12.250
If you've played
tic-tac-toe, most of you

00:24:12.250 --> 00:24:16.060
probably just have an immediate instinct
as to where X should probably go,

00:24:16.060 --> 00:24:18.970
so that it doesn't lose instantaneously.

00:24:18.970 --> 00:24:22.690
But let's consider in the more general
case, how do you solve tic-tac-toe.

00:24:22.690 --> 00:24:25.360
Frankly, if you're in the
habit of losing tic-tac-toe,

00:24:25.360 --> 00:24:27.255
but you're not trying
to lose tic-tac-toe,

00:24:27.255 --> 00:24:28.630
you're actually playing it wrong.

00:24:28.630 --> 00:24:31.920
Like, you should minimally be able
to always force a tie in tic-tac-toe.

00:24:31.920 --> 00:24:34.420
And better yet, you should be
able to beat the other person.

00:24:34.420 --> 00:24:37.550
So hopefully, everyone now will
soon walk away with this strategy.

00:24:37.550 --> 00:24:41.020
So how can we borrow inspiration
from those same decision trees

00:24:41.020 --> 00:24:43.100
and do something similar here?

00:24:43.100 --> 00:24:47.620
So if you, the player, ask yourself,
can I get three in a row on this turn?

00:24:47.620 --> 00:24:51.970
Well, if yes, then you should do
that and play the X in that position.

00:24:51.970 --> 00:24:53.980
Play in the square to
get three in a row.

00:24:53.980 --> 00:24:54.820
Straight forward.

00:24:54.820 --> 00:24:58.330
If you can't get three in a row in this
turn, you should ask another question.

00:24:58.330 --> 00:25:01.660
Can my opponent get three
in a row in their next turn?

00:25:01.660 --> 00:25:06.220
Because then you better preempt
that by moving into that position.

00:25:06.220 --> 00:25:10.810
Play in the square to block
opponent's three in a row.

00:25:10.810 --> 00:25:13.428
What if though, that's
not the case, right?

00:25:13.428 --> 00:25:15.970
What if there aren't even that
many X's and O's on the board?

00:25:15.970 --> 00:25:17.887
If you're in the habit
of just kind of playing

00:25:17.887 --> 00:25:21.940
randomly, like you might not be
playing optimally as a good AI could.

00:25:21.940 --> 00:25:24.430
So if no, it's kind of a question mark.

00:25:24.430 --> 00:25:26.685
In fact, there's probably
more to this tree,

00:25:26.685 --> 00:25:28.810
because we could think
through, what if I go there.

00:25:28.810 --> 00:25:30.977
Wait a minute, what if I
go there or there or there?

00:25:30.977 --> 00:25:34.510
You can start to think a few steps ahead
as a computer could do much better even

00:25:34.510 --> 00:25:35.540
than us humans.

00:25:35.540 --> 00:25:37.388
So suppose, for instance, it's O's turn.

00:25:37.388 --> 00:25:39.430
Now those of you who are
very good at tic-tac-toe

00:25:39.430 --> 00:25:40.870
might have an instinct for where to go.

00:25:40.870 --> 00:25:42.953
But this is an even harder
problem, it would seem.

00:25:42.953 --> 00:25:45.370
I could go in eight
possible places if I'm O.

00:25:45.370 --> 00:25:49.570
But let's try to break that down
more algorithmically, as in AI would.

00:25:49.570 --> 00:25:53.830
And let's recognize, too, that with
games in particular, one of the reasons

00:25:53.830 --> 00:25:58.330
that AI was so early adopted in
these games, playing the CPU,

00:25:58.330 --> 00:26:02.020
is that games really lend
themselves to defining them,

00:26:02.020 --> 00:26:04.120
if taking the fun out
of it mathematically.

00:26:04.120 --> 00:26:07.600
Defining them in terms of inputs
and outputs, maybe paddle moving

00:26:07.600 --> 00:26:10.040
left or right, ball moving up or down.

00:26:10.040 --> 00:26:13.090
You can really quantize it
at a very boring low level.

00:26:13.090 --> 00:26:16.060
But that lends itself then
to solving it optimally.

00:26:16.060 --> 00:26:19.630
And in fact, with most games,
the goal is to maximize or maybe

00:26:19.630 --> 00:26:21.790
minimize some math function, right?

00:26:21.790 --> 00:26:24.910
Most games, if you have scores,
the goal is to maximize your score,

00:26:24.910 --> 00:26:26.750
and indeed, get a high score.

00:26:26.750 --> 00:26:31.510
So games lend themselves to a
nice translation to mathematics,

00:26:31.510 --> 00:26:33.410
and in turn here, AI solutions.

00:26:33.410 --> 00:26:37.690
So one of the first algorithms one
might learn in a class on algorithms

00:26:37.690 --> 00:26:39.490
and on artificial
intelligence is something

00:26:39.490 --> 00:26:41.860
called minimax, which alludes
to this idea of trying

00:26:41.860 --> 00:26:46.060
to minimize and/or maximize something
as your function, your goal.

00:26:46.060 --> 00:26:49.890
And it actually derives its inspiration
from these same decision trees

00:26:49.890 --> 00:26:51.140
that we've been talking about.

00:26:51.140 --> 00:26:52.390
But first, a definition.

00:26:52.390 --> 00:26:55.210
Here are three representative
tic-tac-toe boards.

00:26:55.210 --> 00:26:58.570
Here is one in which O has
clearly won, per the green.

00:26:58.570 --> 00:27:01.537
Here is one in which X has
clearly won, per the green.

00:27:01.537 --> 00:27:03.620
And this one in the middle
just represents a draw.

00:27:03.620 --> 00:27:06.662
Now, there's a bunch of other ways
that tic-tac-toe could end, but here's

00:27:06.662 --> 00:27:08.050
just three representative ones.

00:27:08.050 --> 00:27:10.223
But let's make tic-tac-toe
even more boring

00:27:10.223 --> 00:27:11.890
than it might have always struck you as.

00:27:11.890 --> 00:27:15.130
Let's propose that this
kind of configuration

00:27:15.130 --> 00:27:17.230
should have a score of negative 1.

00:27:17.230 --> 00:27:19.030
If O wins, it's a negative 1.

00:27:19.030 --> 00:27:21.340
If X wins, it's a positive 1.

00:27:21.340 --> 00:27:23.350
And if no one wins, we'll call it a 0.

00:27:23.350 --> 00:27:27.280
We need some way of talking about and
reasoning about which of these outcomes

00:27:27.280 --> 00:27:28.520
is better than the other.

00:27:28.520 --> 00:27:31.450
And what's simpler than
0, 1 and negative 1?

00:27:31.450 --> 00:27:33.760
So the goal though, of
X, it would seem, is

00:27:33.760 --> 00:27:38.530
to maximize its score, but the
goal of O is to minimize its score.

00:27:38.530 --> 00:27:42.400
So X is really trying to get positive
1, O is really trying to get negative 1.

00:27:42.400 --> 00:27:46.610
And no one really wants 0, but that's
better than losing to the other person.

00:27:46.610 --> 00:27:49.900
So we have now a way to define
what it means to win or lose.

00:27:49.900 --> 00:27:52.790
Well, now we can employ a strategy here.

00:27:52.790 --> 00:27:56.210
Here, just as a quick check, what
would the score be of this board?

00:27:56.210 --> 00:27:58.020
Just so everyone's on the same page.

00:27:58.020 --> 00:27:58.520
AUDIENCE: 1.

00:27:58.520 --> 00:28:02.000
DAVID J. MALAN: Or, so 1, because X has
one and we just stipulated arbitrarily,

00:28:02.000 --> 00:28:04.190
this means that this
board has a value of 1.

00:28:04.190 --> 00:28:06.740
Now let's put it into a
more interesting context.

00:28:06.740 --> 00:28:09.320
Here, a game has been played
for a few moves already.

00:28:09.320 --> 00:28:10.890
There's two spots left.

00:28:10.890 --> 00:28:12.590
No one has won just yet.

00:28:12.590 --> 00:28:14.982
And suppose that it's O's turn now.

00:28:14.982 --> 00:28:17.690
Now, everyone probably has an
instinct already as to where to go,

00:28:17.690 --> 00:28:20.510
but let's try to break this
down more algorithmically.

00:28:20.510 --> 00:28:22.430
So what is the value of this board?

00:28:22.430 --> 00:28:25.430
Well, we don't know yet,
because no one has won,

00:28:25.430 --> 00:28:28.440
so let's consider what
could happen next.

00:28:28.440 --> 00:28:31.310
So we can draw this actually
as a tree, as before.

00:28:31.310 --> 00:28:33.470
Here, for instance, is
what might happen if O

00:28:33.470 --> 00:28:35.270
goes into the top left-hand corner.

00:28:35.270 --> 00:28:39.830
And here's what might happen if O goes
into the bottom middle spot instead.

00:28:39.830 --> 00:28:42.530
We should ask ourselves, what's
the value of this board, what's

00:28:42.530 --> 00:28:43.530
the value of this board?

00:28:43.530 --> 00:28:46.340
Because if O's purpose in
life is to minimize its score,

00:28:46.340 --> 00:28:49.850
it's going to go left or right based on
whichever yields the smallest number.

00:28:49.850 --> 00:28:51.390
Negative 1, ideally.

00:28:51.390 --> 00:28:55.230
But we're still not sure yet, because
we don't have definitions for boards

00:28:55.230 --> 00:28:56.770
with holes in them like this.

00:28:56.770 --> 00:28:58.380
So what could happen next here?

00:28:58.380 --> 00:29:00.480
Well, it's obviously
going to be X's turn next.

00:29:00.480 --> 00:29:05.080
So if X moves, unfortunately, X
has one in this configuration.

00:29:05.080 --> 00:29:08.980
We can now conclude that the value
of this board is what number?

00:29:08.980 --> 00:29:09.480
AUDIENCE: 1.

00:29:09.480 --> 00:29:10.620
DAVID J. MALAN: So 1.

00:29:10.620 --> 00:29:14.970
And because there's only one way to
reach this board, by transitivity,

00:29:14.970 --> 00:29:19.080
you might as well think of the value
of this previous board as also 1,

00:29:19.080 --> 00:29:21.760
because no matter what, it's going
to lead to that same outcome.

00:29:21.760 --> 00:29:25.890
And so the value of this board is
actually still to be determined,

00:29:25.890 --> 00:29:28.440
because we don't know if O is
going to want to go with the 1,

00:29:28.440 --> 00:29:30.600
and probably not, because
that means X wins.

00:29:30.600 --> 00:29:32.520
But let's see what the
value of this board is.

00:29:32.520 --> 00:29:36.370
Well, suppose that indeed, X goes
in that top left corner here.

00:29:36.370 --> 00:29:39.540
What's the value of this board here?

00:29:39.540 --> 00:29:41.140
0, because no one has won.

00:29:41.140 --> 00:29:43.390
There's no X's or O's three in a row.

00:29:43.390 --> 00:29:45.000
So the value of this board is 0.

00:29:45.000 --> 00:29:47.140
There's only one way
logically to get there,

00:29:47.140 --> 00:29:50.190
so we might as well think of the
value of this board as also 0.

00:29:50.190 --> 00:29:53.100
And so now, what's the
value of this board?

00:29:53.100 --> 00:29:56.370
Well, if we started the story
by thinking about O's turn,

00:29:56.370 --> 00:30:01.860
O's purpose is the min in minimax,
then which move is O going to make?

00:30:01.860 --> 00:30:05.030
Go to the left or go to the right?

00:30:05.030 --> 00:30:06.800
O's was probably going
to go to the right

00:30:06.800 --> 00:30:10.880
and make the move that leads to,
whoops, that leads to this board,

00:30:10.880 --> 00:30:15.200
because even though O can't win in this
configuration, at least X didn't win.

00:30:15.200 --> 00:30:19.190
So it's minimized its score relatively,
even though it's not a clean win.

00:30:19.190 --> 00:30:21.500
Now, this is all fine and
good for a configuration

00:30:21.500 --> 00:30:23.243
of the board that's like almost done.

00:30:23.243 --> 00:30:24.410
There's only two moves left.

00:30:24.410 --> 00:30:25.770
The game's about to end.

00:30:25.770 --> 00:30:27.830
But if you kind of expand
in your mind's eye,

00:30:27.830 --> 00:30:30.810
how did we get to this
branch of the decision tree,

00:30:30.810 --> 00:30:34.010
if we rewind one step where
there's three possible moves,

00:30:34.010 --> 00:30:36.260
frankly, the decision
tree is a lot bigger.

00:30:36.260 --> 00:30:39.350
If we rewind further in your
mind's eye and have four moves

00:30:39.350 --> 00:30:41.760
left or five moves or
all nine moves left,

00:30:41.760 --> 00:30:43.550
imagine just zooming out, out, and out.

00:30:43.550 --> 00:30:46.940
This is becoming a massive,
massive tree of decisions.

00:30:46.940 --> 00:30:51.110
Now, even so, here is that same
subtree, the same decision tree

00:30:51.110 --> 00:30:51.860
we just looked at.

00:30:51.860 --> 00:30:54.050
This is the exact same thing,
but I shrunk the font so

00:30:54.050 --> 00:30:55.760
that it appears here on the screen here.

00:30:55.760 --> 00:30:59.660
But over here, we have what
could happen if instead,

00:30:59.660 --> 00:31:03.680
it's actually X's turn,
because we're one move prior.

00:31:03.680 --> 00:31:06.420
There's a bunch of different
moves X could now make, too.

00:31:06.420 --> 00:31:08.350
So what is the implication of this?

00:31:08.350 --> 00:31:12.930
Well, most humans are not thinking
through tic-tac-toe to this extreme.

00:31:12.930 --> 00:31:15.780
And frankly, most of us probably
just don't have the mental capacity

00:31:15.780 --> 00:31:18.360
to think about going left and then
right and then left and then right.

00:31:18.360 --> 00:31:18.860
Right?

00:31:18.860 --> 00:31:20.610
This is not how people play tic-tac-toe.

00:31:20.610 --> 00:31:23.190
Like, we're not using that
much memory, so to speak.

00:31:23.190 --> 00:31:26.010
But a computer can handle
that, and computers

00:31:26.010 --> 00:31:27.850
can play tic-tac-toe optimally.

00:31:27.850 --> 00:31:30.360
So if you're beating a
computer at tic-tac-toe, like,

00:31:30.360 --> 00:31:31.770
it's not implemented very well.

00:31:31.770 --> 00:31:36.420
It's not following this very logical,
deterministic minimax algorithm.

00:31:36.420 --> 00:31:40.470
But this is where now AI is
no longer as simple as just

00:31:40.470 --> 00:31:42.570
doing what these decision trees say.

00:31:42.570 --> 00:31:45.780
In the context of tic-tac-toe,
here's how we might translate this

00:31:45.780 --> 00:31:46.870
to code, for instance.

00:31:46.870 --> 00:31:49.830
If player is X, for each
possible move, calculate

00:31:49.830 --> 00:31:52.200
a score for the board, as
we were doing verbally,

00:31:52.200 --> 00:31:54.600
and then choose the move
with the highest score.

00:31:54.600 --> 00:31:57.420
Because X's goal is
to maximize its score.

00:31:57.420 --> 00:32:00.090
If the player is O, though,
for each possible move,

00:32:00.090 --> 00:32:02.010
calculate a score for
the board, and then

00:32:02.010 --> 00:32:04.210
choose the move with the lowest score.

00:32:04.210 --> 00:32:06.600
So that's a distillation
of that verbal walkthrough

00:32:06.600 --> 00:32:10.290
into what CS50 students know now
as code, or at least pseudocode.

00:32:10.290 --> 00:32:15.120
But the problem with games, not so
much tic-tac-toe, but other more

00:32:15.120 --> 00:32:16.650
sophisticated games is this.

00:32:16.650 --> 00:32:19.890
Does anyone want to ballpark
how many possible ways there

00:32:19.890 --> 00:32:22.940
are to play tic-tac-toe?

00:32:22.940 --> 00:32:26.180
Paper, pencil, two human
children, how many different ways?

00:32:26.180 --> 00:32:30.893
How long could you keep them occupied
playing tic-tac-toe in different ways?

00:32:30.893 --> 00:32:33.310
If you actually think through,
how big does this tree get,

00:32:33.310 --> 00:32:36.160
how many leaves are there on
this decision tree, like how many

00:32:36.160 --> 00:32:42.520
different directions, well, if you're
thinking 255,168, you are correct.

00:32:42.520 --> 00:32:44.980
And now most of us in our
lifetime have probably not

00:32:44.980 --> 00:32:47.180
played tic-tac-toe that many times.

00:32:47.180 --> 00:32:49.660
So think about how many games
you've been missing out on.

00:32:49.660 --> 00:32:53.230
There are different decisions you
could have been making all these years.

00:32:53.230 --> 00:32:57.380
Now, that's a big number, but honestly,
that's not a big number for a computer.

00:32:57.380 --> 00:33:01.420
That's a few megabytes of memory
maybe, to keep all of that in mind

00:33:01.420 --> 00:33:06.160
and implement that kind of code in
C or Java or C++ or something else.

00:33:06.160 --> 00:33:08.990
But other games are
much more complicated.

00:33:08.990 --> 00:33:11.860
And the games that you and I
might play as we get older,

00:33:11.860 --> 00:33:13.330
they include maybe chess.

00:33:13.330 --> 00:33:17.560
And if you think about chess with only
the first four moves, back and forth

00:33:17.560 --> 00:33:19.750
four times, so only four moves.

00:33:19.750 --> 00:33:21.430
That's not even a very long game.

00:33:21.430 --> 00:33:23.830
Anyone want a ballpark
how many different ways

00:33:23.830 --> 00:33:28.390
there are to begin a game of chess
with four moves back and forth?

00:33:31.490 --> 00:33:34.300
This is evidence as to why
chess is apparently so hard.

00:33:34.300 --> 00:33:40.030
288 million ways, which is why
when you are really good at chess,

00:33:40.030 --> 00:33:41.680
you are really good at chess.

00:33:41.680 --> 00:33:44.350
Because apparently, you
either have an intuition for

00:33:44.350 --> 00:33:47.950
or a mind for thinking it would
seem so many more steps ahead

00:33:47.950 --> 00:33:48.860
than your opponent.

00:33:48.860 --> 00:33:50.777
And don't get us started
on something like Go.

00:33:50.777 --> 00:33:55.570
266 quintillion ways to
play Go's first four moves.

00:33:55.570 --> 00:33:59.110
So at this point, we just
can't pull out our Mac, our PC,

00:33:59.110 --> 00:34:03.190
certainly not our phone, to solve
optimally games like chess and Go,

00:34:03.190 --> 00:34:05.323
because we don't have big enough CPUs.

00:34:05.323 --> 00:34:06.490
We don't have enough memory.

00:34:06.490 --> 00:34:09.610
We don't have enough years in
our lifetimes for the computers

00:34:09.610 --> 00:34:11.110
to crunch all of those numbers.

00:34:11.110 --> 00:34:14.230
And thus was born a
different form of AI that's

00:34:14.230 --> 00:34:18.520
more inspired by finding
patterns more dynamically,

00:34:18.520 --> 00:34:22.239
learning from data, as opposed
to being told by humans, here

00:34:22.239 --> 00:34:25.070
is the code via which
to solve this problem.

00:34:25.070 --> 00:34:28.330
So machine learning is a subset
of artificial intelligence

00:34:28.330 --> 00:34:30.980
that tries instead to
get machines to learn

00:34:30.980 --> 00:34:35.900
what they should do without being so
coached step by step by step by humans

00:34:35.900 --> 00:34:36.409
here.

00:34:36.409 --> 00:34:39.500
Reinforcement learning, for instance,
is one such example thereof,

00:34:39.500 --> 00:34:41.690
wherein reinforcement
learning, you sort of wait

00:34:41.690 --> 00:34:44.480
for the computer or maybe
a robot to maybe just get

00:34:44.480 --> 00:34:46.380
better and better and better at things.

00:34:46.380 --> 00:34:48.710
And as it does, you reward
it with a reward function.

00:34:48.710 --> 00:34:50.960
Give it plus 1 every time
it does something well.

00:34:50.960 --> 00:34:51.830
And maybe minus 1.

00:34:51.830 --> 00:34:54.080
You punish it any time
it does something poorly.

00:34:54.080 --> 00:35:00.110
And if you simply program this AI
or this robot to maximize its score,

00:35:00.110 --> 00:35:02.390
never mind minimizing,
maximize its score,

00:35:02.390 --> 00:35:05.570
ideally, it should repeat
behaviors that got it plus 1.

00:35:05.570 --> 00:35:07.820
It should decrease the
frequency with which it does

00:35:07.820 --> 00:35:09.710
bad behaviors that got it negative 1.

00:35:09.710 --> 00:35:12.080
And you can reinforce
this kind of learning.

00:35:12.080 --> 00:35:15.230
In fact, I have here one demonstration.

00:35:15.230 --> 00:35:18.380
Could a student come on
up who does not think

00:35:18.380 --> 00:35:20.960
they are particularly coordinated?

00:35:20.960 --> 00:35:24.020
If-- OK, wow, you're being
nominated by your friends.

00:35:24.020 --> 00:35:24.950
Come on up.

00:35:24.950 --> 00:35:26.283
Come on up.

00:35:26.283 --> 00:35:28.598
[LAUGHTER]

00:35:29.530 --> 00:35:31.720
Their hands went up instantly for you.

00:35:34.260 --> 00:35:36.290
OK, what is your name?

00:35:36.290 --> 00:35:37.420
AMAKA: My name's Amaka.

00:35:37.420 --> 00:35:39.130
DAVID J. MALAN: Amaka, do you want
to introduce yourself to the world?

00:35:39.130 --> 00:35:40.330
AMAKA: Hi, my name is Amaka.

00:35:40.330 --> 00:35:42.250
I am a first year in Holworthy.

00:35:42.250 --> 00:35:43.667
I'm planning to concentrate in CS.

00:35:43.667 --> 00:35:44.750
DAVID J. MALAN: Wonderful.

00:35:44.750 --> 00:35:45.550
Nice to see you.

00:35:45.550 --> 00:35:46.690
Come on over here.

00:35:46.690 --> 00:35:49.540
[APPLAUSE]

00:35:49.540 --> 00:35:52.900
So, yes, oh, no, it's sort
of like a game show here.

00:35:52.900 --> 00:35:57.520
We have a pan here with what appears
to be something pancake-like.

00:35:57.520 --> 00:36:00.970
And we'd like to teach
you how to flip a pancake,

00:36:00.970 --> 00:36:04.250
so that when you gesture upward,
the pancake should flip around

00:36:04.250 --> 00:36:05.900
as though you cooked the other side.

00:36:05.900 --> 00:36:09.400
So we're going to reward you
verbally with plus 1 or minus 1.

00:36:11.980 --> 00:36:13.450
Minus 1.

00:36:13.450 --> 00:36:15.470
Minus 1.

00:36:15.470 --> 00:36:17.050
OK, plus 1!

00:36:17.050 --> 00:36:19.690
Plus 1, so do more of that.

00:36:19.690 --> 00:36:20.920
Minus 1.

00:36:20.920 --> 00:36:22.840
Minus 1.

00:36:22.840 --> 00:36:23.890
Minus 1.

00:36:23.890 --> 00:36:25.150
Do less of that.

00:36:25.150 --> 00:36:27.370
[LAUGHTER]

00:36:27.370 --> 00:36:28.517
AUDIENCE: Great, great.

00:36:28.517 --> 00:36:29.600
DAVID J. MALAN: All right!

00:36:29.600 --> 00:36:30.655
A big round of applause.

00:36:30.655 --> 00:36:32.890
[APPLAUSE]

00:36:32.890 --> 00:36:33.670
Thank you.

00:36:33.670 --> 00:36:37.340
We've been in the habit of handing out
Super Mario Brothers Oreos this year,

00:36:37.340 --> 00:36:39.220
so thank you for participating.

00:36:39.220 --> 00:36:41.600
[APPLAUSE]

00:36:43.030 --> 00:36:46.590
So, this is actually a good
example of an opportunity

00:36:46.590 --> 00:36:47.940
for reinforcement learning.

00:36:47.940 --> 00:36:51.310
And wonderfully, a researcher has posted
a video that we thought we'd share.

00:36:51.310 --> 00:36:53.060
It's about a minute
and a half long, where

00:36:53.060 --> 00:36:57.570
you can watch a robot now do exactly
what our wonderful human volunteer here

00:36:57.570 --> 00:36:59.050
just attempted as well.

00:36:59.050 --> 00:37:01.560
So let me go ahead and
play this on the screen

00:37:01.560 --> 00:37:05.380
and give you a sense of what the human
and the robot are doing together.

00:37:05.380 --> 00:37:08.790
So their pancake looks
a little similar there.

00:37:08.790 --> 00:37:12.360
The human here is going to first
sort of train the robot what

00:37:12.360 --> 00:37:14.190
to do by showing it some gestures.

00:37:14.190 --> 00:37:16.360
But there's no one right way to do this.

00:37:16.360 --> 00:37:19.660
But the human seems to know how
to do it pretty well in this case,

00:37:19.660 --> 00:37:23.040
and so it's trying to
give the machine examples

00:37:23.040 --> 00:37:24.990
of how to flip a pancake successfully.

00:37:24.990 --> 00:37:27.810
But now, this is the very first trial.

00:37:27.810 --> 00:37:28.560
OK, look familiar?

00:37:28.560 --> 00:37:30.300
You're in good company.

00:37:30.300 --> 00:37:32.652
After three trials.

00:37:32.652 --> 00:37:33.456
[CLANG]

00:37:33.456 --> 00:37:34.260
[PLOP]

00:37:34.260 --> 00:37:36.020
OK.

00:37:36.020 --> 00:37:36.520
[CLANG]

00:37:36.520 --> 00:37:37.410
[PLOP]

00:37:37.410 --> 00:37:39.060
OK.

00:37:39.060 --> 00:37:42.690
Now 10 tries.

00:37:42.690 --> 00:37:46.020
There's the human
picking up the pancake.

00:37:46.020 --> 00:37:48.780
After 11 trials--

00:37:48.780 --> 00:37:49.680
[CLANG]

00:37:49.680 --> 00:37:51.930
[PLOP]

00:37:51.930 --> 00:37:54.270
And meanwhile, there's
presumably a human coding this,

00:37:54.270 --> 00:38:00.090
in the sense that someone is saying
good job or bad job, plus 1 or minus 1.

00:38:00.090 --> 00:38:03.870
20 trials.

00:38:03.870 --> 00:38:07.440
Here now we'll see how the computer
knows what it's even doing.

00:38:07.440 --> 00:38:10.720
There's just a mapping to some
kind of XYZ coordinate system.

00:38:10.720 --> 00:38:13.260
So the robot can quantize
what it is it's doing.

00:38:13.260 --> 00:38:14.100
Nice!

00:38:14.100 --> 00:38:16.447
To do more of one
thing, less of another.

00:38:16.447 --> 00:38:18.780
And you're just seeing a
visualization in the background

00:38:18.780 --> 00:38:21.720
of those digitized movements.

00:38:21.720 --> 00:38:28.020
And so now, after 50 some odd trials,
the robot, too, has got it spot on.

00:38:28.020 --> 00:38:30.420
And it should be able to
repeat this again and again

00:38:30.420 --> 00:38:33.000
and again, in order to
keep flipping this pancake.

00:38:33.000 --> 00:38:36.360
So our human volunteer wonderfully
took you even fewer trials.

00:38:36.360 --> 00:38:38.340
But this is an example
then, to be clear,

00:38:38.340 --> 00:38:40.800
of what we'd call
reinforcement learning,

00:38:40.800 --> 00:38:44.725
whereby you're reinforcing a behavior
you want or negatively reinforcing.

00:38:44.725 --> 00:38:46.600
That is, punishing a
behavior that you don't.

00:38:46.600 --> 00:38:48.350
Here's another example
that brings us back

00:38:48.350 --> 00:38:51.850
into the realm of games a little
bit, but in a very abstract way.

00:38:51.850 --> 00:38:53.918
If we were playing a game
like The Floor Is Lava,

00:38:53.918 --> 00:38:56.710
where you're only supposed to step
certain places so that you don't

00:38:56.710 --> 00:38:59.585
fall straight in the lava pit or
something like that and lose a point

00:38:59.585 --> 00:39:02.920
or lose a life, each of these
squares might represent a position.

00:39:02.920 --> 00:39:06.470
This yellow dot might represent the
human player that can go up, down,

00:39:06.470 --> 00:39:08.240
left or right within this world.

00:39:08.240 --> 00:39:11.170
I'm revealing to the whole
audience where the lava pits are.

00:39:11.170 --> 00:39:13.930
But the goal for this yellow
dot is to get to green.

00:39:13.930 --> 00:39:17.530
But the yellow dot, as in any good
game, does not have this bird's eye view

00:39:17.530 --> 00:39:19.930
and knows from the get-go
exactly where to go.

00:39:19.930 --> 00:39:22.040
It's going to have to
try some trial and error.

00:39:22.040 --> 00:39:25.300
But if we, the programmers,
maybe reinforce good behavior

00:39:25.300 --> 00:39:28.810
or punish bad behavior, we
can teach this yellow dot,

00:39:28.810 --> 00:39:31.550
without giving it step
by step, up, down,

00:39:31.550 --> 00:39:34.600
left, right instructions,
what behaviors to repeat

00:39:34.600 --> 00:39:36.460
and what behaviors not to repeat.

00:39:36.460 --> 00:39:38.665
So, for instance, suppose
the robot moves right.

00:39:38.665 --> 00:39:39.520
Ah, that was bad.

00:39:39.520 --> 00:39:42.610
You fell in the lava already, so
we'll use a bit of computer memory

00:39:42.610 --> 00:39:45.100
to draw a thicker red line there.

00:39:45.100 --> 00:39:46.220
Don't do that again.

00:39:46.220 --> 00:39:47.830
So, negative 1, so to speak.

00:39:47.830 --> 00:39:49.780
Maybe the yellow dot moves up next time.

00:39:49.780 --> 00:39:53.290
We can reward that behavior
by not drawing any walls

00:39:53.290 --> 00:39:54.580
and allowing it to go again.

00:39:54.580 --> 00:39:57.970
It's making pretty good progress,
but, oh, darn it, it took a right turn

00:39:57.970 --> 00:39:59.230
and now fell into the lava.

00:39:59.230 --> 00:40:01.490
But let's use a bit more
of the computer's memory

00:40:01.490 --> 00:40:04.750
and keep track of the, OK,
do not do that thing anymore.

00:40:04.750 --> 00:40:07.270
Maybe the next time the
human dot goes this way.

00:40:07.270 --> 00:40:09.370
Oh, we want to punish
that behavior, so we'll

00:40:09.370 --> 00:40:11.140
remember as much with that red line.

00:40:11.140 --> 00:40:15.040
But now we're starting to make progress
until, oh, now we hit this one.

00:40:15.040 --> 00:40:18.340
And eventually, even though the
yellow dot, much like our human,

00:40:18.340 --> 00:40:22.780
much like our pancake flipping robot
had to try again and again and again,

00:40:22.780 --> 00:40:26.710
after enough trials, it's going to start
to realize what behaviors it should

00:40:26.710 --> 00:40:28.880
repeat and which ones it shouldn't.

00:40:28.880 --> 00:40:32.740
And so in this case, maybe it finally
makes its way up to the green dot.

00:40:32.740 --> 00:40:35.050
And just to recap, once
it finds that path,

00:40:35.050 --> 00:40:38.620
now it can remember it forever as
with these green thicker lines.

00:40:38.620 --> 00:40:41.470
Any time you want to leave this
map, any time you get really good

00:40:41.470 --> 00:40:44.650
at the Nintendo game, you follow
that same path again and again,

00:40:44.650 --> 00:40:46.420
so you don't fall into the lava.

00:40:46.420 --> 00:40:51.160
But an astute human observer might
realize that, yes, this is correct.

00:40:51.160 --> 00:40:53.590
It's getting out of this so-called maze.

00:40:53.590 --> 00:40:56.315
But what is suboptimal or
bad about this solution?

00:40:56.315 --> 00:40:56.815
Sure.

00:40:56.815 --> 00:40:58.513
AUDIENCE: It's taking
a really long time.

00:40:58.513 --> 00:40:59.900
It's not the most
efficient way to get there.

00:40:59.900 --> 00:41:00.500
DAVID J. MALAN: Exactly.

00:41:00.500 --> 00:41:01.792
It's taking a really long time.

00:41:01.792 --> 00:41:04.190
An inefficient way to get
there, because I dare say,

00:41:04.190 --> 00:41:07.280
if we just tried a
different path occasionally,

00:41:07.280 --> 00:41:11.480
maybe we could get lucky
and get to the exit quicker.

00:41:11.480 --> 00:41:14.930
And maybe that means we get a higher
score or we get rewarded even more.

00:41:14.930 --> 00:41:18.140
So within a lot of artificial
intelligence algorithms,

00:41:18.140 --> 00:41:21.230
there's this idea of
exploring versus exploiting,

00:41:21.230 --> 00:41:26.000
whereby you should occasionally, yes,
exploit the knowledge you already have.

00:41:26.000 --> 00:41:28.010
And in fact, frequently
exploit that knowledge.

00:41:28.010 --> 00:41:30.260
But occasionally you know
what you should probably do,

00:41:30.260 --> 00:41:31.550
is explore just a little bit.

00:41:31.550 --> 00:41:34.550
Take a left instead of a right and
see if it leads you to the solution

00:41:34.550 --> 00:41:35.390
even more quickly.

00:41:35.390 --> 00:41:37.620
And you might find a
better and better solution.

00:41:37.620 --> 00:41:40.100
So here mathematically is
how we might think of this.

00:41:40.100 --> 00:41:44.690
10% of the time we might say that
epsilon, just some variable, sort

00:41:44.690 --> 00:41:47.780
of a sprinkling of salt into
the algorithm here, epsilon

00:41:47.780 --> 00:41:49.320
will be like 10% of the time.

00:41:49.320 --> 00:41:54.512
So if my robot or my player picks a
random number that's less than 10%,

00:41:54.512 --> 00:41:55.970
that's going to make a random move.

00:41:55.970 --> 00:41:59.270
Go left instead of right, even
if you really typically go right.

00:41:59.270 --> 00:42:01.650
Otherwise, guys make the
move with the highest value,

00:42:01.650 --> 00:42:03.090
as we've learned over time.

00:42:03.090 --> 00:42:06.420
And what the robot might learn
then, is that we could actually

00:42:06.420 --> 00:42:10.290
go via this path, which gets
us to the output faster.

00:42:10.290 --> 00:42:13.313
We get a higher score, we do it
in less time, it's a win-win.

00:42:13.313 --> 00:42:15.480
Frankly, this really resonates
with me, because I've

00:42:15.480 --> 00:42:19.068
been in the habit, as maybe some of you
are, when you go to a restaurant maybe

00:42:19.068 --> 00:42:21.360
that you really like, you
find a dish you really like--

00:42:21.360 --> 00:42:24.120
--I will never again know what
other dishes that restaurant

00:42:24.120 --> 00:42:28.440
offers, because I'm locally optimally
happy with the dish I've chosen.

00:42:28.440 --> 00:42:31.800
And I will never know if there's an
even better dish at that restaurant

00:42:31.800 --> 00:42:34.320
unless again, I sort of sprinkle
a little bit of epsilon,

00:42:34.320 --> 00:42:38.730
a little bit of randomness into
my game playing, my dining out.

00:42:38.730 --> 00:42:41.640
The catch, of course, though,
is that I might be punished.

00:42:41.640 --> 00:42:45.360
I might, therefore, be less happy if
I pick something and I don't like it.

00:42:45.360 --> 00:42:48.120
So there's this tension between
exploring and exploiting.

00:42:48.120 --> 00:42:50.700
But in general in computer
science, and especially in AI,

00:42:50.700 --> 00:42:53.220
adding a little bit of
randomness, especially over time,

00:42:53.220 --> 00:42:56.320
can, in fact, yield better
and better outcomes.

00:42:56.320 --> 00:42:59.400
But now there's this notion
all the more of deep learning,

00:42:59.400 --> 00:43:02.910
whereby you're trying to
infer, to detect patterns,

00:43:02.910 --> 00:43:06.120
figure out how to solve problems,
even if the AI has never

00:43:06.120 --> 00:43:10.170
seen those problems before, and even
if there's no human there to reinforce

00:43:10.170 --> 00:43:12.720
positive or negatively behavior.

00:43:12.720 --> 00:43:15.390
Maybe it's just too complex
of a problem for a human

00:43:15.390 --> 00:43:18.415
to stand alongside the robot
and say, good or bad job.

00:43:18.415 --> 00:43:20.790
So with deep learning, they're
actually very much related

00:43:20.790 --> 00:43:24.210
to what you might know as neural
networks, inspired by human physiology,

00:43:24.210 --> 00:43:26.580
whereby inside of our brains
and elsewhere in our body,

00:43:26.580 --> 00:43:28.372
there's lots of these
neurons here that can

00:43:28.372 --> 00:43:30.480
send electrical signals
to make movements

00:43:30.480 --> 00:43:32.220
happen from brain to extremities.

00:43:32.220 --> 00:43:35.520
You might have two of
these via which signals can

00:43:35.520 --> 00:43:37.810
be transmitted over a larger distance.

00:43:37.810 --> 00:43:41.760
And so computer scientists for
some time have drawn inspiration

00:43:41.760 --> 00:43:46.560
from these neurons to create in
software, what we call neural networks.

00:43:46.560 --> 00:43:49.240
Whereby, there's inputs
to these networks

00:43:49.240 --> 00:43:52.230
and there's outputs from these
networks that represents inputs

00:43:52.230 --> 00:43:54.450
to problems and solutions thereto.

00:43:54.450 --> 00:43:56.910
So let me abstract away the
more biological diagrams

00:43:56.910 --> 00:44:00.970
with just circles that represent
nodes, or neurons, in this case.

00:44:00.970 --> 00:44:03.450
This we would call in CS50, the input.

00:44:03.450 --> 00:44:05.520
This is what we would call the output.

00:44:05.520 --> 00:44:08.680
But this is a very simplistic,
a very simple neural network.

00:44:08.680 --> 00:44:11.760
This might be more common,
whereby the network, the AI

00:44:11.760 --> 00:44:15.900
takes two inputs to a problem and
tries to give you one solution.

00:44:15.900 --> 00:44:17.760
Well, let's make this more real.

00:44:17.760 --> 00:44:20.760
For instance, suppose that at the--

00:44:20.760 --> 00:44:23.970
suppose that just for the sake of
discussion, here is like a grid

00:44:23.970 --> 00:44:27.180
that you might see in math class, with
a y-axis and an x-axis, vertically

00:44:27.180 --> 00:44:28.620
and horizontally respectively.

00:44:28.620 --> 00:44:31.980
Suppose there's a couple of
blue and red dots in that world.

00:44:31.980 --> 00:44:34.890
And suppose that our
goal, computationally,

00:44:34.890 --> 00:44:40.020
is to predict whether a dot is
going to be blue or red, based

00:44:40.020 --> 00:44:42.960
on its position within
that coordinate system.

00:44:42.960 --> 00:44:45.002
And maybe this represents
some real world notion.

00:44:45.002 --> 00:44:47.502
Maybe it's something like rain
that we're trying to predict.

00:44:47.502 --> 00:44:49.920
But we're doing it more
simply with colors right now.

00:44:49.920 --> 00:44:53.010
So here's my y-axis, here's
my x-axis, and effectively,

00:44:53.010 --> 00:44:55.740
my neural network you can
think of conceptually as this.

00:44:55.740 --> 00:44:58.393
It's some kind of
implementation of software

00:44:58.393 --> 00:45:00.060
where there's two inputs to the problem.

00:45:00.060 --> 00:45:01.990
Give me an x, give me a y value.

00:45:01.990 --> 00:45:06.540
And this neural network will output
red or blue as its prediction.

00:45:06.540 --> 00:45:08.790
Well, how does it know whether
to predict red or blue,

00:45:08.790 --> 00:45:12.030
especially if no human has
painstakingly written code

00:45:12.030 --> 00:45:15.360
to say when you see a dot
here, conclude that it's red.

00:45:15.360 --> 00:45:17.490
When you see a dot here,
conclude that it's blue.

00:45:17.490 --> 00:45:21.160
How can an AI just learn
dynamically to solve problems?

00:45:21.160 --> 00:45:23.460
Well, what might be a
reasonable heuristic here?

00:45:23.460 --> 00:45:26.757
Honestly, this is probably a first
approximation that's pretty good.

00:45:26.757 --> 00:45:29.340
If anything's to the left of
that line, let the neural network

00:45:29.340 --> 00:45:30.630
conclude that it's going to be blue.

00:45:30.630 --> 00:45:32.010
And if it's to the
right of the line, let

00:45:32.010 --> 00:45:33.593
it conclude that it's going to be red.

00:45:33.593 --> 00:45:36.690
Until such time as there's
more training data,

00:45:36.690 --> 00:45:40.203
more real world data that gets
us to rethink our assumptions.

00:45:40.203 --> 00:45:42.120
So for instance, if
there's a third dot there,

00:45:42.120 --> 00:45:44.830
uh-oh, clearly a straight
line is not sufficient.

00:45:44.830 --> 00:45:48.960
So maybe it's more of a diagonal line
that splits the blue from the red world

00:45:48.960 --> 00:45:49.600
here.

00:45:49.600 --> 00:45:51.660
Meanwhile, here's even more dots.

00:45:51.660 --> 00:45:53.580
And it's actually getting harder now.

00:45:53.580 --> 00:45:55.230
Like, this line is still pretty good.

00:45:55.230 --> 00:45:56.610
Most of the blue is up here.

00:45:56.610 --> 00:45:58.240
Most of the red is down here.

00:45:58.240 --> 00:46:02.100
And this is why, if we fast forward to
today, you know, AI is often very good,

00:46:02.100 --> 00:46:04.630
but not perfect at solving problems.

00:46:04.630 --> 00:46:07.890
But what is it we're looking at here,
and what is this neural network really

00:46:07.890 --> 00:46:09.250
trying to figure out?

00:46:09.250 --> 00:46:12.870
Well, again, at the risk of taking
some fun out of red and blue dots,

00:46:12.870 --> 00:46:16.890
you can think of this neural network
as indeed having these neurons, which

00:46:16.890 --> 00:46:19.590
represent inputs here and outputs here.

00:46:19.590 --> 00:46:22.200
And then what's happening
inside of the computer's memory,

00:46:22.200 --> 00:46:26.320
is that it's trying to figure out
what the weight of this arrow or edge

00:46:26.320 --> 00:46:26.820
should be.

00:46:26.820 --> 00:46:29.132
What the weight of this
arrow or edge should be.

00:46:29.132 --> 00:46:30.840
And maybe there's
another variable there,

00:46:30.840 --> 00:46:33.910
like plus or minus C that
just tweaks the prediction.

00:46:33.910 --> 00:46:37.540
So x and y are literally going
to be numbers in this scenario.

00:46:37.540 --> 00:46:40.890
And the output of this neural network
ideally is just true or false.

00:46:40.890 --> 00:46:42.310
Is it red or blue?

00:46:42.310 --> 00:46:45.330
So it's sort of a binary state,
as we discuss a lot in CS50.

00:46:45.330 --> 00:46:47.987
So here too, to take the fun
out of the pretty picture,

00:46:47.987 --> 00:46:50.070
it's really just like a
high school math function.

00:46:50.070 --> 00:46:53.160
What the neural network in this
example is trying to figure out,

00:46:53.160 --> 00:46:57.540
is what formula of the
form ax plus by plus c

00:46:57.540 --> 00:46:59.680
is going to be arbitrarily
greater than 0?

00:46:59.680 --> 00:47:02.150
And if so, let's conclude
that the dot is red

00:47:02.150 --> 00:47:05.140
if you get back a positive
result. If you don't, let's

00:47:05.140 --> 00:47:08.558
conclude that the dot is
going to be blue instead.

00:47:08.558 --> 00:47:10.600
So really what you're
trying to do, is figure out

00:47:10.600 --> 00:47:13.000
dynamically what numbers
do we have to tweak,

00:47:13.000 --> 00:47:15.100
these parameters inside
of the neural network

00:47:15.100 --> 00:47:18.220
that just give us the answer we
want based on all of this data?

00:47:18.220 --> 00:47:22.180
More generally though, this would be
really representative of deep learning.

00:47:22.180 --> 00:47:24.490
It's not as simple as
input, input, output.

00:47:24.490 --> 00:47:27.140
There's actually a lot of
these nodes, these neurons.

00:47:27.140 --> 00:47:28.360
There's a lot of these edges.

00:47:28.360 --> 00:47:30.812
There's a lot of numbers
and math are going on that,

00:47:30.812 --> 00:47:33.520
frankly, even the computer scientists
using these neural networks

00:47:33.520 --> 00:47:36.760
don't necessarily know what
they even mean or represent.

00:47:36.760 --> 00:47:39.910
It just happens to be that when
you crunch the numbers with all

00:47:39.910 --> 00:47:44.140
of these parameters in place,
you get the answer that you want,

00:47:44.140 --> 00:47:46.190
at least most of the time.

00:47:46.190 --> 00:47:48.280
So that's essentially the
intuition behind that.

00:47:48.280 --> 00:47:51.340
And you can apply it to very real
world, if mundane applications.

00:47:51.340 --> 00:47:55.000
Given today's humidity, given
today's pressure, yes or no,

00:47:55.000 --> 00:47:56.275
should there be rainfall?

00:47:56.275 --> 00:47:58.150
And maybe there is some
mathematical function

00:47:58.150 --> 00:48:01.120
that based on years of
training data, we can

00:48:01.120 --> 00:48:03.490
infer what that prediction should be.

00:48:03.490 --> 00:48:04.090
Another one.

00:48:04.090 --> 00:48:07.120
Given this amount of
advertising in this month,

00:48:07.120 --> 00:48:09.480
what should our sales be for that year?

00:48:09.480 --> 00:48:11.230
Should they be up, or
should they be down?

00:48:11.230 --> 00:48:13.130
Sorry, for that particular month.

00:48:13.130 --> 00:48:16.090
So real world problems map readily
when you can break them down

00:48:16.090 --> 00:48:20.320
into inputs and a binary output
often, or some kind of output

00:48:20.320 --> 00:48:24.250
where you want the thing to
figure out based on past data what

00:48:24.250 --> 00:48:26.650
its prediction should be.

00:48:26.650 --> 00:48:30.250
So that brings us back to generative
artificial intelligence, which

00:48:30.250 --> 00:48:34.760
isn't just about solving problems, but
really generating literally images,

00:48:34.760 --> 00:48:38.680
texts, even videos, that
again, increasingly resemble

00:48:38.680 --> 00:48:41.920
what we humans might
otherwise output ourselves.

00:48:41.920 --> 00:48:45.370
And within the world of generative
artificial intelligence,

00:48:45.370 --> 00:48:48.310
do we have, of course, these
same images that we saw before,

00:48:48.310 --> 00:48:51.340
the same text that we saw before,
and more generally, things

00:48:51.340 --> 00:48:55.870
like ChatGPT, which are really examples
of what we now call large language

00:48:55.870 --> 00:48:56.560
models.

00:48:56.560 --> 00:48:59.020
These sort of massive
neural networks that

00:48:59.020 --> 00:49:02.590
have so many inputs and so
many neurons implemented

00:49:02.590 --> 00:49:06.280
in software, that essentially
represent all of the patterns

00:49:06.280 --> 00:49:09.850
that the software has discovered by
being fed massive amounts of input.

00:49:09.850 --> 00:49:13.180
Think of it as like the entire
textual content of the internet.

00:49:13.180 --> 00:49:16.180
Think of it as the entire
content of courses like CS50

00:49:16.180 --> 00:49:18.280
that may very well be out
there on the internet.

00:49:18.280 --> 00:49:21.610
And even though these AIs,
these large language models

00:49:21.610 --> 00:49:25.240
haven't been told how to
behave, they're really

00:49:25.240 --> 00:49:28.210
inferring from all of
these examples, for better

00:49:28.210 --> 00:49:31.310
or for worse, how to make predictions.

00:49:31.310 --> 00:49:34.840
So here, for instance, from
2017, just a few years back,

00:49:34.840 --> 00:49:38.110
is a seminal paper from Google
that introduced what we now

00:49:38.110 --> 00:49:40.210
know as a transformer architecture.

00:49:40.210 --> 00:49:43.690
And this introduced this idea
of attention values, whereby

00:49:43.690 --> 00:49:46.900
they propose that given an English
sentence, for instance, or really

00:49:46.900 --> 00:49:51.460
any human sentence, you try to assign
numbers, not unlike our past exercises,

00:49:51.460 --> 00:49:55.780
to each of the words, each of the
inputs that speaks to its relationship

00:49:55.780 --> 00:49:56.930
with other words.

00:49:56.930 --> 00:49:59.720
So if there's a high relationship
between two words in a sentence,

00:49:59.720 --> 00:50:01.310
they would have high attention values.

00:50:01.310 --> 00:50:04.720
And if maybe it's a preposition or
an article, like the or the like,

00:50:04.720 --> 00:50:06.890
maybe those attention values are lower.

00:50:06.890 --> 00:50:09.070
And by encoding the
world in that way, do

00:50:09.070 --> 00:50:14.230
we begin to detect patterns that
allow us to predict things like words,

00:50:14.230 --> 00:50:15.440
that is, generate text.

00:50:15.440 --> 00:50:19.150
So for instance, up until a few
years ago, completing this sentence

00:50:19.150 --> 00:50:21.310
was actually pretty
hard for a lot of AI.

00:50:21.310 --> 00:50:25.180
So for instance here, Massachusetts
is a state in the New England region

00:50:25.180 --> 00:50:26.860
of the Northeastern United States.

00:50:26.860 --> 00:50:29.500
It borders on the Atlantic
Ocean to the east.

00:50:29.500 --> 00:50:32.180
The state's capital is dot, dot, dot.

00:50:32.180 --> 00:50:34.910
Now, you should think that this
is relatively straightforward.

00:50:34.910 --> 00:50:37.480
It's like just handing you
a softball type question.

00:50:37.480 --> 00:50:41.290
But historically within the
world of AI, this word, state,

00:50:41.290 --> 00:50:44.907
was so relatively far
away from the proper noun

00:50:44.907 --> 00:50:46.990
that it's actually referring
back to, that we just

00:50:46.990 --> 00:50:50.170
didn't have computational models
that took in that holistic picture,

00:50:50.170 --> 00:50:52.702
that frankly, we humans
are much better at.

00:50:52.702 --> 00:50:54.910
If you would ask this question
a little more quickly,

00:50:54.910 --> 00:50:57.260
a little more immediately, you
might have gotten a better response.

00:50:57.260 --> 00:50:59.610
But this is, daresay, why
chatbots in the past been

00:50:59.610 --> 00:51:01.945
so bad in the form of
customer service and the like,

00:51:01.945 --> 00:51:04.320
because they're not really
taking all of the context into

00:51:04.320 --> 00:51:07.470
account that we humans might
be inclined to provide.

00:51:07.470 --> 00:51:09.750
What's going on underneath the hood?

00:51:09.750 --> 00:51:14.220
Without escalating things too quickly,
what an artificial intelligence

00:51:14.220 --> 00:51:16.650
nowadays, these large
language models might do,

00:51:16.650 --> 00:51:21.360
is break down the user's
input, your input into ChatGPT

00:51:21.360 --> 00:51:22.950
into the individual words.

00:51:22.950 --> 00:51:26.790
We might then encode, we might then take
into account the order of those words.

00:51:26.790 --> 00:51:29.400
Massachusetts is first, is is last.

00:51:29.400 --> 00:51:33.050
We might further encode each of
those words using a standard way.

00:51:33.050 --> 00:51:34.800
And there's different
algorithms for this,

00:51:34.800 --> 00:51:37.050
but you come up with what
are called embeddings.

00:51:37.050 --> 00:51:40.170
That is to say, you can
use one of those APIs

00:51:40.170 --> 00:51:43.500
I talked about earlier, or even
software running on your own computers,

00:51:43.500 --> 00:51:46.140
to come up with a
mathematical representation

00:51:46.140 --> 00:51:47.940
of the word, Massachusetts.

00:51:47.940 --> 00:51:50.190
And Rongxin kindly did
this for us last night.

00:51:50.190 --> 00:51:57.000
This is the 1,536 floating
point values that OpenAI uses

00:51:57.000 --> 00:51:59.880
to represent the word, Massachusetts.

00:51:59.880 --> 00:52:02.010
And this is to say, and
you should not understand

00:52:02.010 --> 00:52:04.380
anything you are looking
at on the screen, nor do I,

00:52:04.380 --> 00:52:07.170
but this is now a
mathematical representation

00:52:07.170 --> 00:52:10.320
of the input that can
be compared against

00:52:10.320 --> 00:52:12.660
the mathematical
representations of other inputs

00:52:12.660 --> 00:52:15.420
in order to find proximity semantically.

00:52:15.420 --> 00:52:20.130
Words that somehow have relationships
or correlations with each other

00:52:20.130 --> 00:52:22.890
that helps the AI
ultimately predict what

00:52:22.890 --> 00:52:25.990
should the next word out of
its mouth be, so to speak.

00:52:25.990 --> 00:52:28.380
So in a case like,
these values represent--

00:52:28.380 --> 00:52:30.630
these lines represent all
of those attention values.

00:52:30.630 --> 00:52:32.880
And thicker lines means
there's more attention given

00:52:32.880 --> 00:52:34.140
from one word to another.

00:52:34.140 --> 00:52:35.730
Thinner lines mean the opposite.

00:52:35.730 --> 00:52:40.770
And those inputs are ultimately
fed into a large neural network,

00:52:40.770 --> 00:52:43.870
where you have inputs on the
left, outputs on the right.

00:52:43.870 --> 00:52:46.380
And in this particular
case, the hope is to get out

00:52:46.380 --> 00:52:52.200
a single word, which is the capital
of Boston itself, whereby somehow,

00:52:52.200 --> 00:52:55.950
the neural network and the humans
behind it at OpenAI, Microsoft, Google,

00:52:55.950 --> 00:52:59.490
or elsewhere, have sort of crunched
so many numbers by training

00:52:59.490 --> 00:53:03.040
these models on so much data, that it
figured out what all of those weights

00:53:03.040 --> 00:53:06.670
are, what the biases are, so
as to influence mathematically

00:53:06.670 --> 00:53:08.710
the output therefrom.

00:53:08.710 --> 00:53:13.270
So that is all underneath
the hood of what students now

00:53:13.270 --> 00:53:15.460
perceive as this adorable rubber duck.

00:53:15.460 --> 00:53:20.150
But underneath it all is certainly
a lot of domain knowledge.

00:53:20.150 --> 00:53:23.570
And CS50, by nature of being
OpenCourseWare for the past many years,

00:53:23.570 --> 00:53:26.050
CS50 is fortunate to actually
be part of the model,

00:53:26.050 --> 00:53:28.880
as might be any other content
that's freely available online.

00:53:28.880 --> 00:53:31.570
And so that certainly
helps benefit the answers

00:53:31.570 --> 00:53:34.150
when it comes to asking
CS50 specific questions.

00:53:34.150 --> 00:53:36.403
That said, it's not perfect.

00:53:36.403 --> 00:53:38.320
And you might have heard
of what are currently

00:53:38.320 --> 00:53:43.540
called hallucinations, where ChatGPT
and similar tools just make stuff up.

00:53:43.540 --> 00:53:45.340
And it sounds very confident.

00:53:45.340 --> 00:53:47.673
And you can sometimes
call on it, whereby

00:53:47.673 --> 00:53:49.090
you can say, no, that's not right.

00:53:49.090 --> 00:53:51.610
And it will playfully apologize
and say, oh, I'm sorry.

00:53:51.610 --> 00:53:56.560
But it made up some statement,
because it was probabilistically

00:53:56.560 --> 00:53:59.840
something that could be said,
even if it's just not correct.

00:53:59.840 --> 00:54:02.650
Now, allow me to propose
that this kind of problem

00:54:02.650 --> 00:54:05.230
is going to get less and less frequent.

00:54:05.230 --> 00:54:07.480
And so as the models evolve
and our techniques evolve,

00:54:07.480 --> 00:54:08.983
this will be less of an issue.

00:54:08.983 --> 00:54:10.900
But I thought it would
be fun to end on a note

00:54:10.900 --> 00:54:13.510
that a former colleague shared
just the other day, which

00:54:13.510 --> 00:54:16.780
was this old poem by Shel
Silverstein, another something

00:54:16.780 --> 00:54:18.580
from our past childhood perhaps.

00:54:18.580 --> 00:54:23.800
And this was from 1981, a poem called
"Homework Machine," which is perhaps

00:54:23.800 --> 00:54:26.980
foretold where we are now in 2023.

00:54:26.980 --> 00:54:30.940
"The homework machine, oh, the homework
machine, most perfect contraption

00:54:30.940 --> 00:54:32.320
that's ever been seen.

00:54:32.320 --> 00:54:35.770
Just put in your homework, then
drop in a dime, snap on the switch,

00:54:35.770 --> 00:54:41.380
and in ten seconds time, your homework
comes out quick and clean as can be.

00:54:41.380 --> 00:54:46.240
Here it is, 9 plus 4,
and the answer is 3.

00:54:46.240 --> 00:54:47.590
3?

00:54:47.590 --> 00:54:48.820
Oh, me.

00:54:48.820 --> 00:54:52.210
I guess it's not as perfect
as I thought it would be."

00:54:52.210 --> 00:54:55.330
So, quite foretelling, sure.

00:54:55.330 --> 00:54:58.220
[APPLAUSE]

00:54:58.220 --> 00:55:01.130
Quite foretelling, indeed.

00:55:01.130 --> 00:55:04.910
Though, if for all this and more,
the family members in the audience

00:55:04.910 --> 00:55:08.810
are welcome to take CS50
yourself online at cs50edx.org.

00:55:08.810 --> 00:55:10.700
For all of today and
so much more, allow me

00:55:10.700 --> 00:55:15.140
to thank Brian, Rongxin, Sophie, Andrew,
Patrick, Charlie, CS50's whole team.

00:55:15.140 --> 00:55:18.920
If you are a family member here
headed to lunch with CS50's team,

00:55:18.920 --> 00:55:22.190
please look for Cameron holding
a rubber duck above her head.

00:55:22.190 --> 00:55:24.300
Thank you so much for joining us today.

00:55:24.300 --> 00:55:25.670
This was CS50.

00:55:25.670 --> 00:55:27.170
[APPLAUSE]

00:55:27.170 --> 00:55:30.520
[MUSIC PLAYING]