1 00:00:00,000 --> 00:00:01,530 SPEAKER 1: All right. 2 00:00:01,530 --> 00:00:03,425 Well, this is a CS50 Tech Talk. 3 00:00:03,425 --> 00:00:04,800 Thank you all so much for coming. 4 00:00:04,800 --> 00:00:06,930 So about a week ago, we circulated the Google Form, 5 00:00:06,930 --> 00:00:09,570 as you might have seen, at 10:52 AM. 6 00:00:09,570 --> 00:00:12,100 And by, like, 11:52 AM, we had 100 RSVP's, 7 00:00:12,100 --> 00:00:14,850 which I think is sort of testament to just how much interest there 8 00:00:14,850 --> 00:00:19,050 is in this world of AI and OpenAI and GPT, ChatGPT and the like. 9 00:00:19,050 --> 00:00:22,090 And in fact, if you're sort of generally familiar with what everyone's 10 00:00:22,090 --> 00:00:24,090 talking about but you haven't tried it yourself, 11 00:00:24,090 --> 00:00:27,300 like, this is the URL at which you can try out this tool that you've probably 12 00:00:27,300 --> 00:00:28,880 heard about, ChatGPT. 13 00:00:28,880 --> 00:00:31,380 You can sign up for a free account there and start tinkering 14 00:00:31,380 --> 00:00:33,510 with what everyone else has been tinkering with. 15 00:00:33,510 --> 00:00:36,390 And then if you're more of the app-minded type, which you probably 16 00:00:36,390 --> 00:00:39,720 are if you are here with us today, OpenAI, in particular, 17 00:00:39,720 --> 00:00:43,230 has its own low-level APIs via which you can integrate AI 18 00:00:43,230 --> 00:00:44,460 into your own software. 19 00:00:44,460 --> 00:00:46,920 But of course, as is the case in computer science, 20 00:00:46,920 --> 00:00:49,020 there's all the more abstractions and services 21 00:00:49,020 --> 00:00:51,180 that have been built on top of these technologies. 22 00:00:51,180 --> 00:00:53,790 And we're so happy today to be joined by our friends 23 00:00:53,790 --> 00:00:57,240 from McGill University and Steamship, Sil and Ted, 24 00:00:57,240 --> 00:01:00,240 from whom you'll hear in just a moment, to speak to us about how 25 00:01:00,240 --> 00:01:04,428 they are making it easier to build, to deploy, to share applications using 26 00:01:04,428 --> 00:01:05,970 some of these very same technologies. 27 00:01:05,970 --> 00:01:09,150 So our thanks to them for hosting today, our friends at Plympton, 28 00:01:09,150 --> 00:01:11,160 Jenny Lee, an alumna who's here with us today. 29 00:01:11,160 --> 00:01:14,035 But without further ado, allow me to turn things over to Ted and Sil. 30 00:01:14,035 --> 00:01:18,000 And pizza will be served shortly after 1:00 PM outside. 31 00:01:18,000 --> 00:01:19,740 All right, over to you, Ted. 32 00:01:19,740 --> 00:01:21,375 TED BENSON: Thanks a lot. 33 00:01:21,375 --> 00:01:22,000 Hey, everybody. 34 00:01:22,000 --> 00:01:23,440 It's great to be here. 35 00:01:23,440 --> 00:01:25,830 I think we've got a really good talk for you today. 36 00:01:25,830 --> 00:01:29,160 Sil is going to provide some research grounding into how it all works, 37 00:01:29,160 --> 00:01:33,742 what's going inside the brain of GPT, as well as other language models. 38 00:01:33,742 --> 00:01:35,700 And then I'll show you some examples that we're 39 00:01:35,700 --> 00:01:38,250 seeing on the ground of how people are building apps 40 00:01:38,250 --> 00:01:40,360 and what apps tend to work in the real world. 41 00:01:40,360 --> 00:01:43,212 So our perspective is we're building AWS for AI apps. 
42 00:01:43,212 --> 00:01:46,170 So we get to talk to a lot of the makers who are building and deploying 43 00:01:46,170 --> 00:01:49,710 their apps, and through that, see both the experimental end of the spectrum 44 00:01:49,710 --> 00:01:52,230 and also see what kinds of apps are getting 45 00:01:52,230 --> 00:01:55,440 pushed out there and turned into companies, turned into side projects. 46 00:01:55,440 --> 00:01:59,130 We did a cool hackathon yesterday. 47 00:01:59,130 --> 00:02:02,393 Many thanks to Neiman, to David Malan and CS50 for helping us 48 00:02:02,393 --> 00:02:04,560 put all of this together, to Harvard for hosting it. 49 00:02:04,560 --> 00:02:06,870 And there were two sessions. 50 00:02:06,870 --> 00:02:08,020 Lots of folks built things. 51 00:02:08,020 --> 00:02:12,177 If you go to steamship.com/hackathon, you'll find a lot of guides, 52 00:02:12,177 --> 00:02:14,760 a lot of projects that people built. And you can follow along. 53 00:02:14,760 --> 00:02:16,470 We have a text guide, as well. 54 00:02:16,470 --> 00:02:21,310 Just as a quick plug for that, if you want to do it remotely or on your own. 55 00:02:21,310 --> 00:02:24,810 So to tee up Sil, we're going to talk about basically two things 56 00:02:24,810 --> 00:02:28,980 today that I hope you'll walk away with and really know how to then use 57 00:02:28,980 --> 00:02:30,660 as you develop and as you tinker. 58 00:02:30,660 --> 00:02:33,150 One is what is GPT and how is it working, 59 00:02:33,150 --> 00:02:35,250 get a good sense of what's going on inside of it, 60 00:02:35,250 --> 00:02:38,340 other than as just this magical machine that predicts things. 61 00:02:38,340 --> 00:02:41,410 And then two is how are people building with it, and then, importantly, 62 00:02:41,410 --> 00:02:43,680 how can I build with it, too, if you are a developer. 63 00:02:43,680 --> 00:02:45,638 And if you have CS50 background, you should 64 00:02:45,638 --> 00:02:48,180 be able to pick things up and start building some great apps. 65 00:02:48,180 --> 00:02:50,100 I've already met some of the CS50 grads yesterday, 66 00:02:50,100 --> 00:02:51,940 and the things that they were doing were pretty amazing. 67 00:02:51,940 --> 00:02:53,080 So I hope this is useful. 68 00:02:53,080 --> 00:02:55,830 I'm going to kick it over to Sil and talk about some 69 00:02:55,830 --> 00:02:58,698 of the theoretical background of GPT. 70 00:02:58,698 --> 00:02:59,490 SIL HAMILTON: Yeah. 71 00:02:59,490 --> 00:03:01,020 So thank you, Ted. 72 00:03:01,020 --> 00:03:01,740 My name is Sil. 73 00:03:01,740 --> 00:03:04,410 I'm a graduate student in the digital humanities at McGill. 74 00:03:04,410 --> 00:03:07,947 I study literature and computer science and linguistics in the same breath, 75 00:03:07,947 --> 00:03:10,530 and I've published some research over the last couple of years 76 00:03:10,530 --> 00:03:15,360 exploring what is possible with language models and culture, in particular. 77 00:03:15,360 --> 00:03:19,740 And my half, or whatever, of the presentation 78 00:03:19,740 --> 00:03:21,780 is to describe to you what is GPT. 79 00:03:21,780 --> 00:03:24,060 That's really difficult to explain in 15 minutes, 80 00:03:24,060 --> 00:03:26,580 and there are even a lot of things that we don't know. 81 00:03:26,580 --> 00:03:28,470 But a good way to approach that is to first 82 00:03:28,470 --> 00:03:33,120 consider all the things that people call GPT by, or descriptors. 83 00:03:33,120 --> 00:03:35,580 So you can call them large language models.
84 00:03:35,580 --> 00:03:37,800 You can call them universal approximators. 85 00:03:37,800 --> 00:03:42,270 From computer science, you can say that it is a generative AI. 86 00:03:42,270 --> 00:03:44,110 We know that they are neural networks. 87 00:03:44,110 --> 00:03:46,320 We know that it is an artificial intelligence. 88 00:03:46,320 --> 00:03:48,120 To some, it's a simulator of culture. 89 00:03:48,120 --> 00:03:50,100 To others, it just predicts text. 90 00:03:50,100 --> 00:03:51,450 It's also a writing assistant. 91 00:03:51,450 --> 00:03:54,180 If you've ever used ChatGPT, you can plug in a bit of your essay, 92 00:03:54,180 --> 00:03:54,990 get some feedback. 93 00:03:54,990 --> 00:03:56,430 It's amazing for that. 94 00:03:56,430 --> 00:03:57,540 It's a content generator. 95 00:03:57,540 --> 00:04:02,040 People use it to do copywriting with Jasper.ai, Sudowrite, et cetera. 96 00:04:02,040 --> 00:04:03,323 It's an agent. 97 00:04:03,323 --> 00:04:05,490 So the really hot thing right now, if you might have 98 00:04:05,490 --> 00:04:08,370 seen on Twitter, AutoGPT, Baby AGI. 99 00:04:08,370 --> 00:04:10,890 People are giving these things tools and letting 100 00:04:10,890 --> 00:04:15,090 them run a little bit free in the wild to interact with the world, computers, 101 00:04:15,090 --> 00:04:16,140 et cetera. 102 00:04:16,140 --> 00:04:18,029 We use them as chat bots, obviously. 103 00:04:18,029 --> 00:04:21,810 And the actual architecture is a transformer. 104 00:04:21,810 --> 00:04:25,770 So there's lots of ways to describe GPT, and any one of them 105 00:04:25,770 --> 00:04:29,430 is a really perfectly adequate way to begin the conversation. 106 00:04:29,430 --> 00:04:32,430 But for our purposes, we can think of it as a large language model, 107 00:04:32,430 --> 00:04:34,710 and more specifically, a language model. 108 00:04:34,710 --> 00:04:38,975 And a language model is a model of language, 109 00:04:38,975 --> 00:04:40,350 if you'll allow me the tautology. 110 00:04:40,350 --> 00:04:42,558 But really, what it does is it produces a probability 111 00:04:42,558 --> 00:04:44,650 distribution over some vocabulary. 112 00:04:44,650 --> 00:04:48,570 So let us imagine that we had the task of predicting 113 00:04:48,570 --> 00:04:51,000 the next word of the sequence "I am." 114 00:04:51,000 --> 00:04:57,810 So if I give a neural network the words "I am," what, of all words in English, 115 00:04:57,810 --> 00:04:59,990 is the next most likely word to follow? 116 00:04:59,990 --> 00:05:04,520 That, at its very core, is what GPT is trained to answer. 117 00:05:04,520 --> 00:05:08,480 And how it does it is it has a vocabulary of 50,000 words, 118 00:05:08,480 --> 00:05:12,060 and it knows roughly, given the entire internet, 119 00:05:12,060 --> 00:05:17,990 which words are likely to follow other words of those 50,000 in some sequence, 120 00:05:17,990 --> 00:05:23,540 up to 2,000 words, up to 4,000, up to 8,000, and now up to 32,000 in GPT-4. 121 00:05:23,540 --> 00:05:24,920 So you give it a sequence. 122 00:05:24,920 --> 00:05:26,060 Here, "I am." 123 00:05:26,060 --> 00:05:29,330 And over the vocabulary of 50,000 words, it 124 00:05:29,330 --> 00:05:32,640 gives you the likelihood of every single word that follows. 125 00:05:32,640 --> 00:05:34,340 So here, it's "I am." 
126 00:05:34,340 --> 00:05:36,838 Perhaps the word "happy" is fairly frequent, 127 00:05:36,838 --> 00:05:38,630 so we'll give that a high probability if we 128 00:05:38,630 --> 00:05:41,480 look at all words, all utterances of English. 129 00:05:41,480 --> 00:05:42,710 It might be "I am sad." 130 00:05:42,710 --> 00:05:44,900 Maybe that's a little bit less probable. 131 00:05:44,900 --> 00:05:45,950 "I am school." 132 00:05:45,950 --> 00:05:47,270 That really should be at the end because I don't 133 00:05:47,270 --> 00:05:48,687 think anybody would ever say that. 134 00:05:48,687 --> 00:05:49,460 "I am Bjork." 135 00:05:49,460 --> 00:05:50,630 That's a little bit-- 136 00:05:50,630 --> 00:05:52,130 it's not very probable. 137 00:05:52,130 --> 00:05:54,320 It's less probable than happy/sad, but there's still 138 00:05:54,320 --> 00:05:55,768 some probability attached to it. 139 00:05:55,768 --> 00:05:58,310 And when we say it's probable, that's literally a percentage. 140 00:05:58,310 --> 00:06:03,005 That's, like, "happy" follows "I am" maybe like 5% of the time. 141 00:06:03,005 --> 00:06:07,110 "Sad" follows "I am" maybe 2% of the time, or whatever. 142 00:06:07,110 --> 00:06:11,960 So for every word that we give GPT, it tries 143 00:06:11,960 --> 00:06:15,590 to predict what the next word is across 50,000 words. 144 00:06:15,590 --> 00:06:19,420 And it gives every single one of those 50,000 words 145 00:06:19,420 --> 00:06:22,970 a number that reflects how probable it is. 146 00:06:22,970 --> 00:06:27,390 And the really magical thing that happens is you can generate new texts. 147 00:06:27,390 --> 00:06:31,130 So if you give GPT "I am" and it predicts 148 00:06:31,130 --> 00:06:37,410 "happy" as being the most probable word over 50,000, you can then append it to 149 00:06:37,410 --> 00:06:37,910 "I am." 150 00:06:37,910 --> 00:06:39,350 So now you say "I am happy." 151 00:06:39,350 --> 00:06:41,470 And you feed it into the model again. 152 00:06:41,470 --> 00:06:42,470 You sample another word. 153 00:06:42,470 --> 00:06:45,300 You feed it into the model again, and again and again and again. 154 00:06:45,300 --> 00:06:48,950 And there's lots of different ways that "I am happy," "I am sad" can go. 155 00:06:48,950 --> 00:06:51,650 And you add a little bit of randomness, and all of a sudden, 156 00:06:51,650 --> 00:06:54,650 you have a language model that can write essays, that can talk, 157 00:06:54,650 --> 00:06:57,740 and a whole lot of things, which is really unexpected 158 00:06:57,740 --> 00:07:00,480 and something that we didn't predict even five years ago. 159 00:07:00,480 --> 00:07:01,820 So this is all relevant. 160 00:07:01,820 --> 00:07:08,710 And if we move on, as we scale up the model and we give it more compute, 161 00:07:08,710 --> 00:07:13,780 in 2012, AlexNet came out, and we figured out we can give the model-- 162 00:07:13,780 --> 00:07:15,160 we can run the model in GPUs. 163 00:07:15,160 --> 00:07:16,780 So we can speed up the process. 164 00:07:16,780 --> 00:07:18,970 We can give the model lots of information 165 00:07:18,970 --> 00:07:21,850 downloaded from the internet, and it learns more and more and more. 166 00:07:21,850 --> 00:07:24,730 And the probabilities that it gives you get 167 00:07:24,730 --> 00:07:27,460 better as it sees more examples of English on the internet. 168 00:07:27,460 --> 00:07:31,062 So we have to train the model to be really large, really wide, 169 00:07:31,062 --> 00:07:33,020 and we have to train it for a really long time. 
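[Editor's aside: to make that predict-append-repeat loop concrete, here is a minimal, purely illustrative Python sketch. The tiny probability table and the words in it are invented for the example; a real model like GPT assigns probabilities over its full ~50,000-token vocabulary, conditioned on thousands of previous tokens.]

```python
import random

# Toy "language model": for a given last word, a made-up probability
# distribution over a handful of possible next words.
NEXT_WORD_PROBS = {
    "am":    {"happy": 0.05, "sad": 0.02, "hungry": 0.01, "a": 0.10},
    "happy": {"today": 0.04, "because": 0.06, "and": 0.08},
    # ... a real model has a row like this for every token it knows
}

def sample_next_word(context):
    """Look up the distribution for the last word and sample one next word."""
    dist = NEXT_WORD_PROBS.get(context[-1], {"the": 1.0})
    words = list(dist.keys())
    weights = list(dist.values())
    # random.choices is the "little bit of randomness" mentioned above
    return random.choices(words, weights=weights, k=1)[0]

def generate(prompt_words, n_words=5):
    """Predict a next word, append it, feed the text back in, repeat."""
    words = list(prompt_words)
    for _ in range(n_words):
        words.append(sample_next_word(words))
    return " ".join(words)

print(generate(["I", "am"]))   # e.g. "I am happy because ..."
```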
170 00:07:33,020 --> 00:07:36,640 And as we do that, the model gets better and better, more expressive 171 00:07:36,640 --> 00:07:39,700 and capable, and it also gets a little bit intelligent, 172 00:07:39,700 --> 00:07:43,510 for reasons we don't understand. 173 00:07:43,510 --> 00:07:47,470 But also, the issue is that because it learns to replicate the internet, 174 00:07:47,470 --> 00:07:50,950 it knows how to speak in a lot of different genres of text 175 00:07:50,950 --> 00:07:52,420 and a lot of different registers. 176 00:07:52,420 --> 00:07:54,592 If you begin the conversation like, "ChatGPT, 177 00:07:54,592 --> 00:07:57,550 can you explain the moon landing to a six-year-old in a few sentences," 178 00:07:57,550 --> 00:07:58,660 GPT-3-- 179 00:07:58,660 --> 00:08:02,860 this is an example drawn from the InstructGPT paper from OpenAI-- 180 00:08:02,860 --> 00:08:07,610 GPT-3 would have just been like, "OK, so you're giving me an example, 181 00:08:07,610 --> 00:08:09,610 like explain the moon landing to a six-year-old. 182 00:08:09,610 --> 00:08:11,470 I'm going to give you a whole bunch of similar things 183 00:08:11,470 --> 00:08:13,700 because those seem very likely to come in a sequence." 184 00:08:13,700 --> 00:08:16,450 It doesn't necessarily understand that it's being asked a question 185 00:08:16,450 --> 00:08:18,220 and has to respond with an answer. 186 00:08:18,220 --> 00:08:23,800 GPT-3 did not have that apparatus, that interface for responding to questions. 187 00:08:23,800 --> 00:08:29,270 And the scientists at OpenAI came up with a solution. 188 00:08:29,270 --> 00:08:33,080 And that was: let's give it a whole bunch of examples of questions and answers 189 00:08:33,080 --> 00:08:35,419 such that we first train it on the internet, 190 00:08:35,419 --> 00:08:38,480 and then we train it with a host of questions and answers 191 00:08:38,480 --> 00:08:41,210 such that it has the knowledge of the internet, 192 00:08:41,210 --> 00:08:44,000 but really knows that it has to be answering questions. 193 00:08:44,000 --> 00:08:47,570 And that is when ChatGPT was born. 194 00:08:47,570 --> 00:08:50,510 And that's when it gained 100 million users in one month. 195 00:08:50,510 --> 00:08:53,300 I think it beat TikTok's record of 20 million in one month. 196 00:08:53,300 --> 00:08:54,750 It was a huge thing. 197 00:08:54,750 --> 00:08:58,730 And for a lot of people, they went, "oh, this thing is intelligent. 198 00:08:58,730 --> 00:09:00,590 I can ask it questions. 199 00:09:00,590 --> 00:09:01,400 It answers back. 200 00:09:01,400 --> 00:09:03,440 We can work together to come to a solution." 201 00:09:03,440 --> 00:09:07,910 And that's because it's still predicting words, it's still a language model, 202 00:09:07,910 --> 00:09:13,140 but it knows to predict words in the framework of a question and answer. 203 00:09:13,140 --> 00:09:14,450 So that's what a prompt is. 204 00:09:14,450 --> 00:09:16,100 That's what instruction tuning is. 205 00:09:16,100 --> 00:09:17,510 That's a key word. 206 00:09:17,510 --> 00:09:22,510 That's what RLHF is, if you've ever seen that acronym, reinforcement 207 00:09:22,510 --> 00:09:24,520 learning from human feedback. 208 00:09:24,520 --> 00:09:28,900 And all of those combined means that the models that are coming out today, 209 00:09:28,900 --> 00:09:31,630 the types of language predictors that are coming out today 210 00:09:31,630 --> 00:09:34,000 work to operate in a Q&A form.
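[Editor's aside: as a hedged illustration of that Q&A framing, this is roughly what asking an instruction-tuned model a question looks like from code, assuming the pre-1.0 `openai` Python package and an API key in the environment. The client interface has changed over time, so treat this as a sketch, not the definitive API.]

```python
import os
import openai  # assumes the pre-1.0 openai client: pip install "openai<1.0"

openai.api_key = os.environ["OPENAI_API_KEY"]

# With an instruction-tuned (aligned) model, we send a question and get an
# answer back, instead of the model merely continuing our text.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user",
         "content": "Explain the moon landing to a six-year-old in a few sentences."},
    ],
)

print(response.choices[0].message.content)
```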
211 00:09:34,000 --> 00:09:38,480 GPT-4 exclusively has the aligned model available. 212 00:09:38,480 --> 00:09:41,950 And this is a really great, solid foundation 213 00:09:41,950 --> 00:09:44,350 to build on because you can do all sorts of things. 214 00:09:44,350 --> 00:09:46,520 You can ask it, "ChatGPT, can you do this for me? 215 00:09:46,520 --> 00:09:47,520 Can you do that for me?" 216 00:09:47,520 --> 00:09:51,080 You might have seen that OpenAI has allowed plug-in access to ChatGPT. 217 00:09:51,080 --> 00:09:52,630 So it can access Wolfram. 218 00:09:52,630 --> 00:09:53,730 It can search the web. 219 00:09:53,730 --> 00:09:56,170 It can do Instacart for you. 220 00:09:56,170 --> 00:09:57,970 It can look up recipes. 221 00:09:57,970 --> 00:10:02,560 Once the model knows that it not only has to predict language, 222 00:10:02,560 --> 00:10:05,910 but that it has to solve a problem-- 223 00:10:05,910 --> 00:10:09,380 and the problem here being give me a good answer to my question-- 224 00:10:09,380 --> 00:10:12,770 it's suddenly able to interface with the world in a really solid way. 225 00:10:12,770 --> 00:10:15,050 And from there on, there's been all sorts 226 00:10:15,050 --> 00:10:19,520 of tools that build on this Q&A form that ChatGPT uses. 227 00:10:19,520 --> 00:10:21,650 You have AutoGPT. 228 00:10:21,650 --> 00:10:22,760 You have LangChain. 229 00:10:22,760 --> 00:10:25,770 You have ReAct. 230 00:10:25,770 --> 00:10:28,080 There was a ReAct paper where a lot of these come from. 231 00:10:28,080 --> 00:10:34,530 And turning the model into an agent with which to achieve any ambiguous goal 232 00:10:34,530 --> 00:10:36,000 is where the future is going. 233 00:10:36,000 --> 00:10:38,160 And this is all thanks to instruction tuning. 234 00:10:38,160 --> 00:10:40,830 And with that, I think I will hand it off 235 00:10:40,830 --> 00:10:44,550 to Ted, who will be giving a demo, or something along those lines, 236 00:10:44,550 --> 00:10:48,615 for how to use GPT as an agent. 237 00:10:48,615 --> 00:10:49,115 So. 238 00:10:49,115 --> 00:10:51,930 239 00:10:51,930 --> 00:10:54,810 TED BENSON: All right, so I'm a super applied guy. 240 00:10:54,810 --> 00:11:00,030 I kind of look at things and think, OK, how can I add this LEGO, add that LEGO, 241 00:11:00,030 --> 00:11:02,410 and clip them together and build something with it. 242 00:11:02,410 --> 00:11:06,930 And right now, if you look back in computer science history, when 243 00:11:06,930 --> 00:11:10,110 you look at the kinds of things that were being done in 1970, 244 00:11:10,110 --> 00:11:13,950 right after computing was invented, the microprocessors were invented, 245 00:11:13,950 --> 00:11:17,070 people were doing research like how do I sort a list of numbers. 246 00:11:17,070 --> 00:11:19,030 And that was meaningful work, and importantly, 247 00:11:19,030 --> 00:11:20,780 it was work that's accessible to everybody 248 00:11:20,780 --> 00:11:24,540 because nobody knows what we can build with this new kind of oil, 249 00:11:24,540 --> 00:11:27,540 this new kind of electricity, this new kind of unit of computation 250 00:11:27,540 --> 00:11:28,710 we've created. 251 00:11:28,710 --> 00:11:31,560 And anything was game, and anybody could participate 252 00:11:31,560 --> 00:11:33,010 in that game to figure it out. 253 00:11:33,010 --> 00:11:37,140 And I think one of the really exciting things about GPT right now is, yes, 254 00:11:37,140 --> 00:11:38,830 in and of itself, it's amazing.
255 00:11:38,830 --> 00:11:43,210 But then, what could we do with it if we call it over and over again, 256 00:11:43,210 --> 00:11:45,083 if we build it into our algorithms and start 257 00:11:45,083 --> 00:11:46,500 to build it into broader software? 258 00:11:46,500 --> 00:11:48,420 So the world really is yours to figure out 259 00:11:48,420 --> 00:11:50,820 these fundamental questions about what could you 260 00:11:50,820 --> 00:11:55,710 do if you could script computation itself over and over again in the way 261 00:11:55,710 --> 00:11:56,850 that computers can do. 262 00:11:56,850 --> 00:11:59,340 Not just talk with it, but build things atop it. 263 00:11:59,340 --> 00:12:00,960 So we're a hosting company. 264 00:12:00,960 --> 00:12:02,040 We host apps. 265 00:12:02,040 --> 00:12:04,335 And these are just some of the things that we see. 266 00:12:04,335 --> 00:12:06,210 I'm going to show you demos of this with code 267 00:12:06,210 --> 00:12:08,463 and try to explain some of the thought process. 268 00:12:08,463 --> 00:12:10,380 But I wanted to give you a high-level overview 269 00:12:10,380 --> 00:12:12,487 of you've probably seen these on Twitter, 270 00:12:12,487 --> 00:12:15,570 but kind of when it all sorts out to the top, these are some of the things 271 00:12:15,570 --> 00:12:19,470 that we're seeing built and deployed with language models today. 272 00:12:19,470 --> 00:12:20,730 Companionship. 273 00:12:20,730 --> 00:12:23,790 That's everything from I need a friend to I need a friend with a purpose. 274 00:12:23,790 --> 00:12:24,568 I want a coach. 275 00:12:24,568 --> 00:12:27,360 I want somebody to tell me, "go to the gym and do these exercises." 276 00:12:27,360 --> 00:12:29,527 I want somebody to help me study a foreign language. 277 00:12:29,527 --> 00:12:30,660 Question answering. 278 00:12:30,660 --> 00:12:31,510 This is a big one. 279 00:12:31,510 --> 00:12:34,350 This is everything from your newsroom having a Slack bot that 280 00:12:34,350 --> 00:12:37,920 helps assist you, does this article conform 281 00:12:37,920 --> 00:12:41,478 to the style guidelines of our newsroom, all the way through to I 282 00:12:41,478 --> 00:12:43,770 need help on my homework, or hey, I have some questions 283 00:12:43,770 --> 00:12:46,860 that I want you to ask Wikipedia, combine it with something else, 284 00:12:46,860 --> 00:12:48,870 synthesize the answer, and give it to me. 285 00:12:48,870 --> 00:12:50,400 Utility functions. 286 00:12:50,400 --> 00:12:54,750 I would describe this as there's a large set of things for which 287 00:12:54,750 --> 00:12:57,510 human beings can do them if only-- 288 00:12:57,510 --> 00:12:59,937 or computers could do them if only they had access 289 00:12:59,937 --> 00:13:01,770 to language computation, language knowledge. 290 00:13:01,770 --> 00:13:04,890 An example of this would be read every tweet on Twitter. 291 00:13:04,890 --> 00:13:06,227 Tell me the ones I should read. 292 00:13:06,227 --> 00:13:09,060 That way, I only get to read the ones that actually make sense to me 293 00:13:09,060 --> 00:13:10,810 and I don't have to skim through the rest. 294 00:13:10,810 --> 00:13:11,730 Creativity. 295 00:13:11,730 --> 00:13:14,220 Image generation, text generation, storytelling, 296 00:13:14,220 --> 00:13:16,020 proposing other ways to do things. 
297 00:13:16,020 --> 00:13:19,650 And then these wild experiments in kind of Baby AGI, 298 00:13:19,650 --> 00:13:23,485 as people are calling them, in which the AI itself decides what to do 299 00:13:23,485 --> 00:13:24,360 and is self-directed. 300 00:13:24,360 --> 00:13:27,360 So I'll show you examples of many of these and what the code looks like. 301 00:13:27,360 --> 00:13:29,700 And if I were you, I would think about these 302 00:13:29,700 --> 00:13:33,990 as categories within which to both think about what you might build 303 00:13:33,990 --> 00:13:37,620 and then also seek out starter projects for how you 304 00:13:37,620 --> 00:13:39,270 might go about building them online. 305 00:13:39,270 --> 00:13:42,250 306 00:13:42,250 --> 00:13:42,750 All right. 307 00:13:42,750 --> 00:13:45,450 So I'm just going to dive straight into demos and code for some of these 308 00:13:45,450 --> 00:13:48,540 because I know that's what's interesting to see as fellow builders, 309 00:13:48,540 --> 00:13:51,520 with a high-level diagram for some of these as to how it works. 310 00:13:51,520 --> 00:13:54,630 So approximately, you can think of a companionship bot 311 00:13:54,630 --> 00:13:57,520 as a friend that has a purpose to you. 312 00:13:57,520 --> 00:14:00,070 And there are many ways to build all of these things, 313 00:14:00,070 --> 00:14:02,250 but one of the ways you can build this is simply 314 00:14:02,250 --> 00:14:06,930 to wrap GPT or a language model in an endpoint that additionally injects 315 00:14:06,930 --> 00:14:10,620 into the prompt some particular perspective or some particular goal 316 00:14:10,620 --> 00:14:11,820 that you want to use. 317 00:14:11,820 --> 00:14:15,030 It really is that easy, in a way, but it's also very hard 318 00:14:15,030 --> 00:14:19,170 because you need to iterate and engineer the prompt so that it consistently 319 00:14:19,170 --> 00:14:22,060 performs the way you want it to perform. 320 00:14:22,060 --> 00:14:25,090 So a good example of this is something somebody built in the hackathon 321 00:14:25,090 --> 00:14:25,590 yesterday. 322 00:14:25,590 --> 00:14:28,007 And I just wanted to show you the project that they built. 323 00:14:28,007 --> 00:14:29,550 It was a Mandarin idiom coach. 324 00:14:29,550 --> 00:14:31,800 And I'll show you what the code looked like first. 325 00:14:31,800 --> 00:14:33,688 I'll show you the demo first. 326 00:14:33,688 --> 00:14:34,980 I think I already pulled it up. 327 00:14:34,980 --> 00:14:38,160 328 00:14:38,160 --> 00:14:39,680 Here we go. 329 00:14:39,680 --> 00:14:42,770 So the buddy that this person wanted to create 330 00:14:42,770 --> 00:14:47,180 was a friend that, if you gave it a particular problem 331 00:14:47,180 --> 00:14:49,910 you were having, it would pick a Chinese idiom, 332 00:14:49,910 --> 00:14:53,300 a four-character chengyu that described, poetically, like, 333 00:14:53,300 --> 00:14:55,907 here's a particular way you could say this, 334 00:14:55,907 --> 00:14:58,490 and it would tell it to her, so that the person who built this 335 00:14:58,490 --> 00:15:02,070 was studying Chinese and she wanted to learn more about it. 336 00:15:02,070 --> 00:15:08,080 So I might say something like, "I'm feeling very sad." 337 00:15:08,080 --> 00:15:10,430 And it would think a little bit. 
338 00:15:10,430 --> 00:15:13,540 And if everything's up and running, it will 339 00:15:13,540 --> 00:15:16,330 generate one of these four-character phrases 340 00:15:16,330 --> 00:15:19,307 and it will respond to it with an example. 341 00:15:19,307 --> 00:15:21,140 Now, I don't know if this is correct or not. 342 00:15:21,140 --> 00:15:23,920 So if somebody can call me out if this is actually incorrect, 343 00:15:23,920 --> 00:15:26,170 please call me out. 344 00:15:26,170 --> 00:15:28,420 And it will then finish up with something encouraging, 345 00:15:28,420 --> 00:15:29,380 saying, "hey, you can do it. 346 00:15:29,380 --> 00:15:30,280 I know this is hard. 347 00:15:30,280 --> 00:15:30,910 Keep going." 348 00:15:30,910 --> 00:15:32,535 So let me show you how they built this. 349 00:15:32,535 --> 00:15:40,510 And I pulled up the code right here. 350 00:15:40,510 --> 00:15:44,910 So this was the particular starter Replit 351 00:15:44,910 --> 00:15:47,430 that folks were using in the hackathon yesterday. 352 00:15:47,430 --> 00:15:52,560 And we pulled things up into basically you have a wrapper around GPT. 353 00:15:52,560 --> 00:15:54,690 And there's many things you could do, but we're 354 00:15:54,690 --> 00:15:56,648 going to make it easy for you to do two things. 355 00:15:56,648 --> 00:15:59,940 One of them is to inject some personality into the prompt. 356 00:15:59,940 --> 00:16:02,590 And I'll explain what that prompt is in a second. 357 00:16:02,590 --> 00:16:04,440 And then the second is add tools that might 358 00:16:04,440 --> 00:16:08,190 go out and do a particular thing-- search the web or generate an image 359 00:16:08,190 --> 00:16:11,710 or add something to a database or fetch something from a database. 360 00:16:11,710 --> 00:16:15,210 So having done that, now you have something more than GPT. 361 00:16:15,210 --> 00:16:19,030 Now you have GPT, which we all know what it is and how we can interact with it, 362 00:16:19,030 --> 00:16:22,110 but you've also added a particular lens through which it's talking 363 00:16:22,110 --> 00:16:23,610 to you and, potentially, some tools. 364 00:16:23,610 --> 00:16:30,250 So this particular Chinese tutor, all it took to build that was four lines. 365 00:16:30,250 --> 00:16:33,510 So here's a question that I think is frying the minds of everybody 366 00:16:33,510 --> 00:16:35,700 in the industry right now. 367 00:16:35,700 --> 00:16:38,260 So is this something that we'll all do casually? 368 00:16:38,260 --> 00:16:39,260 And nobody really knows. 369 00:16:39,260 --> 00:16:42,177 Will we just all say in the future to the LLM, "hey, for the next five 370 00:16:42,177 --> 00:16:43,760 minutes, please talk like a teacher?" 371 00:16:43,760 --> 00:16:45,000 Maybe. 372 00:16:45,000 --> 00:16:48,120 But also, definitely in the meantime and maybe in the future, 373 00:16:48,120 --> 00:16:51,120 it makes sense to wrap up these personalized endpoints 374 00:16:51,120 --> 00:16:53,910 so that when I'm talking to GPT, I'm not just talking to GPT. 375 00:16:53,910 --> 00:16:55,710 I have a whole army of different buddies, 376 00:16:55,710 --> 00:16:58,082 of different companions that I can talk to. 
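[Editor's aside: here is a minimal sketch of that "wrap GPT in a personality" pattern, written in plain Python with the pre-1.0 `openai` client rather than the Steamship starter project itself. The persona text is paraphrased from the coach prompt Ted describes, and the function name is purely for illustration.]

```python
import os
import openai  # assumes the pre-1.0 openai client

openai.api_key = os.environ["OPENAI_API_KEY"]

# The "personality" is just text injected ahead of every conversation.
PERSONALITY = (
    "You are a kind, helpful Chinese teacher. Respond to every situation by "
    "explaining the chengyu (four-character idiom) that fits it. Speak in "
    "English, explain the chengyu and its meaning, then add a short note of "
    "encouragement about learning the language."
)

def companion_reply(user_message, history=None):
    """One turn of a companionship bot: system persona + prior turns + new message."""
    messages = [{"role": "system", "content": PERSONALITY}]
    messages += history or []
    messages.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return response.choices[0].message.content

print(companion_reply("I'm feeling very sad."))
```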
377 00:16:58,082 --> 00:17:00,540 They're kind of human and kind of talk to me interactively, 378 00:17:00,540 --> 00:17:04,812 but because I preloaded them with, "hey, by the way, you particular, 379 00:17:04,812 --> 00:17:07,020 I want you to be a kind, helpful Chinese teacher that 380 00:17:07,020 --> 00:17:10,140 responds to every situation by explaining the chengyu that fits it. 381 00:17:10,140 --> 00:17:12,630 Speak in English and explain the chengyu and its meaning. 382 00:17:12,630 --> 00:17:15,690 Then provide a note of encouragement about learning language." 383 00:17:15,690 --> 00:17:19,770 And so just adding something like that, even if you're a non-programmer, 384 00:17:19,770 --> 00:17:26,339 you can just type deploy and it'll pop it up to the web. 385 00:17:26,339 --> 00:17:29,580 It'll take it over to a Telegram bot that you can even interact with. 386 00:17:29,580 --> 00:17:33,250 "Hey, I'm feeling too busy." 387 00:17:33,250 --> 00:17:35,830 And interact with it over Telegram, over the web. 388 00:17:35,830 --> 00:17:41,500 And this is the kind of thing that's now within reach for everybody from a CS101 389 00:17:41,500 --> 00:17:44,320 grad, so I'm using the general purpose framing, 390 00:17:44,320 --> 00:17:46,810 all the way through to professionals in the industry 391 00:17:46,810 --> 00:17:49,600 that you can do just with a little bit of manipulation 392 00:17:49,600 --> 00:17:57,020 on top of this raw unit of conversation and intelligence. 393 00:17:57,020 --> 00:18:04,840 So companionship is one of the first common types of apps that we're seeing. 394 00:18:04,840 --> 00:18:09,090 So a second kind of app that we're seeing-- and this blew up-- 395 00:18:09,090 --> 00:18:12,690 for those of you who are kind of Twitter followers, 396 00:18:12,690 --> 00:18:16,612 this blew up I think the last few months, is question answering. 397 00:18:16,612 --> 00:18:18,570 And I want to unpack a couple of different ways 398 00:18:18,570 --> 00:18:21,990 this can work because I know many of you have probably already tried 399 00:18:21,990 --> 00:18:23,730 to build some of these kinds of apps. 400 00:18:23,730 --> 00:18:25,772 There's a couple of different ways that it works. 401 00:18:25,772 --> 00:18:29,468 The general framework is a user queries GPT. 402 00:18:29,468 --> 00:18:31,260 And maybe it has general-purpose knowledge. 403 00:18:31,260 --> 00:18:33,260 Maybe it doesn't have general-purpose knowledge. 404 00:18:33,260 --> 00:18:37,950 But what you want it to say back to you is something specific about an article 405 00:18:37,950 --> 00:18:41,430 you wrote, or something specific about your course syllabus, 406 00:18:41,430 --> 00:18:45,120 or something specific about a particular set of documents from the United 407 00:18:45,120 --> 00:18:46,770 Nations on a particular topic. 408 00:18:46,770 --> 00:18:49,020 And so what you're really seeking is what we all hoped 409 00:18:49,020 --> 00:18:50,430 the customer service bot would be. 410 00:18:50,430 --> 00:18:52,590 Like, we've all interacted with these customer service bots, 411 00:18:52,590 --> 00:18:55,620 and we're kind of smashing our heads on the keyboard as we do it. 412 00:18:55,620 --> 00:18:59,430 But pretty soon, we're going to start to see very high-fidelity 413 00:18:59,430 --> 00:19:01,320 bots that interact with us comfortably. 414 00:19:01,320 --> 00:19:03,570 And this is approximately how to do it as an engineer. 415 00:19:03,570 --> 00:19:05,790 So here's your game plan as an engineer. 
416 00:19:05,790 --> 00:19:11,690 Step one, take the documents that you want it to respond to. 417 00:19:11,690 --> 00:19:13,580 Step two, cut them up. 418 00:19:13,580 --> 00:19:15,950 Now, if you're an engineer, this is going to madden you. 419 00:19:15,950 --> 00:19:18,440 You don't cut them up in a way that you would hope. 420 00:19:18,440 --> 00:19:21,410 For example, you could cut them up into clean sentences 421 00:19:21,410 --> 00:19:24,590 or clean paragraphs or semantically coherent sections. 422 00:19:24,590 --> 00:19:26,210 And that would be really nice. 423 00:19:26,210 --> 00:19:28,370 Honestly, the way that most folks do it-- 424 00:19:28,370 --> 00:19:31,850 and this is a simplification that tends to be just fine-- 425 00:19:31,850 --> 00:19:35,780 is you window, you have a sliding window that goes over the document, 426 00:19:35,780 --> 00:19:38,780 and you just pull out fragments of text. 427 00:19:38,780 --> 00:19:40,647 Having pulled out those fragments of text, 428 00:19:40,647 --> 00:19:42,980 you turn them into something called an embedding vector. 429 00:19:42,980 --> 00:19:45,890 So an embedding vector is a list of numbers 430 00:19:45,890 --> 00:19:49,230 that approximate some point of meaning. 431 00:19:49,230 --> 00:19:51,800 So you've already all dealt with embedding vectors yourself 432 00:19:51,800 --> 00:19:52,475 in regular life. 433 00:19:52,475 --> 00:19:54,350 And the reason you have, and I know you have, 434 00:19:54,350 --> 00:19:57,060 is because everybody's ordered food from Yelp before. 435 00:19:57,060 --> 00:20:01,280 So when you order food from Yelp, you look at what genre of restaurant is it. 436 00:20:01,280 --> 00:20:02,630 Is it a pizza restaurant? 437 00:20:02,630 --> 00:20:03,870 Is it an Italian restaurant? 438 00:20:03,870 --> 00:20:05,330 Is it a Korean barbecue place? 439 00:20:05,330 --> 00:20:08,720 You look at how many stars does it have-- one, two, three, four, five. 440 00:20:08,720 --> 00:20:10,020 You look at where is it. 441 00:20:10,020 --> 00:20:14,000 So all of these you can think of as points in space, dimensions in space. 442 00:20:14,000 --> 00:20:17,090 Korean barbecue restaurant, four stars, near my house. 443 00:20:17,090 --> 00:20:20,680 It's a three-number vector. 444 00:20:20,680 --> 00:20:21,700 That's all this is. 445 00:20:21,700 --> 00:20:24,760 So this is a 1,000 number vector or a 10,000 number vector. 446 00:20:24,760 --> 00:20:26,920 Different models produce different size vectors. 447 00:20:26,920 --> 00:20:30,298 All it is, is chunking pieces of text, turning it 448 00:20:30,298 --> 00:20:33,340 into a vector that approximates meaning, and then you put it in something 449 00:20:33,340 --> 00:20:34,382 called a vector database. 450 00:20:34,382 --> 00:20:38,380 And a vector database is just a database that stores numbers. 451 00:20:38,380 --> 00:20:43,570 But having that database, now when I ask a question, I can search the database 452 00:20:43,570 --> 00:20:47,200 and I can say, "hey, the question was, what does CS50 teach?" 453 00:20:47,200 --> 00:20:54,180 What pieces of text in the database have vectors similar to the question, 454 00:20:54,180 --> 00:20:55,860 what does CS50 teach? 455 00:20:55,860 --> 00:20:58,350 And there's all sorts of tricks and empires 456 00:20:58,350 --> 00:21:01,440 being made on refinements of this general approach. 457 00:21:01,440 --> 00:21:06,270 But at the end, you, the developer, model it simply as thus. 
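[Editor's aside: here is a hedged sketch of that indexing-and-search step: a sliding window over the raw text, an embedding call per fragment, and a plain in-memory list standing in for a hosted vector database. It assumes the pre-1.0 `openai` client and numpy; the model name, window size, and stride are illustrative, not a recommendation.]

```python
import os
import numpy as np
import openai  # assumes the pre-1.0 openai client

openai.api_key = os.environ["OPENAI_API_KEY"]

def embed(text):
    """Turn a piece of text into an embedding vector (a list of numbers)."""
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=[text])
    return np.array(resp["data"][0]["embedding"])

def chunk(document, size=500, stride=250):
    """Sliding window over the text -- crude, but it tends to be just fine."""
    return [document[i:i + size] for i in range(0, len(document), stride)]

def build_index(document):
    """'Vector database': a list of (fragment, vector) pairs kept in memory."""
    return [(frag, embed(frag)) for frag in chunk(document)]

def search(index, question, k=3):
    """Find the fragments whose vectors are most similar to the question's vector."""
    q = embed(question)
    scored = [(frag, float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v))))
              for frag, v in index]
    return [frag for frag, _ in sorted(scored, key=lambda s: s[1], reverse=True)[:k]]
```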
458 00:21:06,270 --> 00:21:08,670 And then when you have your query, you embed it, 459 00:21:08,670 --> 00:21:12,010 you find the document fragments, and then you put them into a prompt. 460 00:21:12,010 --> 00:21:16,290 And now we're just back to the personality, the companionship bot. 461 00:21:16,290 --> 00:21:17,640 Now it's just a prompt. 462 00:21:17,640 --> 00:21:20,860 And the prompt is, "you're an expert in answering questions. 463 00:21:20,860 --> 00:21:25,800 Please answer user-provided question using source documents, 464 00:21:25,800 --> 00:21:27,090 results from the database." 465 00:21:27,090 --> 00:21:28,493 That's it. 466 00:21:28,493 --> 00:21:31,410 So after all of these decades of engineering of these customer service 467 00:21:31,410 --> 00:21:34,110 bots, it turns out, with a couple of lines of code, you can build this. 468 00:21:34,110 --> 00:21:34,902 So let me show you. 469 00:21:34,902 --> 00:21:38,990 I made one just before the class with the CS50 syllabus. 470 00:21:38,990 --> 00:21:43,770 So we can pull that up. 471 00:21:43,770 --> 00:21:46,980 And I can say I added the PDF right here. 472 00:21:46,980 --> 00:21:48,500 So I just searched-- 473 00:21:48,500 --> 00:21:51,260 I apologize, I don't know if it's an accurate or recent syllabus. 474 00:21:51,260 --> 00:21:53,780 I just searched the web for CS50 syllabus PDF. 475 00:21:53,780 --> 00:21:55,670 I put the URL in here. 476 00:21:55,670 --> 00:21:56,900 It loaded it into here. 477 00:21:56,900 --> 00:21:58,970 This is just like a 100-line piece of code 478 00:21:58,970 --> 00:22:02,150 deployed that will now let me talk to it. 479 00:22:02,150 --> 00:22:07,480 And I can say, "what will CS50 teach me?" 480 00:22:07,480 --> 00:22:10,510 So under the hood now, what's happening is exactly what that slide just 481 00:22:10,510 --> 00:22:11,010 showed you. 482 00:22:11,010 --> 00:22:13,210 It takes that question, "What will CS50 teach me." 483 00:22:13,210 --> 00:22:14,980 It turns it into a vector. 484 00:22:14,980 --> 00:22:18,700 That vector approximates, without exactly representing, 485 00:22:18,700 --> 00:22:20,920 the meaning of that question. 486 00:22:20,920 --> 00:22:24,070 It looks into a vector database that Steamship 487 00:22:24,070 --> 00:22:27,580 hosts of fragments from that PDF. 488 00:22:27,580 --> 00:22:30,100 And then it pulls out a document and then passes it 489 00:22:30,100 --> 00:22:34,330 to a prompt that says, "hey, you're an expert at answering questions. 490 00:22:34,330 --> 00:22:36,910 Someone has asked you what does CS50 teach. 491 00:22:36,910 --> 00:22:40,840 Please answer it using only the source documents and source materials 492 00:22:40,840 --> 00:22:41,780 I've provided." 493 00:22:41,780 --> 00:22:45,232 Now, those source materials are dynamically loaded into the prompt. 494 00:22:45,232 --> 00:22:46,690 It's just basic prompt engineering. 495 00:22:46,690 --> 00:22:49,090 And I want to keep harping back onto that. 496 00:22:49,090 --> 00:22:53,410 What's amazing about right now as builders is that so many things just 497 00:22:53,410 --> 00:22:59,260 boil down to very creative, tactical rearrangement of prompts, 498 00:22:59,260 --> 00:23:01,658 and then using those over and over again in an algorithm 499 00:23:01,658 --> 00:23:02,950 and putting that into software. 
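[Editor's aside: continuing the sketch above, the query side really is just prompt assembly: pull the nearest fragments and drop them into the "you're an expert at answering questions" prompt. This reuses the illustrative `build_index` and `search` helpers from the previous sketch and the same pre-1.0 `openai` client; none of it is the exact code behind the demo.]

```python
def answer(index, question):
    """Retrieve relevant fragments, then ask the model to answer using only them."""
    fragments = search(index, question)          # from the indexing sketch above
    sources = "\n\n".join(fragments)
    messages = [
        {"role": "system",
         "content": "You are an expert at answering questions. Answer the "
                    "user's question using only the source documents provided."},
        {"role": "user",
         "content": f"Source documents:\n{sources}\n\nQuestion: {question}"},
    ]
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return resp.choices[0].message.content

# e.g. index = build_index(open("cs50_syllabus.txt").read())
#      print(answer(index, "What will CS50 teach me?"))
```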
500 00:23:02,950 --> 00:23:06,250 So the result-- and again, it could be lying, it could be making things up, 501 00:23:06,250 --> 00:23:09,280 it could be hallucinating-- is "CS50 will teach students how to think 502 00:23:09,280 --> 00:23:10,840 algorithmically and solve problems efficiently, 503 00:23:10,840 --> 00:23:13,210 focusing on topics such as abstraction," da-da-da-da-da. 504 00:23:13,210 --> 00:23:16,390 And then it returns the source document from which it was found. 505 00:23:16,390 --> 00:23:19,150 So this is another big category of which there 506 00:23:19,150 --> 00:23:23,230 are tons of potential applications because you 507 00:23:23,230 --> 00:23:25,110 can repeat for each context. 508 00:23:25,110 --> 00:23:27,790 You can create arbitrarily many of these once it's software, 509 00:23:27,790 --> 00:23:31,610 because once it's software, you can just repeat it over and over again. 510 00:23:31,610 --> 00:23:35,980 So for your dorm, for your club, for your Slack, for your Telegram. 511 00:23:35,980 --> 00:23:38,950 You can start to begin putting pieces of information 512 00:23:38,950 --> 00:23:40,610 in and then responding to it. 513 00:23:40,610 --> 00:23:42,110 And it doesn't have to be documents. 514 00:23:42,110 --> 00:23:46,140 You can also load it straight into the prompt. 515 00:23:46,140 --> 00:23:47,680 I think I have it pulled up here. 516 00:23:47,680 --> 00:23:50,150 And if I don't, I'll just skip it. 517 00:23:50,150 --> 00:23:52,070 Oh, here we go. 518 00:23:52,070 --> 00:23:55,430 One other way you can do question answering, 519 00:23:55,430 --> 00:23:57,650 because I think it's healthy to always encourage 520 00:23:57,650 --> 00:24:00,470 the simplest possible approach to something, 521 00:24:00,470 --> 00:24:03,000 you don't need to engineer this giant system. 522 00:24:03,000 --> 00:24:04,250 It's great to have a database. 523 00:24:04,250 --> 00:24:05,300 It's great to use embeddings. 524 00:24:05,300 --> 00:24:06,560 It's great to use this big approach. 525 00:24:06,560 --> 00:24:07,060 It's fancy. 526 00:24:07,060 --> 00:24:07,730 It scales. 527 00:24:07,730 --> 00:24:09,410 You can do a lot of things. 528 00:24:09,410 --> 00:24:13,790 But you can also get away with a lot by just pushing it all into a prompt. 529 00:24:13,790 --> 00:24:17,678 And as an engineer, [INAUDIBLE],, one of our teammates here always says, 530 00:24:17,678 --> 00:24:19,220 "engineers should aspire to be lazy." 531 00:24:19,220 --> 00:24:20,825 And I couldn't agree more. 532 00:24:20,825 --> 00:24:23,660 You, as an engineer, should want to set yourself 533 00:24:23,660 --> 00:24:27,260 up so that you can pursue the lazy path to something. 534 00:24:27,260 --> 00:24:31,400 So here's how you might do the equivalent of a question answering 535 00:24:31,400 --> 00:24:32,570 system with a prompt alone. 536 00:24:32,570 --> 00:24:35,060 Let's say you have 30 friends. 537 00:24:35,060 --> 00:24:37,070 And each friend is good at a particular thing. 538 00:24:37,070 --> 00:24:40,400 Or you can-- this is isomorphic to many other problems. 539 00:24:40,400 --> 00:24:42,920 You can simply just say, "hey, I know certain things. 540 00:24:42,920 --> 00:24:44,660 Here's the things I know. 541 00:24:44,660 --> 00:24:47,330 A user is going to ask me something. 542 00:24:47,330 --> 00:24:48,980 How should we respond?" 543 00:24:48,980 --> 00:24:50,690 And then you load that into an agent. 544 00:24:50,690 --> 00:24:53,360 That agent has access to GPT. 
545 00:24:53,360 --> 00:24:54,650 You can ship-deploy it. 546 00:24:54,650 --> 00:24:57,650 And now you've got a bot that you can connect to Telegram, 547 00:24:57,650 --> 00:24:59,150 you can connect to Slack. 548 00:24:59,150 --> 00:25:01,565 And that bot, now, it won't always give you 549 00:25:01,565 --> 00:25:03,440 the right answer, because at a certain level, 550 00:25:03,440 --> 00:25:06,470 we can't control the variance of the model underneath, 551 00:25:06,470 --> 00:25:10,220 but it will tend to answer with respect to this list. 552 00:25:10,220 --> 00:25:13,490 And the degree to which it tends is, to a certain extent, something 553 00:25:13,490 --> 00:25:16,400 that both industry is working on to just give everybody 554 00:25:16,400 --> 00:25:19,700 as a capacity, but also you, doing prompt engineering, 555 00:25:19,700 --> 00:25:23,360 to tighten up the error bars on it. 556 00:25:23,360 --> 00:25:26,910 557 00:25:26,910 --> 00:25:29,120 So I'll show you just a few more examples. 558 00:25:29,120 --> 00:25:31,880 And then in about eight minutes, I'll turn it over to questions, 559 00:25:31,880 --> 00:25:34,380 because I'm sure you've got a lot about how to build things. 560 00:25:34,380 --> 00:25:36,830 So just to give you a sense of where we are. 561 00:25:36,830 --> 00:25:40,990 562 00:25:40,990 --> 00:25:44,460 This is one, I don't have a demo for you, but if you were to come to me 563 00:25:44,460 --> 00:25:48,060 and you were to say, "Ted, I want a weekend hustle, man. 564 00:25:48,060 --> 00:25:49,410 What should I build?" 565 00:25:49,410 --> 00:25:50,970 Holy moly. 566 00:25:50,970 --> 00:25:55,050 There are a set of applications that I would describe as utility functions. 567 00:25:55,050 --> 00:25:57,450 I don't like that name because it doesn't sound exciting, 568 00:25:57,450 --> 00:25:59,010 and this is really exciting. 569 00:25:59,010 --> 00:26:02,520 And it's low-hanging fruits that automate tasks that 570 00:26:02,520 --> 00:26:04,230 require basic language understanding. 571 00:26:04,230 --> 00:26:08,010 So examples for this are generate a unit test. 572 00:26:08,010 --> 00:26:10,632 I don't know how many of you have ever been writing tests 573 00:26:10,632 --> 00:26:12,090 and you're just like, "oh, come on. 574 00:26:12,090 --> 00:26:13,260 I can get through this. 575 00:26:13,260 --> 00:26:13,920 I can get through this." 576 00:26:13,920 --> 00:26:16,740 If you're a person who likes writing tests, you're a lucky individual. 577 00:26:16,740 --> 00:26:19,490 Looking up the documentation for a function, rewriting a function, 578 00:26:19,490 --> 00:26:23,460 making something conform to your company guidelines, doing a brand check, 579 00:26:23,460 --> 00:26:25,860 all of these things are things that are kind 580 00:26:25,860 --> 00:26:31,770 of relatively context-free operations, or scoped-context operations 581 00:26:31,770 --> 00:26:36,170 on a piece of information that requires linguistic understanding. 582 00:26:36,170 --> 00:26:39,530 And really, you can think of them as something 583 00:26:39,530 --> 00:26:42,320 that is now available to you as a software builder, 584 00:26:42,320 --> 00:26:45,980 as a weekend project builder, as a startup builder. 585 00:26:45,980 --> 00:26:48,590 And you just have to build the interface around it 586 00:26:48,590 --> 00:26:51,890 and present it to other people in a context in which it's 587 00:26:51,890 --> 00:26:54,150 meaningful for them to consume. 
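[Editor's aside: as one concrete instance of these utility functions, here is a hedged sketch of a "write me a unit test" helper: a single prompt wrapped around a function's source code, again assuming the pre-1.0 `openai` client. The prompt wording and the sample function are illustrative only, and the generated test is a draft to review, not to trust blindly.]

```python
import inspect
import os
import openai  # pre-1.0 client, as in the earlier sketches

openai.api_key = os.environ["OPENAI_API_KEY"]

def generate_unit_test(func):
    """Ask the model to draft a pytest-style test for a Python function."""
    source = inspect.getsource(func)
    messages = [
        {"role": "system",
         "content": "You write concise pytest unit tests for Python functions."},
        {"role": "user",
         "content": f"Write a unit test for this function:\n\n{source}"},
    ]
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return resp.choices[0].message.content

def slugify(title):
    return "-".join(title.lower().split())

print(generate_unit_test(slugify))
```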
588 00:26:54,150 --> 00:26:57,300 And so the space of this is extraordinary. 589 00:26:57,300 --> 00:27:00,560 I mean, it's the space of all human endeavor, now with this new tool, 590 00:27:00,560 --> 00:27:02,150 I think is the way to think about it. 591 00:27:02,150 --> 00:27:04,910 People often joke about how, when you're building a company, when you're 592 00:27:04,910 --> 00:27:07,220 building a project, you don't want to start with the hammer 593 00:27:07,220 --> 00:27:09,450 because you want to start with the problem instead. 594 00:27:09,450 --> 00:27:12,170 And it's generally true, but my God, we've 595 00:27:12,170 --> 00:27:13,640 just got a really cool new hammer. 596 00:27:13,640 --> 00:27:16,518 And to a certain extent, I would encourage you to at least casually, 597 00:27:16,518 --> 00:27:18,560 on the weekends, run around and hit stuff with it 598 00:27:18,560 --> 00:27:21,320 and see what can happen from a builder's, from a tinkerer's, 599 00:27:21,320 --> 00:27:23,735 from an experimentalist's point of view. 600 00:27:23,735 --> 00:27:27,790 601 00:27:27,790 --> 00:27:30,970 And then the final one is creativity. 602 00:27:30,970 --> 00:27:32,740 This is another huge mega app. 603 00:27:32,740 --> 00:27:36,040 Now, I primarily live in the text world, and so I'm 604 00:27:36,040 --> 00:27:37,840 going to talk about text-based things. 605 00:27:37,840 --> 00:27:42,700 I think, so far, this has mostly been growing in the imagery world 606 00:27:42,700 --> 00:27:45,610 because we're such visual creatures and the images you can generate 607 00:27:45,610 --> 00:27:48,120 are just staggering with AI. 608 00:27:48,120 --> 00:27:49,870 It certainly brings up a lot of questions, 609 00:27:49,870 --> 00:27:53,050 too, around IP and artistic style. 610 00:27:53,050 --> 00:27:57,548 But the template for this, if you're a builder, that we're seeing in the wild 611 00:27:57,548 --> 00:27:58,840 is approximately the following. 612 00:27:58,840 --> 00:28:01,570 And the thing I want to point out is domain knowledge here. 613 00:28:01,570 --> 00:28:03,278 This is really the purpose of this slide, 614 00:28:03,278 --> 00:28:06,740 is to touch on the importance of the domain knowledge. 615 00:28:06,740 --> 00:28:12,960 So many people approximately find the creative process as follows. 616 00:28:12,960 --> 00:28:15,580 Come up with a big idea. 617 00:28:15,580 --> 00:28:18,370 Overgenerate possibilities. 618 00:28:18,370 --> 00:28:21,100 Edit down what you overgenerated. 619 00:28:21,100 --> 00:28:22,210 Repeat. 620 00:28:22,210 --> 00:28:22,870 Right? 621 00:28:22,870 --> 00:28:26,495 Anybody who's been a writer knows, when you write, you write way too much, 622 00:28:26,495 --> 00:28:28,120 and then you have to delete lots of it. 623 00:28:28,120 --> 00:28:30,120 And then you revise, and you write way too much, 624 00:28:30,120 --> 00:28:31,580 and you have to delete lots of it. 625 00:28:31,580 --> 00:28:34,540 This particular task is fantastic for AI. 626 00:28:34,540 --> 00:28:36,868 One of the reasons it's fantastic for AI is 627 00:28:36,868 --> 00:28:38,410 because it allows the AI to be wrong. 628 00:28:38,410 --> 00:28:40,730 You know, you've preagreed you're going to delete lots of it. 
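[Editor's aside: a minimal sketch of that overgenerate-then-edit loop, assuming the pre-1.0 `openai` client. The `n` parameter asks for several candidate completions in one call; the example task and temperature are made up, and the human stays in charge of deciding which candidates survive.]

```python
import os
import openai  # pre-1.0 client

openai.api_key = os.environ["OPENAI_API_KEY"]

def overgenerate(big_idea, n=5):
    """Generate n candidate headlines; a human edits the list down afterwards."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Write one advertising headline for: {big_idea}"}],
        n=n,              # ask for several independent completions
        temperature=1.0,  # more randomness -> more varied candidates
    )
    return [choice.message.content for choice in resp.choices]

for i, candidate in enumerate(overgenerate("a flashcard app for learning chengyu"), 1):
    print(f"{i}. {candidate}")   # the editing-down step is yours
```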
629 00:28:40,730 --> 00:28:42,730 And so if you preagree, "hey, I'm just going 630 00:28:42,730 --> 00:28:46,270 to generate five possibilities of the story I might tell, 631 00:28:46,270 --> 00:28:48,430 five possibilities of the advertising headline, 632 00:28:48,430 --> 00:28:52,512 five possibilities of what I might write my thesis on," 633 00:28:52,512 --> 00:28:54,970 you preagreed it's OK if it's a little long because you are 634 00:28:54,970 --> 00:28:56,860 going to be the editor that steps in. 635 00:28:56,860 --> 00:28:59,890 And here's the thing that you really should bring to the table, 636 00:28:59,890 --> 00:29:02,020 is don't think about this as a technical activity. 637 00:29:02,020 --> 00:29:06,700 Think about this as your opportunity not to put GPT in charge. 638 00:29:06,700 --> 00:29:09,910 Instead, for you to grasp the steering wheel tighter-- 639 00:29:09,910 --> 00:29:11,290 I think, at least-- 640 00:29:11,290 --> 00:29:14,350 in Python or the language you're using to program 641 00:29:14,350 --> 00:29:18,962 because you have the domain knowledge to wield GPT in the generation of those. 642 00:29:18,962 --> 00:29:21,170 So let me show you an example of what I mean by that. 643 00:29:21,170 --> 00:29:27,760 So this is a cool app that someone created for the Writing Atlas Project. 644 00:29:27,760 --> 00:29:31,800 So Writing Atlas is a set of short stories. 645 00:29:31,800 --> 00:29:35,160 And you can think of it as Good Reads for short stories. 646 00:29:35,160 --> 00:29:37,843 So you can go in here, you can browse different stories. 647 00:29:37,843 --> 00:29:40,260 And this was something somebody created where you can type 648 00:29:40,260 --> 00:29:42,252 in a story description that you like. 649 00:29:42,252 --> 00:29:44,460 And this is going to take about a minute to generate, 650 00:29:44,460 --> 00:29:46,252 so I'm going to talk while it's generating. 651 00:29:46,252 --> 00:29:51,253 And while it's working, what it's doing-- 652 00:29:51,253 --> 00:29:52,920 and I'll show you the code in a second-- 653 00:29:52,920 --> 00:29:56,273 is it's searching through the collection of stories for similar stories. 654 00:29:56,273 --> 00:29:58,440 And here's where the domain knowledge part comes in. 655 00:29:58,440 --> 00:30:02,370 Then it uses GPT to look at what it was that you wanted 656 00:30:02,370 --> 00:30:06,120 and use knowledge of how an editor, how a bookseller thinks 657 00:30:06,120 --> 00:30:09,060 to generate a set of suggestions specifically 658 00:30:09,060 --> 00:30:12,240 through the lens of that perspective, with the goal of writing 659 00:30:12,240 --> 00:30:16,200 that beautiful handwritten note that we sometimes see in a local bookstore 660 00:30:16,200 --> 00:30:19,590 tacked on underneath a book. 661 00:30:19,590 --> 00:30:21,940 And so it doesn't just say, "hey, you might like this, 662 00:30:21,940 --> 00:30:25,090 here's a general-purpose reason why you might like this," 663 00:30:25,090 --> 00:30:27,840 but specifically "here's why you might like this," 664 00:30:27,840 --> 00:30:29,550 with respect to what you gave it. 665 00:30:29,550 --> 00:30:32,670 It's either stalling out or it's taking a long time. 666 00:30:32,670 --> 00:30:34,600 Oh, there we go. 667 00:30:34,600 --> 00:30:36,750 So here's its suggestions. 668 00:30:36,750 --> 00:30:40,140 And in particular, these things, these are 669 00:30:40,140 --> 00:30:43,440 things that only a human could know, at least for now. 
670 00:30:43,440 --> 00:30:47,340 Two humans, specifically, the human who said they wanted to read a story-- 671 00:30:47,340 --> 00:30:49,620 that's the text that came in-- and then the human 672 00:30:49,620 --> 00:30:54,030 who added domain knowledge to script a sequence of interactions 673 00:30:54,030 --> 00:30:56,790 with the language model so that you could provide 674 00:30:56,790 --> 00:30:59,550 very targeted reasoning over something that 675 00:30:59,550 --> 00:31:01,510 was informed by that domain knowledge. 676 00:31:01,510 --> 00:31:05,460 So for these utility apps, bring your domain knowledge. 677 00:31:05,460 --> 00:31:09,590 678 00:31:09,590 --> 00:31:12,110 Let me actually show you how this looks in code 679 00:31:12,110 --> 00:31:15,740 because I think it's useful to see how simple and accessible this is. 680 00:31:15,740 --> 00:31:18,320 This is really a set of prompts. 681 00:31:18,320 --> 00:31:22,017 So why might they like a particular location? 682 00:31:22,017 --> 00:31:23,600 Well, here's the prompt that did that. 683 00:31:23,600 --> 00:31:25,780 This is an open source project. 684 00:31:25,780 --> 00:31:27,730 And it has a bunch of examples, and then it 685 00:31:27,730 --> 00:31:31,050 says, well, here's the one that we're interested in. 686 00:31:31,050 --> 00:31:31,980 Here's the audience. 687 00:31:31,980 --> 00:31:35,030 Here's a couple of examples of why might people like a particular thing, 688 00:31:35,030 --> 00:31:36,013 in terms of audience. 689 00:31:36,013 --> 00:31:37,055 It's just another prompt. 690 00:31:37,055 --> 00:31:42,240 691 00:31:42,240 --> 00:31:43,380 Same for topic. 692 00:31:43,380 --> 00:31:44,620 Same for explanation. 693 00:31:44,620 --> 00:31:50,770 And if you go down here and look at how it was done, suggesting the story is-- 694 00:31:50,770 --> 00:31:53,920 what is this, line 174 to line 203-- 695 00:31:53,920 --> 00:31:56,370 it really is-- and again, over and over again, 696 00:31:56,370 --> 00:31:59,120 I want to impress upon you-- this really is within reach. 697 00:31:59,120 --> 00:32:04,360 It's really just, what, 20 odd lines of step one, search 698 00:32:04,360 --> 00:32:06,430 in the database for similar stories. 699 00:32:06,430 --> 00:32:11,020 Step two, given that I have similar stories, pull out the data. 700 00:32:11,020 --> 00:32:16,780 Step three, with my domain knowledge in Python, now run these prompts. 701 00:32:16,780 --> 00:32:19,100 Step four, prepare that into an output. 702 00:32:19,100 --> 00:32:24,550 So the thing we're scripting itself is some approximation of human cognition, 703 00:32:24,550 --> 00:32:26,530 if you're willing to go there metaphorically. 704 00:32:26,530 --> 00:32:28,690 We're not sure-- I'm not going to weigh in 705 00:32:28,690 --> 00:32:36,100 on where we are in the is OpenAI a life form argument. 706 00:32:36,100 --> 00:32:36,970 All right. 707 00:32:36,970 --> 00:32:40,825 One kind of really far out there thing, and then I'll 708 00:32:40,825 --> 00:32:43,450 tie it up for questions, because I know there's probably a lot. 709 00:32:43,450 --> 00:32:47,410 And I also want to make sure you get great pizza in your bellies. 710 00:32:47,410 --> 00:32:52,383 And that is Baby AGI, AutoGPT is what you might 711 00:32:52,383 --> 00:32:53,800 have heard them called on Twitter. 712 00:32:53,800 --> 00:32:56,050 I think of them as multi-step planning bots. 
713 00:32:56,050 --> 00:33:00,940 So everything I showed you so far was approximately one-shot interactions 714 00:33:00,940 --> 00:33:02,510 with GPT. 715 00:33:02,510 --> 00:33:05,170 So this is the user says they want something, 716 00:33:05,170 --> 00:33:10,330 and then either Python mediates interactions with GPT or GPT 717 00:33:10,330 --> 00:33:13,600 itself does some things with the inflection of a personality 718 00:33:13,600 --> 00:33:16,240 that you've added from some prompt engineering. 719 00:33:16,240 --> 00:33:17,830 Really useful. 720 00:33:17,830 --> 00:33:19,150 Pretty easy to control. 721 00:33:19,150 --> 00:33:22,150 If you want to go to production, if you want to build a weekend project, 722 00:33:22,150 --> 00:33:26,370 if you want to build a company, that's a great way to do it right now. 723 00:33:26,370 --> 00:33:27,993 This is wild. 724 00:33:27,993 --> 00:33:29,910 And if you haven't seen this stuff on Twitter, 725 00:33:29,910 --> 00:33:32,077 I would definitely recommend going to search for it. 726 00:33:32,077 --> 00:33:33,720 This is what happens-- 727 00:33:33,720 --> 00:33:37,740 the simple way to put it is-- if you put GPT in a for loop. 728 00:33:37,740 --> 00:33:42,040 If you let GPT talk to itself and then tell itself what to do. 729 00:33:42,040 --> 00:33:46,530 So it's an emergent behavior. 730 00:33:46,530 --> 00:33:49,660 And like all emergent behaviors, it starts with a few simple steps. 731 00:33:49,660 --> 00:33:53,490 In Conway's Game of Life, many elements of reality 732 00:33:53,490 --> 00:33:56,340 turn out to be math equations that fit on a t-shirt, 733 00:33:56,340 --> 00:33:59,280 but then when you play them forward in time, they generate DNA. 734 00:33:59,280 --> 00:34:00,910 They generate human life. 735 00:34:00,910 --> 00:34:07,180 So this is approximately, step one, take a human objective. 736 00:34:07,180 --> 00:34:11,199 Step two, your first task is to write yourself a list of steps. 737 00:34:11,199 --> 00:34:12,639 And here's the critical part-- 738 00:34:12,639 --> 00:34:14,159 repeat. 739 00:34:14,159 --> 00:34:16,000 Now do the list of steps. 740 00:34:16,000 --> 00:34:20,080 Now, you have to embody your agent with the ability to do things. 741 00:34:20,080 --> 00:34:22,949 So it's really only limited to do what you give it the tools to do 742 00:34:22,949 --> 00:34:24,540 and what it has the skills to do. 743 00:34:24,540 --> 00:34:27,570 So obviously, this is still very much a set 744 00:34:27,570 --> 00:34:29,505 of experiments that are running right now. 745 00:34:29,505 --> 00:34:32,130 But it's something that we'll see unfold over the coming years. 746 00:34:32,130 --> 00:34:34,170 And this is the scenario in which Python stops 747 00:34:34,170 --> 00:34:37,139 becoming so important because we've given it the ability to actually 748 00:34:37,139 --> 00:34:39,210 self-direct what it's doing. 749 00:34:39,210 --> 00:34:41,094 And then it finally gives you a result. 750 00:34:41,094 --> 00:34:44,219 And I want to give you an example still of just, again, impressing upon you 751 00:34:44,219 --> 00:34:48,090 how much of this is prompt engineering, which is wild, how little code this is. 752 00:34:48,090 --> 00:34:53,670 Let me show you what Baby AGI looks like. 753 00:34:53,670 --> 00:34:57,560 So here is a Baby AGI that you can connect to Telegram. 754 00:34:57,560 --> 00:35:01,150 755 00:35:01,150 --> 00:35:03,512 And this is an agent that has two tools. 
756 00:35:03,512 --> 00:35:05,470 So I haven't explained to you what an agent is. 757 00:35:05,470 --> 00:35:07,150 I haven't explained to you what tools are. 758 00:35:07,150 --> 00:35:09,150 I'll give you a quick, one-sentence description. 759 00:35:09,150 --> 00:35:14,770 An agent is just a word to mean GPT plus some bigger body in which it's living. 760 00:35:14,770 --> 00:35:16,240 Maybe that body has a personality. 761 00:35:16,240 --> 00:35:17,080 Maybe it has tools. 762 00:35:17,080 --> 00:35:19,990 Maybe it has Python mediating its experience with other things. 763 00:35:19,990 --> 00:35:23,980 Tools are simply ways in which the agent can choose to do things. 764 00:35:23,980 --> 00:35:26,440 Like, imagine if GPT could say, "order a pizza." 765 00:35:26,440 --> 00:35:28,780 And instead of you seeing the text "order a pizza," 766 00:35:28,780 --> 00:35:30,250 that caused a pizza to be ordered. 767 00:35:30,250 --> 00:35:32,080 That's a tool. 768 00:35:32,080 --> 00:35:33,330 So these are two tools it has. 769 00:35:33,330 --> 00:35:35,190 One tool is generate a to-do list. 770 00:35:35,190 --> 00:35:38,025 One tool is do a search on the web. 771 00:35:38,025 --> 00:35:42,240 772 00:35:42,240 --> 00:35:46,570 And then down here, it has a prompt saying, "hey, 773 00:35:46,570 --> 00:35:50,010 your goal is to build a task list and then do that task list." 774 00:35:50,010 --> 00:35:53,300 And then this is just placed into a harness that does it over and over 775 00:35:53,300 --> 00:35:53,800 again. 776 00:35:53,800 --> 00:35:56,940 So after the next task, kind of on cue, the results of that task. 777 00:35:56,940 --> 00:35:58,810 And keep it going. 778 00:35:58,810 --> 00:36:01,830 And so in doing that, you get this kickstarted loop, 779 00:36:01,830 --> 00:36:05,790 where, essentially, you kickstart it and then the agent is talking to itself. 780 00:36:05,790 --> 00:36:07,270 Talking to itself. 781 00:36:07,270 --> 00:36:10,948 So this, unless I'm wrong, I don't think this has yet reached production, 782 00:36:10,948 --> 00:36:12,990 in terms of what we're seeing in the field of how 783 00:36:12,990 --> 00:36:14,440 people are deploying software. 784 00:36:14,440 --> 00:36:17,885 But if you want to dive into the wildest part of experimentation, 785 00:36:17,885 --> 00:36:20,010 this is definitely one of the places you can start. 786 00:36:20,010 --> 00:36:21,810 And it's really within reach. 787 00:36:21,810 --> 00:36:25,462 All you have to do is download one of the starter projects for it. 788 00:36:25,462 --> 00:36:27,420 And you can kind of see right in the prompting, 789 00:36:27,420 --> 00:36:31,180 here's how you kickstart that process of iteration. 790 00:36:31,180 --> 00:36:37,880 791 00:36:37,880 --> 00:36:40,310 All right, so I know that was super high-level. 792 00:36:40,310 --> 00:36:41,557 I hope it was useful. 793 00:36:41,557 --> 00:36:44,390 It's, I think, from the field, from the bottoms up what we're seeing 794 00:36:44,390 --> 00:36:48,740 and what people are building, kind of these high-level categories of apps 795 00:36:48,740 --> 00:36:51,050 that people are making, all of these apps 796 00:36:51,050 --> 00:36:52,980 are apps that are within reach to everybody, 797 00:36:52,980 --> 00:36:54,770 which is really, really exciting. 798 00:36:54,770 --> 00:36:59,750 And I suggest Twitter as a great place to hang out and build things. 799 00:36:59,750 --> 00:37:02,700 There's a lot of AI builders on Twitter publishing. 
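As a rough companion to the multi-step planning bots just described, here is a minimal Python sketch of the "GPT in a for loop" pattern: take a human objective, have the model write itself a task list, then repeatedly execute the tasks with a small set of tools. It is not the actual Baby AGI or AutoGPT code; the prompts are simplified, the web-search tool is a stub, and the pre-1.0 openai client call is an assumption.

```python
# Illustrative sketch only -- not the actual Baby AGI / AutoGPT code.
# Assumes the pre-1.0 openai Python client; the web-search tool is a stub.
import openai

def complete(prompt: str) -> str:
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

# Tool 1: have the model write itself a to-do list.
def make_todo_list(objective: str) -> list:
    plan = complete(
        f"Objective: {objective}\n"
        "Write a short numbered list of concrete steps to achieve it, one per line."
    )
    return [line.strip() for line in plan.splitlines() if line.strip()]

# Tool 2: a web search (stubbed out here; plug in a real search API).
def web_search(query: str) -> str:
    return f"(stub) top results for: {query}"

def run_agent(objective: str, max_steps: int = 5) -> str:
    results = []
    tasks = make_todo_list(objective)          # step 1: plan
    for task in tasks[:max_steps]:             # step 2: repeat -- do the steps
        action = complete(
            f"Objective: {objective}\nCurrent task: {task}\n"
            "If a web search would help, reply exactly 'SEARCH: <query>'. "
            "Otherwise, do the task and reply with the result."
        )
        if action.startswith("SEARCH:"):
            results.append(web_search(action[len("SEARCH:"):].strip()))
        else:
            results.append(action)
    return "\n".join(results)                  # finally, report back
```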
800 00:37:02,700 --> 00:37:05,810 And I think we've got a couple of minutes before pizza is arriving. 801 00:37:05,810 --> 00:37:06,660 Maybe 10 minutes. 802 00:37:06,660 --> 00:37:07,380 Keep on going? 803 00:37:07,380 --> 00:37:07,880 Oh. 804 00:37:07,880 --> 00:37:11,210 So if there's any questions, why don't we kick it to that, because I'm sure 805 00:37:11,210 --> 00:37:13,740 there's some questions that you all have. 806 00:37:13,740 --> 00:37:14,970 I guess we ended a little early. 807 00:37:14,970 --> 00:37:15,830 Yes. 808 00:37:15,830 --> 00:37:18,740 AUDIENCE: Yeah, so I have a question about hallucinations. 809 00:37:18,740 --> 00:37:22,800 And so when you're building these sorts of applications in the apps, 810 00:37:22,800 --> 00:37:24,420 for example, let's say-- 811 00:37:24,420 --> 00:37:27,200 I'm giving you, like, a physics problem, from a [? pset, ?] 812 00:37:27,200 --> 00:37:28,380 and we want to do that. 813 00:37:28,380 --> 00:37:29,180 TED BENSON: Yeah. 814 00:37:29,180 --> 00:37:32,702 AUDIENCE: And it's, 40% of the time, just wrong. 815 00:37:32,702 --> 00:37:33,410 TED BENSON: Yeah. 816 00:37:33,410 --> 00:37:35,600 AUDIENCE: Do you have any actionable recommendations 817 00:37:35,600 --> 00:37:38,750 that developers should be doing to make it hallucinate any less? 818 00:37:38,750 --> 00:37:41,870 Or maybe even things that OpenAI on the back end 819 00:37:41,870 --> 00:37:43,850 should be doing to reduce hallucinations? 820 00:37:43,850 --> 00:37:47,670 Would it be something where you use RLHF? 821 00:37:47,670 --> 00:37:49,170 Any thoughts there? 822 00:37:49,170 --> 00:37:52,310 TED BENSON: So the question was how-- approximately, how do you 823 00:37:52,310 --> 00:37:53,810 manage the hallucination problem? 824 00:37:53,810 --> 00:37:58,830 If you give it a physics lecture, and you ask it a question, on the one hand, 825 00:37:58,830 --> 00:38:00,680 it appears to be answering you correctly. 826 00:38:00,680 --> 00:38:05,060 On the other hand, it appears to be wrong to an expert's eye 40% 827 00:38:05,060 --> 00:38:07,280 of the time, 70% of the time, 10% of the time. 828 00:38:07,280 --> 00:38:08,490 It's a huge problem. 829 00:38:08,490 --> 00:38:10,580 And then what are some ways as developers, 830 00:38:10,580 --> 00:38:13,160 practically, you can use to mitigate that? 831 00:38:13,160 --> 00:38:14,040 I'll give an answer. 832 00:38:14,040 --> 00:38:15,832 Sil, you may have some specific things too. 833 00:38:15,832 --> 00:38:17,582 So one high-level answer is the same thing 834 00:38:17,582 --> 00:38:19,540 that makes these things capable of synthesizing 835 00:38:19,540 --> 00:38:22,130 information is part of the reason why it hallucinates for you. 836 00:38:22,130 --> 00:38:25,200 So it's hard to have your cake and eat it too, to a certain extent. 837 00:38:25,200 --> 00:38:26,700 So this is part of the game. 838 00:38:26,700 --> 00:38:28,340 In fact, humans do it too. 839 00:38:28,340 --> 00:38:32,942 People talk about just folks who are too aggressive in their assumptions 840 00:38:32,942 --> 00:38:33,650 about knowledge-- 841 00:38:33,650 --> 00:38:35,630 I can't remember the name for that phenomenon-- 842 00:38:35,630 --> 00:38:36,797 where you'll just say stuff. 843 00:38:36,797 --> 00:38:38,480 So we do it too.
844 00:38:38,480 --> 00:38:40,762 Some things you can do are-- 845 00:38:40,762 --> 00:38:43,220 kind of a range of activities-- depending on how much money 846 00:38:43,220 --> 00:38:45,845 you're willing to spend, how much technical expertise you have, 847 00:38:45,845 --> 00:38:48,620 it can range from fine-tuning a model, to practically-- 848 00:38:48,620 --> 00:38:51,710 I'm in the applied world, so I'm very much in a world of duct tape 849 00:38:51,710 --> 00:38:53,132 and how developers get stuff done. 850 00:38:53,132 --> 00:38:54,090 So some of the answers I'll give you 851 00:38:54,090 --> 00:38:56,173 are sort of very duct-tapey answers. 852 00:38:56,173 --> 00:38:59,060 Giving it examples tends to work for acute things. 853 00:38:59,060 --> 00:39:03,090 If it's behaving in wild ways, the more examples you give it, the better. 854 00:39:03,090 --> 00:39:05,635 That's not going to solve the domain of all of physics. 855 00:39:05,635 --> 00:39:07,760 So for the domain of all of physics, I'm going to-- 856 00:39:07,760 --> 00:39:09,140 I'm going to bail and give it to you because I 857 00:39:09,140 --> 00:39:11,515 think you are far more equipped than me to speak on that. 858 00:39:11,515 --> 00:39:14,730 SIL HAMILTON: Sure, so the model doesn't have a ground truth. 859 00:39:14,730 --> 00:39:16,010 It doesn't know anything. 860 00:39:16,010 --> 00:39:19,010 Any sense of meaning that is derived from the training process 861 00:39:19,010 --> 00:39:21,920 is purely out of differentiation. 862 00:39:21,920 --> 00:39:23,430 One word is not another word. 863 00:39:23,430 --> 00:39:27,905 Words are not used in the same contexts. It understands everything 864 00:39:27,905 --> 00:39:29,780 only through examples given through language. 865 00:39:29,780 --> 00:39:32,570 It's like someone who learned English or how to speak, 866 00:39:32,570 --> 00:39:34,670 but they grew up in a featureless, gray room. 867 00:39:34,670 --> 00:39:36,360 They've never seen the outside world. 868 00:39:36,360 --> 00:39:39,152 They have nothing to rest on that tells them that something is true 869 00:39:39,152 --> 00:39:40,800 and something is not true. 870 00:39:40,800 --> 00:39:43,850 So from the model's perspective, everything that it says is true. 871 00:39:43,850 --> 00:39:46,460 It's trying its best to give you the best answer possible. 872 00:39:46,460 --> 00:39:50,810 And if lying a little bit or conflating two different topics 873 00:39:50,810 --> 00:39:53,587 is the best way to achieve that, then it will decide to do so. 874 00:39:53,587 --> 00:39:54,920 It's a part of the architecture. 875 00:39:54,920 --> 00:39:56,120 We can't get around it. 876 00:39:56,120 --> 00:39:59,690 There are a number of cheap tricks that surprisingly get 877 00:39:59,690 --> 00:40:01,868 it to confabulate or hallucinate less. 878 00:40:01,868 --> 00:40:04,910 One of them includes-- recently, there was a paper that's a little funny. 879 00:40:04,910 --> 00:40:09,800 If you get it to prepend to its answer, "My best guess is," 880 00:40:09,800 --> 00:40:14,400 that will actually reduce hallucinations by about 80%. 881 00:40:14,400 --> 00:40:16,730 So clearly, it has some sense that some things are true 882 00:40:16,730 --> 00:40:19,397 and other things are not, but we're not quite sure what that is.
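A minimal sketch of the two cheap prompt-level tricks just mentioned -- a few worked examples plus a "My best guess is" prefix -- follows. It assumes the pre-1.0 openai Python client; the example questions and exact wording are illustrative, and this reduces hallucinations rather than eliminating them.

```python
# Illustrative sketch only: two cheap mitigations -- a few worked examples
# and a "My best guess is" prefix. Assumes the pre-1.0 openai Python client.
import openai

def careful_answer(question: str) -> str:
    prompt = (
        "Answer the question. If you are unsure, say so.\n"
        "Q: What is the SI unit of force?\n"
        "A: My best guess is: the newton.\n"
        "Q: In a vacuum, does a heavier object fall faster?\n"
        "A: My best guess is: no, both fall at the same rate.\n"
        f"Q: {question}\n"
        "A: My best guess is:"
    )
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return "My best guess is: " + resp["choices"][0]["message"]["content"].strip()
```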
883 00:40:19,397 --> 00:40:22,070 To add on to what Ted was saying, a few cheap things you can do, 884 00:40:22,070 --> 00:40:26,060 include letting it Google, or Bing-- as in Bing Chat, what they're 885 00:40:26,060 --> 00:40:28,250 doing-- it cites this information. 886 00:40:28,250 --> 00:40:31,430 Asking it to make sure its own response is good. 887 00:40:31,430 --> 00:40:34,700 If you've ever had ChatGPT generate a program-- 888 00:40:34,700 --> 00:40:37,130 there's some kind of problem, and you ask ChatGPT-- 889 00:40:37,130 --> 00:40:38,390 I think there's a mistake. 890 00:40:38,390 --> 00:40:40,820 Often, it'll locate the mistake itself. 891 00:40:40,820 --> 00:40:43,478 Why it didn't produce the right answer at the very beginning, 892 00:40:43,478 --> 00:40:45,770 we're still not sure, but we're moving in the direction 893 00:40:45,770 --> 00:40:46,895 of reducing hallucinations. 894 00:40:46,895 --> 00:40:49,130 Now with respect to physics, you're going 895 00:40:49,130 --> 00:40:53,810 to have to give it an external database to rest on because internally 896 00:40:53,810 --> 00:40:56,870 for really, really domain-specific knowledge, 897 00:40:56,870 --> 00:41:02,330 it's not going to be as deterministic as one would like. 898 00:41:02,330 --> 00:41:04,130 These things work in continuous spaces. 899 00:41:04,130 --> 00:41:07,650 These things, they don't know what is wrong, what is true. 900 00:41:07,650 --> 00:41:10,310 And as a result, we have to give it tools. 901 00:41:10,310 --> 00:41:14,427 So everything that Ted demoed today is really 902 00:41:14,427 --> 00:41:17,260 striving at reducing hallucinations, actually, really, and giving it 903 00:41:17,260 --> 00:41:18,160 more abilities. 904 00:41:18,160 --> 00:41:20,415 I hope that answers your question. 905 00:41:20,415 --> 00:41:21,790 TED BENSON: One of the ways too-- 906 00:41:21,790 --> 00:41:22,960 I'm a simple guy. 907 00:41:22,960 --> 00:41:26,680 I tend to think that all of the world tends to be just a few things 908 00:41:26,680 --> 00:41:28,000 repeated over and over again. 909 00:41:28,000 --> 00:41:29,710 And we have human systems for this. 910 00:41:29,710 --> 00:41:33,370 In a team, like companies work-- or a team playing sport, 911 00:41:33,370 --> 00:41:36,020 and we're not right all the time, even when we aspire to be. 912 00:41:36,020 --> 00:41:38,980 And so we have systems that we've developed as humans 913 00:41:38,980 --> 00:41:41,050 to deal with things that may be wrong. 914 00:41:41,050 --> 00:41:43,810 So human number one proposes an answer. 915 00:41:43,810 --> 00:41:45,700 Human number two checks their work. 916 00:41:45,700 --> 00:41:48,430 Human number three provides the final sign off. 917 00:41:48,430 --> 00:41:49,360 This is really common. 918 00:41:49,360 --> 00:41:51,860 Anybody who's worked in a company has seen this in practice. 919 00:41:51,860 --> 00:41:55,210 The interesting thing about the state of software right now, 920 00:41:55,210 --> 00:41:57,520 we tend to be in this mode in which we're just 921 00:41:57,520 --> 00:41:59,860 talking to GPT as one entity. 
922 00:41:59,860 --> 00:42:03,470 But once we start thinking in terms of teams, so to speak, 923 00:42:03,470 --> 00:42:06,760 where each team member is its own agent with its own set of objectives 924 00:42:06,760 --> 00:42:09,430 and skills, I suspect we're going to start 925 00:42:09,430 --> 00:42:11,860 seeing a programming model in which the way to solve this 926 00:42:11,860 --> 00:42:17,380 might not necessarily be "make a single brain smarter," but instead be "draw 927 00:42:17,380 --> 00:42:20,770 upon the collective intelligence of multiple software agents, 928 00:42:20,770 --> 00:42:21,980 each playing a role." 929 00:42:21,980 --> 00:42:25,750 And I think that that would certainly follow the human pattern of how 930 00:42:25,750 --> 00:42:26,530 we deal with this. 931 00:42:26,530 --> 00:42:29,140 SIL HAMILTON: To give it an analogy, space shuttles, 932 00:42:29,140 --> 00:42:33,087 things that go into space, spacecraft, they have to be good. 933 00:42:33,087 --> 00:42:34,420 If they're not good, people die. 934 00:42:34,420 --> 00:42:38,000 They have no margin for error at all. 935 00:42:38,000 --> 00:42:40,480 And as a result, we overengineer those systems. 936 00:42:40,480 --> 00:42:42,640 Most spacecraft have three computers. 937 00:42:42,640 --> 00:42:46,690 And they all have to agree in unison on a particular step to go forward. 938 00:42:46,690 --> 00:42:49,910 If one does not agree, then they recalculate, they recalculate, 939 00:42:49,910 --> 00:42:51,910 they recalculate until they arrive at something. 940 00:42:51,910 --> 00:42:54,850 But the good thing is that hallucinations are generally not 941 00:42:54,850 --> 00:42:57,220 a systemic problem in terms of its knowledge. 942 00:42:57,220 --> 00:42:58,720 It's often a one-off. 943 00:42:58,720 --> 00:43:02,025 The model or something tripped it up, and it just produced a hallucination 944 00:43:02,025 --> 00:43:02,900 in that one instance. 945 00:43:02,900 --> 00:43:06,070 So if there's three models working in unison, just as Ted is saying, 946 00:43:06,070 --> 00:43:10,850 that will, generally speaking, improve your success. 947 00:43:10,850 --> 00:43:11,665 Yes, sir. 948 00:43:11,665 --> 00:43:14,290 AUDIENCE: A number of the examples you showed have assertions like 949 00:43:14,290 --> 00:43:16,450 "you are an engineer," "you are an AI," 950 00:43:16,450 --> 00:43:17,242 "you are a teacher." 951 00:43:17,242 --> 00:43:17,950 TED BENSON: Yeah. 952 00:43:17,950 --> 00:43:20,800 AUDIENCE: What's the mechanism by which that influences 953 00:43:20,800 --> 00:43:22,952 this computation of probabilities? 954 00:43:22,952 --> 00:43:23,660 TED BENSON: Sure. 955 00:43:23,660 --> 00:43:26,830 I'm going to give you what might be an unsatisfying answer, which 956 00:43:26,830 --> 00:43:28,060 is it tends to work. 957 00:43:28,060 --> 00:43:30,160 But I think we know why it tends to work. 958 00:43:30,160 --> 00:43:32,170 And again, it's because these language models 959 00:43:32,170 --> 00:43:34,100 approximate how we talk to each other. 960 00:43:34,100 --> 00:43:36,400 So if I were to say to you-- hey, help me out. 961 00:43:36,400 --> 00:43:38,865 I need you to mock interview me. 962 00:43:38,865 --> 00:43:40,990 That's a direct statement I can make that kicks you 963 00:43:40,990 --> 00:43:42,520 into a certain mode of interaction. 964 00:43:42,520 --> 00:43:44,830 Or if I say to you-- help me out. 965 00:43:44,830 --> 00:43:46,720 I'm trying to apologize to my wife. 966 00:43:46,720 --> 00:43:47,800 She's really mad at me.
967 00:43:47,800 --> 00:43:49,100 Can you role play with me? 968 00:43:49,100 --> 00:43:51,100 That kicks you into another mode of interaction. 969 00:43:51,100 --> 00:43:53,590 And so it's really just a shorthand that people 970 00:43:53,590 --> 00:43:55,990 have found to kick the agent in-- to kick 971 00:43:55,990 --> 00:43:58,600 the LLM into a certain mode of interaction that 972 00:43:58,600 --> 00:44:01,450 tends to work in the way that I, as a software developer, 973 00:44:01,450 --> 00:44:03,510 am hoping it would work. 974 00:44:03,510 --> 00:44:06,270 SIL HAMILTON: And to really quickly add on to that. 975 00:44:06,270 --> 00:44:08,958 Being in the digital humanities as I am, 976 00:44:08,958 --> 00:44:10,500 I like to think of it as a narrative. 977 00:44:10,500 --> 00:44:13,470 A narrative will have a few different characters talking to each other. 978 00:44:13,470 --> 00:44:15,150 Their roles are clearly defined. 979 00:44:15,150 --> 00:44:17,620 Two people are not the same. 980 00:44:17,620 --> 00:44:20,340 This interaction with GPT, it assumes a personality. 981 00:44:20,340 --> 00:44:21,960 It can simulate personalities. 982 00:44:21,960 --> 00:44:26,080 It, itself, is not conscious in any way, but it can certainly 983 00:44:26,080 --> 00:44:29,660 predict what a conscious being would react like in a particular situation. 984 00:44:29,660 --> 00:44:34,150 So when we're going "you are X," it is drawing up that personality 985 00:44:34,150 --> 00:44:35,920 and talking as though it is that person. 986 00:44:35,920 --> 00:44:38,260 Because it is like completing a transcript 987 00:44:38,260 --> 00:44:42,610 or completing a story in which that character is present, and interacting, 988 00:44:42,610 --> 00:44:44,530 and is active. 989 00:44:44,530 --> 00:44:45,473 So, yeah. 990 00:44:45,473 --> 00:44:48,390 TED BENSON: I think we got about five minutes until the pizza outside. 991 00:44:48,390 --> 00:44:49,080 SPEAKER 1: Eight minutes. 992 00:44:49,080 --> 00:44:50,163 TED BENSON: Eight minutes. 993 00:44:50,163 --> 00:44:53,080 994 00:44:53,080 --> 00:44:55,150 Yes, sir. 995 00:44:55,150 --> 00:44:59,680 AUDIENCE: So I'm not a CS person, but it's been fun playing with this. 996 00:44:59,680 --> 00:45:04,370 And I understand the word-by-word generation and the vibe. 997 00:45:04,370 --> 00:45:07,385 The feeling of it in the narrative. 998 00:45:07,385 --> 00:45:09,260 Some of my friends and I have tried giving it 999 00:45:09,260 --> 00:45:15,040 logic problems, like things from the LSAT, for example, and it doesn't work. 1000 00:45:15,040 --> 00:45:16,870 And I'm just wondering why that would be. 1001 00:45:16,870 --> 00:45:20,710 So it will generate answers that sound very plausible 1002 00:45:20,710 --> 00:45:24,340 rhetorically-- like given this condition, x, given this, it will be y-- 1003 00:45:24,340 --> 00:45:28,420 but it will often even contradict itself in its answers. 1004 00:45:28,420 --> 00:45:30,890 But it's almost never correct. 1005 00:45:30,890 --> 00:45:33,850 So I was wondering why that would be? 1006 00:45:33,850 --> 00:45:35,820 Like, it just can't reason? 1007 00:45:35,820 --> 00:45:37,820 It can't think? 1008 00:45:37,820 --> 00:45:41,448 And can you-- would we get to a place where it can, so to speak? 1009 00:45:41,448 --> 00:45:43,990 I mean not-- you know what I mean, I don't mean to think like 1010 00:45:43,990 --> 00:45:46,160 it's conscious, I mean have thoughts-- 1011 00:45:46,160 --> 00:45:46,660 [INAUDIBLE]?
1012 00:45:46,660 --> 00:45:47,440 TED BENSON: You want to react to that? 1013 00:45:47,440 --> 00:45:49,273 AUDIENCE: I don't know how else to say that. 1014 00:45:49,273 --> 00:45:50,500 TED BENSON: So GPT-4-- 1015 00:45:50,500 --> 00:45:55,300 when GPT-4 was released back in March, I think it was, it was passing the LSAT. 1016 00:45:55,300 --> 00:45:56,020 AUDIENCE: It was? 1017 00:45:56,020 --> 00:45:56,860 TED BENSON: It was, yeah. 1018 00:45:56,860 --> 00:45:57,310 AUDIENCE: [INAUDIBLE] 1019 00:45:57,310 --> 00:45:57,730 TED BENSON: Yes. 1020 00:45:57,730 --> 00:45:58,330 AUDIENCE: [INAUDIBLE] 1021 00:45:58,330 --> 00:46:00,370 TED BENSON: Yes, it just passed, as I understand it. 1022 00:46:00,370 --> 00:46:02,570 AUDIENCE: Maybe it's because we're not using [INAUDIBLE]. 1023 00:46:02,570 --> 00:46:04,300 TED BENSON: That's one of the weird things, is that-- 1024 00:46:04,300 --> 00:46:04,990 AUDIENCE: ChatGPT. 1025 00:46:04,990 --> 00:46:05,740 TED BENSON: Yeah-- 1026 00:46:05,740 --> 00:46:06,730 AUDIENCE: [INAUDIBLE] 1027 00:46:06,730 --> 00:46:09,938 TED BENSON: If you pay for ChatGPT, they give you access to the better model. 1028 00:46:09,938 --> 00:46:13,660 And one of the interesting things with it is prompting. 1029 00:46:13,660 --> 00:46:15,340 It's so finicky. 1030 00:46:15,340 --> 00:46:18,310 It's very sensitive to the way that you prompt. 1031 00:46:18,310 --> 00:46:21,910 Earlier on, when GPT-3 came out, some people were going, look, 1032 00:46:21,910 --> 00:46:25,210 it can pass literacy tests, or no, it can't pass literacy tests. 1033 00:46:25,210 --> 00:46:28,060 And then people who were pro- or anti-GPT would be like, 1034 00:46:28,060 --> 00:46:31,300 I modified the prompt a little bit, suddenly it can or suddenly it can't. 1035 00:46:31,300 --> 00:46:33,430 These things are not conscious. 1036 00:46:33,430 --> 00:46:35,950 Their ability to reason is like an alien's. 1037 00:46:35,950 --> 00:46:36,602 They're not us. 1038 00:46:36,602 --> 00:46:37,810 They don't think like people. 1039 00:46:37,810 --> 00:46:38,800 They're not human. 1040 00:46:38,800 --> 00:46:43,210 But they certainly are capable of passing some things empirically, which 1041 00:46:43,210 --> 00:46:46,540 demonstrates some sort of rationale or logic within the model. 1042 00:46:46,540 --> 00:46:49,750 But we're still slowly figuring out, like a prompt whisperer, 1043 00:46:49,750 --> 00:46:51,340 what exactly the right approach is. 1044 00:46:51,340 --> 00:46:56,170 1045 00:46:56,170 --> 00:47:01,900 AUDIENCE: Obviously, having GPT running and prompting it continuously 1046 00:47:01,900 --> 00:47:04,670 is very expensive in terms of GPU. 1047 00:47:04,670 --> 00:47:10,726 How do you see instances where it creates some sort of business value 1048 00:47:10,726 --> 00:47:12,470 in a startup or a company? 1049 00:47:12,470 --> 00:47:18,184 Was there a real added value having these little AI apps 1050 00:47:18,184 --> 00:47:22,540 in terms of [INAUDIBLE]? 1051 00:47:22,540 --> 00:47:24,960 TED BENSON: Yeah, we host companies on top of us, 1052 00:47:24,960 --> 00:47:27,400 where that's their primary product. 1053 00:47:27,400 --> 00:47:31,480 The value that it adds is, like any company-- 1054 00:47:31,480 --> 00:47:33,422 I mean, what is the Y Combinator motto-- 1055 00:47:33,422 --> 00:47:34,630 "Make something people want." 1056 00:47:34,630 --> 00:47:39,430 I mean, I wouldn't think of this as GPT inherently provides value for you 1057 00:47:39,430 --> 00:47:40,205 as a builder.
1058 00:47:40,205 --> 00:47:41,080 That's their product. 1059 00:47:41,080 --> 00:47:42,160 That's OpenAI's product. 1060 00:47:42,160 --> 00:47:44,920 You pay ChatGPT for prioritized access. 1061 00:47:44,920 --> 00:47:48,670 Where your product might be is how you take that and combine it 1062 00:47:48,670 --> 00:47:53,410 with your data, somebody else's data, some domain knowledge, some interface 1063 00:47:53,410 --> 00:47:56,740 that then helps apply it to something. 1064 00:47:56,740 --> 00:47:58,100 Two things are both true. 1065 00:47:58,100 --> 00:48:00,910 There are a lot of experiments going on right now, 1066 00:48:00,910 --> 00:48:05,500 both for fun and people trying to figure out where the economic value is. 1067 00:48:05,500 --> 00:48:08,440 But folks are also spinning up companies that are 100% supported 1068 00:48:08,440 --> 00:48:10,063 by applying this to data. 1069 00:48:10,063 --> 00:48:10,605 AUDIENCE: OK. 1070 00:48:10,605 --> 00:48:14,150 For a company that wouldn't have-- 1071 00:48:14,150 --> 00:48:19,450 wouldn't be AI-focused [INAUDIBLE], just using or developing 1072 00:48:19,450 --> 00:48:24,770 in-house apps that use GPT for productivity. 1073 00:48:24,770 --> 00:48:29,990 TED BENSON: I think that it is likely that today we call this GPT, 1074 00:48:29,990 --> 00:48:32,000 and today we call these LLMs, and tomorrow it 1075 00:48:32,000 --> 00:48:33,485 will just slide into the ether. 1076 00:48:33,485 --> 00:48:36,110 Imagine what the-- imagine what the progression is going to be. 1077 00:48:36,110 --> 00:48:39,250 Today, there's one of these that people are primarily playing with. 1078 00:48:39,250 --> 00:48:42,500 There's many of them that exist, but one that people are primarily building on top of. 1079 00:48:42,500 --> 00:48:45,440 Tomorrow, we can expect that there will be many of them. 1080 00:48:45,440 --> 00:48:48,440 And the day after that, we can expect they're going to be on our phones, 1081 00:48:48,440 --> 00:48:50,898 and they're not even going to be connected to the internet. 1082 00:48:50,898 --> 00:48:54,380 And for that reason, I think that-- like today, 1083 00:48:54,380 --> 00:48:57,620 we don't call our software microprocessor tools or microprocessor 1084 00:48:57,620 --> 00:48:59,960 apps, like the processor just exists-- 1085 00:48:59,960 --> 00:49:03,720 I think that one useful model, five years out, 1086 00:49:03,720 --> 00:49:07,730 10 years out is to-- even if it's only metaphorically true and not literally 1087 00:49:07,730 --> 00:49:08,300 true-- 1088 00:49:08,300 --> 00:49:12,110 I think it's useful to think of this as a second processor. 1089 00:49:12,110 --> 00:49:15,710 We had this before with floating-point co-processors and graphics 1090 00:49:15,710 --> 00:49:19,400 co-processors already, as recently as the '90s, where 1091 00:49:19,400 --> 00:49:22,640 it's useful to think of the trajectory of this as just another thing 1092 00:49:22,640 --> 00:49:25,670 that computers do, can do, and it will be incorporated 1093 00:49:25,670 --> 00:49:27,050 into absolutely everything. 1094 00:49:27,050 --> 00:49:31,310 SIL HAMILTON: Hence, the term foundation model, which also crops up. 1095 00:49:31,310 --> 00:49:33,890 So the pizza's ready? 1096 00:49:33,890 --> 00:49:34,937 One more question. 1097 00:49:34,937 --> 00:49:37,520 TED BENSON: Maybe one more and then we'll break for some food. 1098 00:49:37,520 --> 00:49:40,530 1099 00:49:40,530 --> 00:49:41,820 In the glasses right there.
1100 00:49:41,820 --> 00:49:46,550 AUDIENCE: Do you have recommendations for the-- 1101 00:49:46,550 --> 00:49:49,968 TED BENSON: Sorry, I was just being told we need to get two more. 1102 00:49:49,968 --> 00:49:52,010 AUDIENCE: Do you have any recommendations for how 1103 00:49:52,010 --> 00:49:56,418 ChatGPT will [INAUDIBLE] structure of data, like JSON, for example, 1104 00:49:56,418 --> 00:49:57,903 [INAUDIBLE]. 1105 00:49:57,903 --> 00:50:00,070 TED BENSON: It's hard to get it to do that reliably. 1106 00:50:00,070 --> 00:50:02,380 It's incredibly useful to get it to do reliably. 1107 00:50:02,380 --> 00:50:05,680 So some tricks you can use are you can give it examples. 1108 00:50:05,680 --> 00:50:08,470 You can just ask it directly. 1109 00:50:08,470 --> 00:50:10,630 Those are two common tricks. 1110 00:50:10,630 --> 00:50:13,390 And look at the prompts that others have used to work. 1111 00:50:13,390 --> 00:50:16,540 There's a lot of art to finding the right prompt right now. 1112 00:50:16,540 --> 00:50:19,210 A lot of it is magic incantation. 1113 00:50:19,210 --> 00:50:23,800 Another thing you can do is post process it so that you can do some checking, 1114 00:50:23,800 --> 00:50:26,233 and you can have a happy path, in which it's a one shot, 1115 00:50:26,233 --> 00:50:28,900 and you get your answer, and then a sad path, in which maybe you 1116 00:50:28,900 --> 00:50:30,310 fall back on other prompts. 1117 00:50:30,310 --> 00:50:32,440 So then you're going for the diversity of approach, 1118 00:50:32,440 --> 00:50:36,160 where it's fast by default, it's slow, but ultimately 1119 00:50:36,160 --> 00:50:39,140 converging upon higher likelihood of success if it fails. 1120 00:50:39,140 --> 00:50:42,370 And then something that I'm sure we'll see and people do later on 1121 00:50:42,370 --> 00:50:45,310 is fine tune-- instruction tuning style models, 1122 00:50:45,310 --> 00:50:49,920 which are more likely to respond with a computer parsable output. 1123 00:50:49,920 --> 00:50:51,340 I guess one last question. 1124 00:50:51,340 --> 00:50:54,620 AUDIENCE: Sure, so one, you talked-- a couple of things. 1125 00:50:54,620 --> 00:50:58,176 One is this you talk about domain expertise here. 1126 00:50:58,176 --> 00:51:01,605 And you're encoding a bunch of domain expertise in terms of the prompts 1127 00:51:01,605 --> 00:51:02,940 that you're loading there. 1128 00:51:02,940 --> 00:51:05,110 What is that-- where do those prompts end up? 1129 00:51:05,110 --> 00:51:08,480 Do those prompts end up back in the ChatGPT model? 1130 00:51:08,480 --> 00:51:11,050 And is there a privacy issue associated with that? 1131 00:51:11,050 --> 00:51:12,550 TED BENSON: That's a great question. 1132 00:51:12,550 --> 00:51:13,860 So the question was-- and I apologize, I just 1133 00:51:13,860 --> 00:51:16,350 realized we haven't been repeating all the questions for the YouTube 1134 00:51:16,350 --> 00:51:18,090 listeners, so I'm sorry for the folks on YouTube 1135 00:51:18,090 --> 00:51:20,230 if you weren't able to hear some of the questions. 1136 00:51:20,230 --> 00:51:23,438 The question was, what are the privacy implications of some of these prompts? 1137 00:51:23,438 --> 00:51:26,040 If one of the messages is so much depends upon your prompt 1138 00:51:26,040 --> 00:51:29,760 and the fine tuning of this prompt, what does that mean with respect to my IP? 1139 00:51:29,760 --> 00:51:32,190 Maybe the prompt is my business. 
1140 00:51:32,190 --> 00:51:34,410 I can't offer you the exact answer, but I 1141 00:51:34,410 --> 00:51:37,510 can paint for you what approximately the landscape looks like. 1142 00:51:37,510 --> 00:51:41,440 So in all of software, and so too with AI, what we see is there 1143 00:51:41,440 --> 00:51:44,850 are the SaaS companies, where you're using somebody else's API. 1144 00:51:44,850 --> 00:51:48,270 And you're trusting that their terms of service will be upheld. 1145 00:51:48,270 --> 00:51:52,200 There's the set of companies in which they provide a model for hosting 1146 00:51:52,200 --> 00:51:54,010 on one of the big cloud providers. 1147 00:51:54,010 --> 00:51:56,243 And this is a version of the same thing, but I think 1148 00:51:56,243 --> 00:51:57,660 with slightly different mechanics. 1149 00:51:57,660 --> 00:52:00,410 This tends to be thought of as the enterprise version of software. 1150 00:52:00,410 --> 00:52:03,240 And by and large, the industry has moved over the past 20 years 1151 00:52:03,240 --> 00:52:06,480 from running my own servers to trusting that Microsoft, or Amazon, or Google 1152 00:52:06,480 --> 00:52:07,805 can run servers for me. 1153 00:52:07,805 --> 00:52:10,930 And they say it's my private server, even though I know they're running it. 1154 00:52:10,930 --> 00:52:12,030 And I'm OK with that. 1155 00:52:12,030 --> 00:52:14,160 And you've already started to see that-- 1156 00:52:14,160 --> 00:52:16,620 Amazon with Hugging Face, Microsoft with OpenAI, 1157 00:52:16,620 --> 00:52:19,770 Google, too, with their own version of Bard, are going to do these, 1158 00:52:19,770 --> 00:52:23,280 where you'll have the SaaS version, and then you'll also have the private VPC 1159 00:52:23,280 --> 00:52:23,790 version. 1160 00:52:23,790 --> 00:52:26,998 And then there's a third version that I think we haven't yet seen practically 1161 00:52:26,998 --> 00:52:29,070 emerge, but this would be the maximalist, 1162 00:52:29,070 --> 00:52:32,430 "I want to make sure my IP is maximally safe" version of events, 1163 00:52:32,430 --> 00:52:34,530 in which you are running your own machines, 1164 00:52:34,530 --> 00:52:36,180 you are running your own models. 1165 00:52:36,180 --> 00:52:38,760 And then the question is, is the open source 1166 00:52:38,760 --> 00:52:41,700 and/or privately available version of the model as good 1167 00:52:41,700 --> 00:52:43,320 as the publicly hosted one? 1168 00:52:43,320 --> 00:52:44,580 And does that matter to me? 1169 00:52:44,580 --> 00:52:46,560 And the answer is, right now, realistically, it 1170 00:52:46,560 --> 00:52:48,030 probably matters a lot. 1171 00:52:48,030 --> 00:52:51,630 In the fullness of time, you can think of any one particular task 1172 00:52:51,630 --> 00:52:55,840 you need to achieve as requiring some fixed point of intelligence to achieve. 1173 00:52:55,840 --> 00:52:59,370 And so over time, what we'll see is the privately obtainable versions 1174 00:52:59,370 --> 00:53:01,320 of these models will cross that threshold. 1175 00:53:01,320 --> 00:53:05,250 And with respect to that one task, yeah, sure, use the open source version, 1176 00:53:05,250 --> 00:53:06,490 run it on your own machine. 1177 00:53:06,490 --> 00:53:09,570 But we'll also see the SaaS intelligence get smarter. 1178 00:53:09,570 --> 00:53:10,803 It'll probably stay ahead. 1179 00:53:10,803 --> 00:53:13,470 And then your question is, well, which one do I care more about?
1180 00:53:13,470 --> 00:53:15,690 Do I want like the better aggregate intelligence 1181 00:53:15,690 --> 00:53:18,540 or is my task somewhat fixed point and I can just 1182 00:53:18,540 --> 00:53:20,460 use the open source available one for which 1183 00:53:20,460 --> 00:53:23,293 I know it'll perform well enough because it's crossed the threshold? 1184 00:53:23,293 --> 00:53:26,580 SIL HAMILTON: So to answer your question specifically, yes. 1185 00:53:26,580 --> 00:53:29,550 You might be glad to know that ChatGPT recently updated their privacy 1186 00:53:29,550 --> 00:53:32,860 policy to not use prompts for the training process. 1187 00:53:32,860 --> 00:53:37,560 But up until now, everything went back into the bin to be trained on again. 1188 00:53:37,560 --> 00:53:39,370 And that's just a fact. 1189 00:53:39,370 --> 00:53:40,680 So I think pizza-- 1190 00:53:40,680 --> 00:53:42,510 it's now pizza time. 1191 00:53:42,510 --> 00:53:43,833 Yay, OK. 1192 00:53:43,833 --> 00:53:44,720 [APPLAUSE] 1193 00:53:44,720 --> 00:53:47,210 [INTERPOSING VOICES] 1194 00:53:47,210 --> 00:53:51,000