1 00:00:00,000 --> 00:00:01,530 SPEAKER 1: All right. 2 00:00:01,530 --> 00:00:03,425 Well, this is a CS50 Tech Talk. 3 00:00:03,425 --> 00:00:04,800 Thank you all so much for coming. 4 00:00:04,800 --> 00:00:06,930 So about a week ago, we circulated the Google Form, 5 00:00:06,930 --> 00:00:09,570 as you might have seen, at 10:52 AM. 6 00:00:09,570 --> 00:00:12,100 And by, like, 11:52 AM, we had 100 RSVP's, 7 00:00:12,100 --> 00:00:14,850 which I think is sort of testament to just how much interest there 8 00:00:14,850 --> 00:00:19,050 is in this world of AI and OpenAI and GPT, ChatGPT and the like. 9 00:00:19,050 --> 00:00:22,090 And in fact, if you're sort of generally familiar with what everyone's 10 00:00:22,090 --> 00:00:24,090 talking about but you haven't tried it yourself, 11 00:00:24,090 --> 00:00:27,300 like, this is the URL at which you can try out this tool that you've probably 12 00:00:27,300 --> 00:00:28,880 heard about, ChatGPT. 13 00:00:28,880 --> 00:00:31,380 You can sign up for a free account there and start tinkering 14 00:00:31,380 --> 00:00:33,510 with what everyone else has been tinkering with. 15 00:00:33,510 --> 00:00:36,390 And then if you're more of the app-minded type, which you probably 16 00:00:36,390 --> 00:00:39,720 are if you are here with us today, OpenAI, in particular, 17 00:00:39,720 --> 00:00:43,230 has its own low-level APIs via which you can integrate AI 18 00:00:43,230 --> 00:00:44,460 into your own software. 19 00:00:44,460 --> 00:00:46,920 But of course, as is the case in computer science, 20 00:00:46,920 --> 00:00:49,020 there's all the more abstractions and services 21 00:00:49,020 --> 00:00:51,180 that have been built on top of these technologies. 22 00:00:51,180 --> 00:00:53,790 And we're so happy today to be joined by our friends 23 00:00:53,790 --> 00:00:57,240 from McGill University and Steamship, Sil and Ted, 24 00:00:57,240 --> 00:01:00,240 from whom you'll hear in just a moment, to speak to us about how 25 00:01:00,240 --> 00:01:04,428 they are making it easier to build, to deploy, to share applications using 26 00:01:04,428 --> 00:01:05,970 some of these very same technologies. 27 00:01:05,970 --> 00:01:09,150 So our thanks to them for hosting today, our friends at Plympton, 28 00:01:09,150 --> 00:01:11,160 Jenny Lee, an alumna who's here with us today. 29 00:01:11,160 --> 00:01:14,035 But without further ado, allow me to turn things over to Ted and Sil. 30 00:01:14,035 --> 00:01:18,000 And pizza will be served shortly after 1:00 PM outside. 31 00:01:18,000 --> 00:01:19,740 All right, over to you, Ted. 32 00:01:19,740 --> 00:01:21,375 TED BENSON: Thanks a lot. 33 00:01:21,375 --> 00:01:22,000 Hey, everybody. 34 00:01:22,000 --> 00:01:23,440 It's great to be here. 35 00:01:23,440 --> 00:01:25,830 I think we've got a really good talk for you today. 36 00:01:25,830 --> 00:01:29,160 Sil is going to provide some research grounding into how it all works, 37 00:01:29,160 --> 00:01:33,742 what's going inside the brain of GPT, as well as other language models. 38 00:01:33,742 --> 00:01:35,700 And then I'll show you some examples that we're 39 00:01:35,700 --> 00:01:38,250 seeing on the ground of how people are building apps 40 00:01:38,250 --> 00:01:40,360 and what apps tend to work in the real world. 41 00:01:40,360 --> 00:01:43,212 So our perspective is we're building AWS for AI apps. 
42 00:01:43,212 --> 00:01:46,170 So we get to talk to a lot of the makers who are building and deploying 43 00:01:46,170 --> 00:01:49,710 their apps, and through that, see both the experimental end of the spectrum 44 00:01:49,710 --> 00:01:52,230 and also see what kinds of apps are getting 45 00:01:52,230 --> 00:01:55,440 pushed out there and turned into companies, turned into side projects. 46 00:01:55,440 --> 00:01:59,130 We did a cool hackathon yesterday. 47 00:01:59,130 --> 00:02:02,393 Many thanks to Neiman, to David Malan and CS50 for helping us 48 00:02:02,393 --> 00:02:04,560 put all of this together, to Harvard for hosting it. 49 00:02:04,560 --> 00:02:06,870 And there were two sessions. 50 00:02:06,870 --> 00:02:08,020 Lots of folks built things. 51 00:02:08,020 --> 00:02:12,177 If you go to steamship.com/hackathon, you'll find a lot of guides, 52 00:02:12,177 --> 00:02:14,760 a lot of projects that people built. And you can follow along. 53 00:02:14,760 --> 00:02:16,470 We have a text guide, as well. 54 00:02:16,470 --> 00:02:21,310 Just as a quick plug for that, if you want to do it remotely or on your own. 55 00:02:21,310 --> 00:02:24,810 So to tee up Sil, we're going to talk about basically two things 56 00:02:24,810 --> 00:02:28,980 today that I hope you'll walk away with and really know how to then use 57 00:02:28,980 --> 00:02:30,660 as you develop and as you tinker. 58 00:02:30,660 --> 00:02:33,150 One is what is GPT and how is it working, 59 00:02:33,150 --> 00:02:35,250 get a good sense of what's going on inside of it, 60 00:02:35,250 --> 00:02:38,340 other than as just this magical machine that predicts things. 61 00:02:38,340 --> 00:02:41,410 And then two is how are people building with it, and then, importantly, 62 00:02:41,410 --> 00:02:43,680 how can I build with it, too, if you are a developer. 63 00:02:43,680 --> 00:02:45,638 And if you have CS50 background, you should 64 00:02:45,638 --> 00:02:48,180 be able to pick things up and start building some great apps. 65 00:02:48,180 --> 00:02:50,100 I've already met some of the CS50 grads yesterday, 66 00:02:50,100 --> 00:02:51,940 and the things that they were doing were pretty amazing. 67 00:02:51,940 --> 00:02:53,080 So I hope this is useful. 68 00:02:53,080 --> 00:02:55,830 I'm going to kick it over to Sil and talk about some 69 00:02:55,830 --> 00:02:58,698 of the theoretical background of GPT. 70 00:02:58,698 --> 00:02:59,490 SIL HAMILTON: Yeah. 71 00:02:59,490 --> 00:03:01,020 So thank you, Ted. 72 00:03:01,020 --> 00:03:01,740 My name is Sil. 73 00:03:01,740 --> 00:03:04,410 I'm a graduate student in the digital humanities at McGill. 74 00:03:04,410 --> 00:03:07,947 I study literature and computer science and linguistics in the same breath, 75 00:03:07,947 --> 00:03:10,530 and I've published some research over the last couple of years 76 00:03:10,530 --> 00:03:15,360 exploring what is possible with language models and culture, in particular. 77 00:03:15,360 --> 00:03:19,740 And my half, or whatever, of the presentation 78 00:03:19,740 --> 00:03:21,780 is to describe to you what is GPT. 79 00:03:21,780 --> 00:03:24,060 That's really difficult to explain in 15 minutes, 80 00:03:24,060 --> 00:03:26,580 and there are even a lot of things that we don't know. 81 00:03:26,580 --> 00:03:28,470 But a good way to approach that is to first 82 00:03:28,470 --> 00:03:33,120 consider all the things that people call GPT by, or descriptors. 83 00:03:33,120 --> 00:03:35,580 So you can call them large language models.
84 00:03:35,580 --> 00:03:37,800 You can call them universal approximators. 85 00:03:37,800 --> 00:03:42,270 From computer science, you can say that it is a generative AI. 86 00:03:42,270 --> 00:03:44,110 We know that they are neural networks. 87 00:03:44,110 --> 00:03:46,320 We know that it is an artificial intelligence. 88 00:03:46,320 --> 00:03:48,120 To some, it's a simulator of culture. 89 00:03:48,120 --> 00:03:50,100 To others, it just predicts text. 90 00:03:50,100 --> 00:03:51,450 It's also a writing assistant. 91 00:03:51,450 --> 00:03:54,180 If you've ever used ChatGPT, you can plug in a bit of your essay, 92 00:03:54,180 --> 00:03:54,990 get some feedback. 93 00:03:54,990 --> 00:03:56,430 It's amazing for that. 94 00:03:56,430 --> 00:03:57,540 It's a content generator. 95 00:03:57,540 --> 00:04:02,040 People use it to do copywriting with Jasper.ai, Sudowrite, et cetera. 96 00:04:02,040 --> 00:04:03,323 It's an agent. 97 00:04:03,323 --> 00:04:05,490 So the really hot thing right now, if you might have 98 00:04:05,490 --> 00:04:08,370 seen on Twitter, AutoGPT, Baby AGI. 99 00:04:08,370 --> 00:04:10,890 People are giving these things tools and letting 100 00:04:10,890 --> 00:04:15,090 them run a little bit free in the wild to interact with the world, computers, 101 00:04:15,090 --> 00:04:16,140 et cetera. 102 00:04:16,140 --> 00:04:18,029 We use them as chat bots, obviously. 103 00:04:18,029 --> 00:04:21,810 And the actual architecture is a transformer. 104 00:04:21,810 --> 00:04:25,770 So there's lots of ways to describe GPT, and any one of them 105 00:04:25,770 --> 00:04:29,430 is a really perfectly adequate way to begin the conversation. 106 00:04:29,430 --> 00:04:32,430 But for our purposes, we can think of it as a large language model, 107 00:04:32,430 --> 00:04:34,710 and more specifically, a language model. 108 00:04:34,710 --> 00:04:38,975 And a language model is a model of language, 109 00:04:38,975 --> 00:04:40,350 if you'll allow me the tautology. 110 00:04:40,350 --> 00:04:42,558 But really, what it does is it produces a probability 111 00:04:42,558 --> 00:04:44,650 distribution over some vocabulary. 112 00:04:44,650 --> 00:04:48,570 So let us imagine that we had the task of predicting 113 00:04:48,570 --> 00:04:51,000 the next word of the sequence "I am." 114 00:04:51,000 --> 00:04:57,810 So if I give a neural network the words "I am," what, of all words in English, 115 00:04:57,810 --> 00:04:59,990 is the next most likely word to follow? 116 00:04:59,990 --> 00:05:04,520 That, at its very core, is what GPT is trained to answer. 117 00:05:04,520 --> 00:05:08,480 And how it does it is it has a vocabulary of 50,000 words, 118 00:05:08,480 --> 00:05:12,060 and it knows roughly, given the entire internet, 119 00:05:12,060 --> 00:05:17,990 which words are likely to follow other words of those 50,000 in some sequence, 120 00:05:17,990 --> 00:05:23,540 up to 2,000 words, up to 4,000, up to 8,000, and now up to 32,000 in GPT-4. 121 00:05:23,540 --> 00:05:24,920 So you give it a sequence. 122 00:05:24,920 --> 00:05:26,060 Here, "I am." 123 00:05:26,060 --> 00:05:29,330 And over the vocabulary of 50,000 words, it 124 00:05:29,330 --> 00:05:32,640 gives you the likelihood of every single word that follows. 125 00:05:32,640 --> 00:05:34,340 So here, it's "I am." 
126 00:05:34,340 --> 00:05:36,838 Perhaps the word "happy" is fairly frequent, 127 00:05:36,838 --> 00:05:38,630 so we'll give that a high probability if we 128 00:05:38,630 --> 00:05:41,480 look at all words, all utterances of English. 129 00:05:41,480 --> 00:05:42,710 It might be "I am sad." 130 00:05:42,710 --> 00:05:44,900 Maybe that's a little bit less probable. 131 00:05:44,900 --> 00:05:45,950 "I am school." 132 00:05:45,950 --> 00:05:47,270 That really should be at the end because I don't 133 00:05:47,270 --> 00:05:48,687 think anybody would ever say that. 134 00:05:48,687 --> 00:05:49,460 "I am Bjork." 135 00:05:49,460 --> 00:05:50,630 That's a little bit-- 136 00:05:50,630 --> 00:05:52,130 it's not very probable. 137 00:05:52,130 --> 00:05:54,320 It's less probable than happy/sad, but there's still 138 00:05:54,320 --> 00:05:55,768 some probability attached to it. 139 00:05:55,768 --> 00:05:58,310 And when we say it's probable, that's literally a percentage. 140 00:05:58,310 --> 00:06:03,005 That's, like, "happy" follows "I am" maybe like 5% of the time. 141 00:06:03,005 --> 00:06:07,110 "Sad" follows "I am" maybe 2% of the time, or whatever. 142 00:06:07,110 --> 00:06:11,960 So for every word that we give GPT, it tries 143 00:06:11,960 --> 00:06:15,590 to predict what the next word is across 50,000 words. 144 00:06:15,590 --> 00:06:19,420 And it gives every single one of those 50,000 words 145 00:06:19,420 --> 00:06:22,970 a number that reflects how probable it is. 146 00:06:22,970 --> 00:06:27,390 And the really magical thing that happens is you can generate new texts. 147 00:06:27,390 --> 00:06:31,130 So if you give GPT "I am" and it predicts 148 00:06:31,130 --> 00:06:37,410 "happy" as being the most probable word over 50,000, you can then append it to 149 00:06:37,410 --> 00:06:37,910 "I am." 150 00:06:37,910 --> 00:06:39,350 So now you say "I am happy." 151 00:06:39,350 --> 00:06:41,470 And you feed it into the model again. 152 00:06:41,470 --> 00:06:42,470 You sample another word. 153 00:06:42,470 --> 00:06:45,300 You feed it into the model again, and again and again and again. 154 00:06:45,300 --> 00:06:48,950 And there's lots of different ways that "I am happy," "I am sad" can go. 155 00:06:48,950 --> 00:06:51,650 And you add a little bit of randomness, and all of a sudden, 156 00:06:51,650 --> 00:06:54,650 you have a language model that can write essays, that can talk, 157 00:06:54,650 --> 00:06:57,740 and a whole lot of things, which is really unexpected 158 00:06:57,740 --> 00:07:00,480 and something that we didn't predict even five years ago. 159 00:07:00,480 --> 00:07:01,820 So this is all relevant. 160 00:07:01,820 --> 00:07:08,710 And if we move on, as we scale up the model and we give it more compute, 161 00:07:08,710 --> 00:07:13,780 in 2012, AlexNet came out, and we figured out we can give the model-- 162 00:07:13,780 --> 00:07:15,160 we can run the model in GPUs. 163 00:07:15,160 --> 00:07:16,780 So we can speed up the process. 164 00:07:16,780 --> 00:07:18,970 We can give the model lots of information 165 00:07:18,970 --> 00:07:21,850 downloaded from the internet, and it learns more and more and more. 166 00:07:21,850 --> 00:07:24,730 And the probabilities that it gives you get 167 00:07:24,730 --> 00:07:27,460 better as it sees more examples of English on the internet. 168 00:07:27,460 --> 00:07:31,062 So we have to train the model to be really large, really wide, 169 00:07:31,062 --> 00:07:33,020 and we have to train it for a really long time. 
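[Editor's aside: to make that predict-append-repeat loop concrete, here is a minimal, purely illustrative Python sketch. The tiny probability table and the words in it are invented for the example; a real model like GPT assigns probabilities over its full ~50,000-token vocabulary, conditioned on thousands of previous tokens.]

```python
import random

# Toy "language model": for a given last word, a made-up probability
# distribution over a handful of possible next words.
NEXT_WORD_PROBS = {
    "am":    {"happy": 0.05, "sad": 0.02, "hungry": 0.01, "a": 0.10},
    "happy": {"today": 0.04, "because": 0.06, "and": 0.08},
    # ... a real model has a row like this for every token it knows
}

def sample_next_word(context):
    """Look up the distribution for the last word and sample one next word."""
    dist = NEXT_WORD_PROBS.get(context[-1], {"the": 1.0})
    words = list(dist.keys())
    weights = list(dist.values())
    # random.choices is the "little bit of randomness" mentioned above
    return random.choices(words, weights=weights, k=1)[0]

def generate(prompt_words, n_words=5):
    """Predict a next word, append it, feed the text back in, repeat."""
    words = list(prompt_words)
    for _ in range(n_words):
        words.append(sample_next_word(words))
    return " ".join(words)

print(generate(["I", "am"]))   # e.g. "I am happy because ..."
```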
170 00:07:33,020 --> 00:07:36,640 And as we do that, the model gets better and better, more expressive 171 00:07:36,640 --> 00:07:39,700 and capable, and it also gets a little bit intelligent, 172 00:07:39,700 --> 00:07:43,510 for reasons we don't understand. 173 00:07:43,510 --> 00:07:47,470 But also, the issue is that because it learns to replicate the internet, 174 00:07:47,470 --> 00:07:50,950 it knows how to speak in a lot of different genres of text 175 00:07:50,950 --> 00:07:52,420 and a lot of different registers. 176 00:07:52,420 --> 00:07:54,592 If you begin the conversation like, "ChatGPT, 177 00:07:54,592 --> 00:07:57,550 can you explain the moon landing to a six-year-old in a few sentences," 178 00:07:57,550 --> 00:07:58,660 GPT-3-- 179 00:07:58,660 --> 00:08:02,860 this is an example drawn from the InstructGPT paper from OpenAI-- 180 00:08:02,860 --> 00:08:07,610 GPT-3 would have just been like, "OK, so you're giving me an example, 181 00:08:07,610 --> 00:08:09,610 like explain the moon landing to a six-year-old. 182 00:08:09,610 --> 00:08:11,470 I'm going to give you a whole bunch of similar things 183 00:08:11,470 --> 00:08:13,700 because those seem very likely to come in a sequence." 184 00:08:13,700 --> 00:08:16,450 It doesn't necessarily understand that it's being asked a question 185 00:08:16,450 --> 00:08:18,220 and has to respond with an answer. 186 00:08:18,220 --> 00:08:23,800 GPT-3 did not have that apparatus, that interface for responding to questions. 187 00:08:23,800 --> 00:08:29,270 And the scientists at OpenAI came up with a solution. 188 00:08:29,270 --> 00:08:33,080 And that was: let's give it a whole bunch of examples of questions and answers 189 00:08:33,080 --> 00:08:35,419 such that we first train it on the internet, 190 00:08:35,419 --> 00:08:38,480 and then we train it with a host of questions and answers 191 00:08:38,480 --> 00:08:41,210 such that it has the knowledge of the internet, 192 00:08:41,210 --> 00:08:44,000 but really knows that it has to be answering questions. 193 00:08:44,000 --> 00:08:47,570 And that is when ChatGPT was born. 194 00:08:47,570 --> 00:08:50,510 And that's when it gained 100 million users in one month. 195 00:08:50,510 --> 00:08:53,300 I think it beat TikTok's record of 20 million in one month. 196 00:08:53,300 --> 00:08:54,750 It was a huge thing. 197 00:08:54,750 --> 00:08:58,730 And for a lot of people, they went, "oh, this thing is intelligent. 198 00:08:58,730 --> 00:09:00,590 I can ask it questions. 199 00:09:00,590 --> 00:09:01,400 It answers back. 200 00:09:01,400 --> 00:09:03,440 We can work together to come to a solution." 201 00:09:03,440 --> 00:09:07,910 And that's because it's still predicting words, it's still a language model, 202 00:09:07,910 --> 00:09:13,140 but it knows to predict words in the framework of a question and answer. 203 00:09:13,140 --> 00:09:14,450 So that's what a prompt is. 204 00:09:14,450 --> 00:09:16,100 That's what instruction tuning is. 205 00:09:16,100 --> 00:09:17,510 That's a key word. 206 00:09:17,510 --> 00:09:22,510 That's what RLHF is, if you've ever seen that acronym, reinforcement 207 00:09:22,510 --> 00:09:24,520 learning from human feedback. 208 00:09:24,520 --> 00:09:28,900 And all of those combined means that the models that are coming out today, 209 00:09:28,900 --> 00:09:31,630 the types of language predictors that are coming out today 210 00:09:31,630 --> 00:09:34,000 work to operate in a Q&A form.
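[Editor's aside: as a hedged illustration of that Q&A framing, this is roughly what asking an instruction-tuned model a question looks like from code, assuming the pre-1.0 `openai` Python package and an API key in the environment. The client interface has changed over time, so treat this as a sketch, not the definitive API.]

```python
import os
import openai  # assumes the pre-1.0 openai client: pip install "openai<1.0"

openai.api_key = os.environ["OPENAI_API_KEY"]

# With an instruction-tuned (aligned) model, we send a question and get an
# answer back, instead of the model merely continuing our text.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user",
         "content": "Explain the moon landing to a six-year-old in a few sentences."},
    ],
)

print(response.choices[0].message.content)
```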
211 00:09:34,000 --> 00:09:38,480 GPT-4 exclusively has the aligned model available. 212 00:09:38,480 --> 00:09:41,950 And this is a really great, solid foundation 213 00:09:41,950 --> 00:09:44,350 to build on because you can do all sorts of things. 214 00:09:44,350 --> 00:09:46,520 You can ask it, "ChatGPT, can you do this for me? 215 00:09:46,520 --> 00:09:47,520 Can you do that for me?" 216 00:09:47,520 --> 00:09:51,080 You might have seen that OpenAI has allowed plug-in access to ChatGPT. 217 00:09:51,080 --> 00:09:52,630 So it can access Wolfram. 218 00:09:52,630 --> 00:09:53,730 It can search the web. 219 00:09:53,730 --> 00:09:56,170 It can do Instacart for you. 220 00:09:56,170 --> 00:09:57,970 It can look up recipes. 221 00:09:57,970 --> 00:10:02,560 Once the model knows that it not only has to predict language, 222 00:10:02,560 --> 00:10:05,910 but that it has to solve a problem-- 223 00:10:05,910 --> 00:10:09,380 and the problem here being give me a good answer to my question-- 224 00:10:09,380 --> 00:10:12,770 it's suddenly able to interface with the world in a really solid way. 225 00:10:12,770 --> 00:10:15,050 And from there on, there's been all sorts 226 00:10:15,050 --> 00:10:19,520 of tools that build on this Q&A form that ChatGPT uses. 227 00:10:19,520 --> 00:10:21,650 You have AutoGPT. 228 00:10:21,650 --> 00:10:22,760 You have LangChain. 229 00:10:22,760 --> 00:10:25,770 You have ReAct. 230 00:10:25,770 --> 00:10:28,080 There was a ReAct paper where a lot of these come from. 231 00:10:28,080 --> 00:10:34,530 And turning the model into an agent with which to achieve any ambiguous goal 232 00:10:34,530 --> 00:10:36,000 is where the future is going. 233 00:10:36,000 --> 00:10:38,160 And this is all thanks to instruction tuning. 234 00:10:38,160 --> 00:10:40,830 And with that, I think I will hand it off 235 00:10:40,830 --> 00:10:44,550 to Ted, who will be giving a demo, or something along those lines, 236 00:10:44,550 --> 00:10:48,615 for how to use GPT as an agent. 237 00:10:48,615 --> 00:10:49,115 So. 238 00:10:49,115 --> 00:10:51,930 239 00:10:51,930 --> 00:10:54,810 TED BENSON: All right, so I'm a super applied guy. 240 00:10:54,810 --> 00:11:00,030 I kind of look at things and think, OK, how can I add this LEGO, add that LEGO, 241 00:11:00,030 --> 00:11:02,410 and clip them together and build something with it. 242 00:11:02,410 --> 00:11:06,930 And right now, if you look back in computer science history, when 243 00:11:06,930 --> 00:11:10,110 you look at the kinds of things that were being done in 1970, 244 00:11:10,110 --> 00:11:13,950 right after computing was invented, the microprocessors were invented, 245 00:11:13,950 --> 00:11:17,070 people were doing research like how do I sort a list of numbers. 246 00:11:17,070 --> 00:11:19,030 And that was meaningful work, and importantly, 247 00:11:19,030 --> 00:11:20,780 it was work that's accessible to everybody 248 00:11:20,780 --> 00:11:24,540 because nobody knows what we can build with this new kind of oil, 249 00:11:24,540 --> 00:11:27,540 this new kind of electricity, this new kind of unit of computation 250 00:11:27,540 --> 00:11:28,710 we've created. 251 00:11:28,710 --> 00:11:31,560 And anything was game, and anybody could participate 252 00:11:31,560 --> 00:11:33,010 in that game to figure it out. 253 00:11:33,010 --> 00:11:37,140 And I think one of the really exciting things about GPT right now is, yes, 254 00:11:37,140 --> 00:11:38,830 in and of itself, it's amazing.
255 00:11:38,830 --> 00:11:43,210 But then, what could we do with it if we call it over and over again, 256 00:11:43,210 --> 00:11:45,083 if we build it into our algorithms and start 257 00:11:45,083 --> 00:11:46,500 to build it into broader software? 258 00:11:46,500 --> 00:11:48,420 So the world really is yours to figure out 259 00:11:48,420 --> 00:11:50,820 these fundamental questions about what could you 260 00:11:50,820 --> 00:11:55,710 do if you could script computation itself over and over again in the way 261 00:11:55,710 --> 00:11:56,850 that computers can do. 262 00:11:56,850 --> 00:11:59,340 Not just talk with it, but build things atop it. 263 00:11:59,340 --> 00:12:00,960 So we're a hosting company. 264 00:12:00,960 --> 00:12:02,040 We host apps. 265 00:12:02,040 --> 00:12:04,335 And these are just some of the things that we see. 266 00:12:04,335 --> 00:12:06,210 I'm going to show you demos of this with code 267 00:12:06,210 --> 00:12:08,463 and try to explain some of the thought process. 268 00:12:08,463 --> 00:12:10,380 But I wanted to give you a high-level overview 269 00:12:10,380 --> 00:12:12,487 of you've probably seen these on Twitter, 270 00:12:12,487 --> 00:12:15,570 but kind of when it all sorts out to the top, these are some of the things 271 00:12:15,570 --> 00:12:19,470 that we're seeing built and deployed with language models today. 272 00:12:19,470 --> 00:12:20,730 Companionship. 273 00:12:20,730 --> 00:12:23,790 That's everything from I need a friend to I need a friend with a purpose. 274 00:12:23,790 --> 00:12:24,568 I want a coach. 275 00:12:24,568 --> 00:12:27,360 I want somebody to tell me, "go to the gym and do these exercises." 276 00:12:27,360 --> 00:12:29,527 I want somebody to help me study a foreign language. 277 00:12:29,527 --> 00:12:30,660 Question answering. 278 00:12:30,660 --> 00:12:31,510 This is a big one. 279 00:12:31,510 --> 00:12:34,350 This is everything from your newsroom having a Slack bot that 280 00:12:34,350 --> 00:12:37,920 helps assist you, does this article conform 281 00:12:37,920 --> 00:12:41,478 to the style guidelines of our newsroom, all the way through to I 282 00:12:41,478 --> 00:12:43,770 need help on my homework, or hey, I have some questions 283 00:12:43,770 --> 00:12:46,860 that I want you to ask Wikipedia, combine it with something else, 284 00:12:46,860 --> 00:12:48,870 synthesize the answer, and give it to me. 285 00:12:48,870 --> 00:12:50,400 Utility functions. 286 00:12:50,400 --> 00:12:54,750 I would describe this as there's a large set of things for which 287 00:12:54,750 --> 00:12:57,510 human beings can do them if only-- 288 00:12:57,510 --> 00:12:59,937 or computers could do them if only they had access 289 00:12:59,937 --> 00:13:01,770 to language computation, language knowledge. 290 00:13:01,770 --> 00:13:04,890 An example of this would be read every tweet on Twitter. 291 00:13:04,890 --> 00:13:06,227 Tell me the ones I should read. 292 00:13:06,227 --> 00:13:09,060 That way, I only get to read the ones that actually make sense to me 293 00:13:09,060 --> 00:13:10,810 and I don't have to skim through the rest. 294 00:13:10,810 --> 00:13:11,730 Creativity. 295 00:13:11,730 --> 00:13:14,220 Image generation, text generation, storytelling, 296 00:13:14,220 --> 00:13:16,020 proposing other ways to do things. 
297 00:13:16,020 --> 00:13:19,650 And then these wild experiments in kind of Baby AGI, 298 00:13:19,650 --> 00:13:23,485 as people are calling them, in which the AI itself decides what to do 299 00:13:23,485 --> 00:13:24,360 and is self-directed. 300 00:13:24,360 --> 00:13:27,360 So I'll show you examples of many of these and what the code looks like. 301 00:13:27,360 --> 00:13:29,700 And if I were you, I would think about these 302 00:13:29,700 --> 00:13:33,990 as categories within which to both think about what you might build 303 00:13:33,990 --> 00:13:37,620 and then also seek out starter projects for how you 304 00:13:37,620 --> 00:13:39,270 might go about building them online. 305 00:13:39,270 --> 00:13:42,250 306 00:13:42,250 --> 00:13:42,750 All right. 307 00:13:42,750 --> 00:13:45,450 So I'm just going to dive straight into demos and code for some of these 308 00:13:45,450 --> 00:13:48,540 because I know that's what's interesting to see as fellow builders, 309 00:13:48,540 --> 00:13:51,520 with a high-level diagram for some of these as to how it works. 310 00:13:51,520 --> 00:13:54,630 So approximately, you can think of a companionship bot 311 00:13:54,630 --> 00:13:57,520 as a friend that has a purpose to you. 312 00:13:57,520 --> 00:14:00,070 And there are many ways to build all of these things, 313 00:14:00,070 --> 00:14:02,250 but one of the ways you can build this is simply 314 00:14:02,250 --> 00:14:06,930 to wrap GPT or a language model in an endpoint that additionally injects 315 00:14:06,930 --> 00:14:10,620 into the prompt some particular perspective or some particular goal 316 00:14:10,620 --> 00:14:11,820 that you want to use. 317 00:14:11,820 --> 00:14:15,030 It really is that easy, in a way, but it's also very hard 318 00:14:15,030 --> 00:14:19,170 because you need to iterate and engineer the prompt so that it consistently 319 00:14:19,170 --> 00:14:22,060 performs the way you want it to perform. 320 00:14:22,060 --> 00:14:25,090 So a good example of this is something somebody built in the hackathon 321 00:14:25,090 --> 00:14:25,590 yesterday. 322 00:14:25,590 --> 00:14:28,007 And I just wanted to show you the project that they built. 323 00:14:28,007 --> 00:14:29,550 It was a Mandarin idiom coach. 324 00:14:29,550 --> 00:14:31,800 And I'll show you what the code looked like first. 325 00:14:31,800 --> 00:14:33,688 I'll show you the demo first. 326 00:14:33,688 --> 00:14:34,980 I think I already pulled it up. 327 00:14:34,980 --> 00:14:38,160 328 00:14:38,160 --> 00:14:39,680 Here we go. 329 00:14:39,680 --> 00:14:42,770 So the buddy that this person wanted to create 330 00:14:42,770 --> 00:14:47,180 was a friend that, if you gave it a particular problem 331 00:14:47,180 --> 00:14:49,910 you were having, it would pick a Chinese idiom, 332 00:14:49,910 --> 00:14:53,300 a four-character chengyu that described, poetically, like, 333 00:14:53,300 --> 00:14:55,907 here's a particular way you could say this, 334 00:14:55,907 --> 00:14:58,490 and it would tell it to her, so that the person who built this 335 00:14:58,490 --> 00:15:02,070 was studying Chinese and she wanted to learn more about it. 336 00:15:02,070 --> 00:15:08,080 So I might say something like, "I'm feeling very sad." 337 00:15:08,080 --> 00:15:10,430 And it would think a little bit. 
338 00:15:10,430 --> 00:15:13,540 And if everything's up and running, it will 339 00:15:13,540 --> 00:15:16,330 generate one of these four-character phrases 340 00:15:16,330 --> 00:15:19,307 and it will respond to it with an example. 341 00:15:19,307 --> 00:15:21,140 Now, I don't know if this is correct or not. 342 00:15:21,140 --> 00:15:23,920 So if somebody can call me out if this is actually incorrect, 343 00:15:23,920 --> 00:15:26,170 please call me out. 344 00:15:26,170 --> 00:15:28,420 And it will then finish up with something encouraging, 345 00:15:28,420 --> 00:15:29,380 saying, "hey, you can do it. 346 00:15:29,380 --> 00:15:30,280 I know this is hard. 347 00:15:30,280 --> 00:15:30,910 Keep going." 348 00:15:30,910 --> 00:15:32,535 So let me show you how they built this. 349 00:15:32,535 --> 00:15:40,510 And I pulled up the code right here. 350 00:15:40,510 --> 00:15:44,910 So this was the particular starter Replit 351 00:15:44,910 --> 00:15:47,430 that folks were using in the hackathon yesterday. 352 00:15:47,430 --> 00:15:52,560 And we pulled things up into basically you have a wrapper around GPT. 353 00:15:52,560 --> 00:15:54,690 And there's many things you could do, but we're 354 00:15:54,690 --> 00:15:56,648 going to make it easy for you to do two things. 355 00:15:56,648 --> 00:15:59,940 One of them is to inject some personality into the prompt. 356 00:15:59,940 --> 00:16:02,590 And I'll explain what that prompt is in a second. 357 00:16:02,590 --> 00:16:04,440 And then the second is add tools that might 358 00:16:04,440 --> 00:16:08,190 go out and do a particular thing-- search the web or generate an image 359 00:16:08,190 --> 00:16:11,710 or add something to a database or fetch something from a database. 360 00:16:11,710 --> 00:16:15,210 So having done that, now you have something more than GPT. 361 00:16:15,210 --> 00:16:19,030 Now you have GPT, which we all know what it is and how we can interact with it, 362 00:16:19,030 --> 00:16:22,110 but you've also added a particular lens through which it's talking 363 00:16:22,110 --> 00:16:23,610 to you and, potentially, some tools. 364 00:16:23,610 --> 00:16:30,250 So this particular Chinese tutor, all it took to build that was four lines. 365 00:16:30,250 --> 00:16:33,510 So here's a question that I think is frying the minds of everybody 366 00:16:33,510 --> 00:16:35,700 in the industry right now. 367 00:16:35,700 --> 00:16:38,260 So is this something that we'll all do casually? 368 00:16:38,260 --> 00:16:39,260 And nobody really knows. 369 00:16:39,260 --> 00:16:42,177 Will we just all say in the future to the LLM, "hey, for the next five 370 00:16:42,177 --> 00:16:43,760 minutes, please talk like a teacher?" 371 00:16:43,760 --> 00:16:45,000 Maybe. 372 00:16:45,000 --> 00:16:48,120 But also, definitely in the meantime and maybe in the future, 373 00:16:48,120 --> 00:16:51,120 it makes sense to wrap up these personalized endpoints 374 00:16:51,120 --> 00:16:53,910 so that when I'm talking to GPT, I'm not just talking to GPT. 375 00:16:53,910 --> 00:16:55,710 I have a whole army of different buddies, 376 00:16:55,710 --> 00:16:58,082 of different companions that I can talk to. 
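[Editor's aside: here is a minimal sketch of that "wrap GPT in a personality" pattern, written in plain Python with the pre-1.0 `openai` client rather than the Steamship starter project itself. The persona text is paraphrased from the coach prompt Ted describes, and the function name is purely for illustration.]

```python
import os
import openai  # assumes the pre-1.0 openai client

openai.api_key = os.environ["OPENAI_API_KEY"]

# The "personality" is just text injected ahead of every conversation.
PERSONALITY = (
    "You are a kind, helpful Chinese teacher. Respond to every situation by "
    "explaining the chengyu (four-character idiom) that fits it. Speak in "
    "English, explain the chengyu and its meaning, then add a short note of "
    "encouragement about learning the language."
)

def companion_reply(user_message, history=None):
    """One turn of a companionship bot: system persona + prior turns + new message."""
    messages = [{"role": "system", "content": PERSONALITY}]
    messages += history or []
    messages.append({"role": "user", "content": user_message})
    response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return response.choices[0].message.content

print(companion_reply("I'm feeling very sad."))
```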
377 00:16:58,082 --> 00:17:00,540 They're kind of human and kind of talk to me interactively, 378 00:17:00,540 --> 00:17:04,812 but because I preloaded them with, "hey, by the way, you particular, 379 00:17:04,812 --> 00:17:07,020 I want you to be a kind, helpful Chinese teacher that 380 00:17:07,020 --> 00:17:10,140 responds to every situation by explaining the chengyu that fits it. 381 00:17:10,140 --> 00:17:12,630 Speak in English and explain the chengyu and its meaning. 382 00:17:12,630 --> 00:17:15,690 Then provide a note of encouragement about learning language." 383 00:17:15,690 --> 00:17:19,770 And so just adding something like that, even if you're a non-programmer, 384 00:17:19,770 --> 00:17:26,339 you can just type deploy and it'll pop it up to the web. 385 00:17:26,339 --> 00:17:29,580 It'll take it over to a Telegram bot that you can even interact with. 386 00:17:29,580 --> 00:17:33,250 "Hey, I'm feeling too busy." 387 00:17:33,250 --> 00:17:35,830 And interact with it over Telegram, over the web. 388 00:17:35,830 --> 00:17:41,500 And this is the kind of thing that's now within reach for everybody from a CS101 389 00:17:41,500 --> 00:17:44,320 grad, so I'm using the general purpose framing, 390 00:17:44,320 --> 00:17:46,810 all the way through to professionals in the industry 391 00:17:46,810 --> 00:17:49,600 that you can do just with a little bit of manipulation 392 00:17:49,600 --> 00:17:57,020 on top of this raw unit of conversation and intelligence. 393 00:17:57,020 --> 00:18:04,840 So companionship is one of the first common types of apps that we're seeing. 394 00:18:04,840 --> 00:18:09,090 So a second kind of app that we're seeing-- and this blew up-- 395 00:18:09,090 --> 00:18:12,690 for those of you who are kind of Twitter followers, 396 00:18:12,690 --> 00:18:16,612 this blew up I think the last few months, is question answering. 397 00:18:16,612 --> 00:18:18,570 And I want to unpack a couple of different ways 398 00:18:18,570 --> 00:18:21,990 this can work because I know many of you have probably already tried 399 00:18:21,990 --> 00:18:23,730 to build some of these kinds of apps. 400 00:18:23,730 --> 00:18:25,772 There's a couple of different ways that it works. 401 00:18:25,772 --> 00:18:29,468 The general framework is a user queries GPT. 402 00:18:29,468 --> 00:18:31,260 And maybe it has general-purpose knowledge. 403 00:18:31,260 --> 00:18:33,260 Maybe it doesn't have general-purpose knowledge. 404 00:18:33,260 --> 00:18:37,950 But what you want it to say back to you is something specific about an article 405 00:18:37,950 --> 00:18:41,430 you wrote, or something specific about your course syllabus, 406 00:18:41,430 --> 00:18:45,120 or something specific about a particular set of documents from the United 407 00:18:45,120 --> 00:18:46,770 Nations on a particular topic. 408 00:18:46,770 --> 00:18:49,020 And so what you're really seeking is what we all hoped 409 00:18:49,020 --> 00:18:50,430 the customer service bot would be. 410 00:18:50,430 --> 00:18:52,590 Like, we've all interacted with these customer service bots, 411 00:18:52,590 --> 00:18:55,620 and we're kind of smashing our heads on the keyboard as we do it. 412 00:18:55,620 --> 00:18:59,430 But pretty soon, we're going to start to see very high-fidelity 413 00:18:59,430 --> 00:19:01,320 bots that interact with us comfortably. 414 00:19:01,320 --> 00:19:03,570 And this is approximately how to do it as an engineer. 415 00:19:03,570 --> 00:19:05,790 So here's your game plan as an engineer. 
416 00:19:05,790 --> 00:19:11,690 Step one, take the documents that you want it to respond to. 417 00:19:11,690 --> 00:19:13,580 Step two, cut them up. 418 00:19:13,580 --> 00:19:15,950 Now, if you're an engineer, this is going to madden you. 419 00:19:15,950 --> 00:19:18,440 You don't cut them up in a way that you would hope. 420 00:19:18,440 --> 00:19:21,410 For example, you could cut them up into clean sentences 421 00:19:21,410 --> 00:19:24,590 or clean paragraphs or semantically coherent sections. 422 00:19:24,590 --> 00:19:26,210 And that would be really nice. 423 00:19:26,210 --> 00:19:28,370 Honestly, the way that most folks do it-- 424 00:19:28,370 --> 00:19:31,850 and this is a simplification that tends to be just fine-- 425 00:19:31,850 --> 00:19:35,780 is you window, you have a sliding window that goes over the document, 426 00:19:35,780 --> 00:19:38,780 and you just pull out fragments of text. 427 00:19:38,780 --> 00:19:40,647 Having pulled out those fragments of text, 428 00:19:40,647 --> 00:19:42,980 you turn them into something called an embedding vector. 429 00:19:42,980 --> 00:19:45,890 So an embedding vector is a list of numbers 430 00:19:45,890 --> 00:19:49,230 that approximate some point of meaning. 431 00:19:49,230 --> 00:19:51,800 So you've already all dealt with embedding vectors yourself 432 00:19:51,800 --> 00:19:52,475 in regular life. 433 00:19:52,475 --> 00:19:54,350 And the reason you have, and I know you have, 434 00:19:54,350 --> 00:19:57,060 is because everybody's ordered food from Yelp before. 435 00:19:57,060 --> 00:20:01,280 So when you order food from Yelp, you look at what genre of restaurant is it. 436 00:20:01,280 --> 00:20:02,630 Is it a pizza restaurant? 437 00:20:02,630 --> 00:20:03,870 Is it an Italian restaurant? 438 00:20:03,870 --> 00:20:05,330 Is it a Korean barbecue place? 439 00:20:05,330 --> 00:20:08,720 You look at how many stars does it have-- one, two, three, four, five. 440 00:20:08,720 --> 00:20:10,020 You look at where is it. 441 00:20:10,020 --> 00:20:14,000 So all of these you can think of as points in space, dimensions in space. 442 00:20:14,000 --> 00:20:17,090 Korean barbecue restaurant, four stars, near my house. 443 00:20:17,090 --> 00:20:20,680 It's a three-number vector. 444 00:20:20,680 --> 00:20:21,700 That's all this is. 445 00:20:21,700 --> 00:20:24,760 So this is a 1,000 number vector or a 10,000 number vector. 446 00:20:24,760 --> 00:20:26,920 Different models produce different size vectors. 447 00:20:26,920 --> 00:20:30,298 All it is, is chunking pieces of text, turning it 448 00:20:30,298 --> 00:20:33,340 into a vector that approximates meaning, and then you put it in something 449 00:20:33,340 --> 00:20:34,382 called a vector database. 450 00:20:34,382 --> 00:20:38,380 And a vector database is just a database that stores numbers. 451 00:20:38,380 --> 00:20:43,570 But having that database, now when I ask a question, I can search the database 452 00:20:43,570 --> 00:20:47,200 and I can say, "hey, the question was, what does CS50 teach?" 453 00:20:47,200 --> 00:20:54,180 What pieces of text in the database have vectors similar to the question, 454 00:20:54,180 --> 00:20:55,860 what does CS50 teach? 455 00:20:55,860 --> 00:20:58,350 And there's all sorts of tricks and empires 456 00:20:58,350 --> 00:21:01,440 being made on refinements of this general approach. 457 00:21:01,440 --> 00:21:06,270 But at the end, you, the developer, model it simply as thus. 
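[Editor's aside: here is a hedged sketch of that indexing-and-search step: a sliding window over the raw text, an embedding call per fragment, and a plain in-memory list standing in for a hosted vector database. It assumes the pre-1.0 `openai` client and numpy; the model name, window size, and stride are illustrative, not a recommendation.]

```python
import os
import numpy as np
import openai  # assumes the pre-1.0 openai client

openai.api_key = os.environ["OPENAI_API_KEY"]

def embed(text):
    """Turn a piece of text into an embedding vector (a list of numbers)."""
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=[text])
    return np.array(resp["data"][0]["embedding"])

def chunk(document, size=500, stride=250):
    """Sliding window over the text -- crude, but it tends to be just fine."""
    return [document[i:i + size] for i in range(0, len(document), stride)]

def build_index(document):
    """'Vector database': a list of (fragment, vector) pairs kept in memory."""
    return [(frag, embed(frag)) for frag in chunk(document)]

def search(index, question, k=3):
    """Find the fragments whose vectors are most similar to the question's vector."""
    q = embed(question)
    scored = [(frag, float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v))))
              for frag, v in index]
    return [frag for frag, _ in sorted(scored, key=lambda s: s[1], reverse=True)[:k]]
```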
458 00:21:06,270 --> 00:21:08,670 And then when you have your query, you embed it, 459 00:21:08,670 --> 00:21:12,010 you find the document fragments, and then you put them into a prompt. 460 00:21:12,010 --> 00:21:16,290 And now we're just back to the personality, the companionship bot. 461 00:21:16,290 --> 00:21:17,640 Now it's just a prompt. 462 00:21:17,640 --> 00:21:20,860 And the prompt is, "you're an expert in answering questions. 463 00:21:20,860 --> 00:21:25,800 Please answer user-provided question using source documents, 464 00:21:25,800 --> 00:21:27,090 results from the database." 465 00:21:27,090 --> 00:21:28,493 That's it. 466 00:21:28,493 --> 00:21:31,410 So after all of these decades of engineering of these customer service 467 00:21:31,410 --> 00:21:34,110 bots, it turns out, with a couple of lines of code, you can build this. 468 00:21:34,110 --> 00:21:34,902 So let me show you. 469 00:21:34,902 --> 00:21:38,990 I made one just before the class with the CS50 syllabus. 470 00:21:38,990 --> 00:21:43,770 So we can pull that up. 471 00:21:43,770 --> 00:21:46,980 And I can say I added the PDF right here. 472 00:21:46,980 --> 00:21:48,500 So I just searched-- 473 00:21:48,500 --> 00:21:51,260 I apologize, I don't know if it's an accurate or recent syllabus. 474 00:21:51,260 --> 00:21:53,780 I just searched the web for CS50 syllabus PDF. 475 00:21:53,780 --> 00:21:55,670 I put the URL in here. 476 00:21:55,670 --> 00:21:56,900 It loaded it into here. 477 00:21:56,900 --> 00:21:58,970 This is just like a 100-line piece of code 478 00:21:58,970 --> 00:22:02,150 deployed that will now let me talk to it. 479 00:22:02,150 --> 00:22:07,480 And I can say, "what will CS50 teach me?" 480 00:22:07,480 --> 00:22:10,510 So under the hood now, what's happening is exactly what that slide just 481 00:22:10,510 --> 00:22:11,010 showed you. 482 00:22:11,010 --> 00:22:13,210 It takes that question, "What will CS50 teach me." 483 00:22:13,210 --> 00:22:14,980 It turns it into a vector. 484 00:22:14,980 --> 00:22:18,700 That vector approximates, without exactly representing, 485 00:22:18,700 --> 00:22:20,920 the meaning of that question. 486 00:22:20,920 --> 00:22:24,070 It looks into a vector database that Steamship 487 00:22:24,070 --> 00:22:27,580 hosts of fragments from that PDF. 488 00:22:27,580 --> 00:22:30,100 And then it pulls out a document and then passes it 489 00:22:30,100 --> 00:22:34,330 to a prompt that says, "hey, you're an expert at answering questions. 490 00:22:34,330 --> 00:22:36,910 Someone has asked you what does CS50 teach. 491 00:22:36,910 --> 00:22:40,840 Please answer it using only the source documents and source materials 492 00:22:40,840 --> 00:22:41,780 I've provided." 493 00:22:41,780 --> 00:22:45,232 Now, those source materials are dynamically loaded into the prompt. 494 00:22:45,232 --> 00:22:46,690 It's just basic prompt engineering. 495 00:22:46,690 --> 00:22:49,090 And I want to keep harping back onto that. 496 00:22:49,090 --> 00:22:53,410 What's amazing about right now as builders is that so many things just 497 00:22:53,410 --> 00:22:59,260 boil down to very creative, tactical rearrangement of prompts, 498 00:22:59,260 --> 00:23:01,658 and then using those over and over again in an algorithm 499 00:23:01,658 --> 00:23:02,950 and putting that into software. 
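[Editor's aside: continuing the sketch above, the query side really is just prompt assembly: pull the nearest fragments and drop them into the "you're an expert at answering questions" prompt. This reuses the illustrative `build_index` and `search` helpers from the previous sketch and the same pre-1.0 `openai` client; none of it is the exact code behind the demo.]

```python
def answer(index, question):
    """Retrieve relevant fragments, then ask the model to answer using only them."""
    fragments = search(index, question)          # from the indexing sketch above
    sources = "\n\n".join(fragments)
    messages = [
        {"role": "system",
         "content": "You are an expert at answering questions. Answer the "
                    "user's question using only the source documents provided."},
        {"role": "user",
         "content": f"Source documents:\n{sources}\n\nQuestion: {question}"},
    ]
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return resp.choices[0].message.content

# e.g. index = build_index(open("cs50_syllabus.txt").read())
#      print(answer(index, "What will CS50 teach me?"))
```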
500 00:23:02,950 --> 00:23:06,250 So the result-- and again, it could be lying, it could be making things up, 501 00:23:06,250 --> 00:23:09,280 it could be hallucinating-- is "CS50 will teach students how to think 502 00:23:09,280 --> 00:23:10,840 algorithmically and solve problems efficiently, 503 00:23:10,840 --> 00:23:13,210 focusing on topics such as abstraction," da-da-da-da-da. 504 00:23:13,210 --> 00:23:16,390 And then it returns the source document from which it was found. 505 00:23:16,390 --> 00:23:19,150 So this is another big category of which there 506 00:23:19,150 --> 00:23:23,230 are tons of potential applications because you 507 00:23:23,230 --> 00:23:25,110 can repeat for each context. 508 00:23:25,110 --> 00:23:27,790 You can create arbitrarily many of these once it's software, 509 00:23:27,790 --> 00:23:31,610 because once it's software, you can just repeat it over and over again. 510 00:23:31,610 --> 00:23:35,980 So for your dorm, for your club, for your Slack, for your Telegram. 511 00:23:35,980 --> 00:23:38,950 You can start to begin putting pieces of information 512 00:23:38,950 --> 00:23:40,610 in and then responding to it. 513 00:23:40,610 --> 00:23:42,110 And it doesn't have to be documents. 514 00:23:42,110 --> 00:23:46,140 You can also load it straight into the prompt. 515 00:23:46,140 --> 00:23:47,680 I think I have it pulled up here. 516 00:23:47,680 --> 00:23:50,150 And if I don't, I'll just skip it. 517 00:23:50,150 --> 00:23:52,070 Oh, here we go. 518 00:23:52,070 --> 00:23:55,430 One other way you can do question answering, 519 00:23:55,430 --> 00:23:57,650 because I think it's healthy to always encourage 520 00:23:57,650 --> 00:24:00,470 the simplest possible approach to something, 521 00:24:00,470 --> 00:24:03,000 you don't need to engineer this giant system. 522 00:24:03,000 --> 00:24:04,250 It's great to have a database. 523 00:24:04,250 --> 00:24:05,300 It's great to use embeddings. 524 00:24:05,300 --> 00:24:06,560 It's great to use this big approach. 525 00:24:06,560 --> 00:24:07,060 It's fancy. 526 00:24:07,060 --> 00:24:07,730 It scales. 527 00:24:07,730 --> 00:24:09,410 You can do a lot of things. 528 00:24:09,410 --> 00:24:13,790 But you can also get away with a lot by just pushing it all into a prompt. 529 00:24:13,790 --> 00:24:17,678 And as an engineer, [INAUDIBLE],, one of our teammates here always says, 530 00:24:17,678 --> 00:24:19,220 "engineers should aspire to be lazy." 531 00:24:19,220 --> 00:24:20,825 And I couldn't agree more. 532 00:24:20,825 --> 00:24:23,660 You, as an engineer, should want to set yourself 533 00:24:23,660 --> 00:24:27,260 up so that you can pursue the lazy path to something. 534 00:24:27,260 --> 00:24:31,400 So here's how you might do the equivalent of a question answering 535 00:24:31,400 --> 00:24:32,570 system with a prompt alone. 536 00:24:32,570 --> 00:24:35,060 Let's say you have 30 friends. 537 00:24:35,060 --> 00:24:37,070 And each friend is good at a particular thing. 538 00:24:37,070 --> 00:24:40,400 Or you can-- this is isomorphic to many other problems. 539 00:24:40,400 --> 00:24:42,920 You can simply just say, "hey, I know certain things. 540 00:24:42,920 --> 00:24:44,660 Here's the things I know. 541 00:24:44,660 --> 00:24:47,330 A user is going to ask me something. 542 00:24:47,330 --> 00:24:48,980 How should we respond?" 543 00:24:48,980 --> 00:24:50,690 And then you load that into an agent. 544 00:24:50,690 --> 00:24:53,360 That agent has access to GPT. 
545 00:24:53,360 --> 00:24:54,650 You can ship-deploy it. 546 00:24:54,650 --> 00:24:57,650 And now you've got a bot that you can connect to Telegram, 547 00:24:57,650 --> 00:24:59,150 you can connect to Slack. 548 00:24:59,150 --> 00:25:01,565 And that bot, now, it won't always give you 549 00:25:01,565 --> 00:25:03,440 the right answer, because at a certain level, 550 00:25:03,440 --> 00:25:06,470 we can't control the variance of the model underneath, 551 00:25:06,470 --> 00:25:10,220 but it will tend to answer with respect to this list. 552 00:25:10,220 --> 00:25:13,490 And the degree to which it tends is, to a certain extent, something 553 00:25:13,490 --> 00:25:16,400 that both industry is working on to just give everybody 554 00:25:16,400 --> 00:25:19,700 as a capacity, but also you, doing prompt engineering, 555 00:25:19,700 --> 00:25:23,360 to tighten up the error bars on it. 556 00:25:23,360 --> 00:25:26,910 557 00:25:26,910 --> 00:25:29,120 So I'll show you just a few more examples. 558 00:25:29,120 --> 00:25:31,880 And then in about eight minutes, I'll turn it over to questions, 559 00:25:31,880 --> 00:25:34,380 because I'm sure you've got a lot about how to build things. 560 00:25:34,380 --> 00:25:36,830 So just to give you a sense of where we are. 561 00:25:36,830 --> 00:25:40,990 562 00:25:40,990 --> 00:25:44,460 This is one, I don't have a demo for you, but if you were to come to me 563 00:25:44,460 --> 00:25:48,060 and you were to say, "Ted, I want a weekend hustle, man. 564 00:25:48,060 --> 00:25:49,410 What should I build?" 565 00:25:49,410 --> 00:25:50,970 Holy moly. 566 00:25:50,970 --> 00:25:55,050 There are a set of applications that I would describe as utility functions. 567 00:25:55,050 --> 00:25:57,450 I don't like that name because it doesn't sound exciting, 568 00:25:57,450 --> 00:25:59,010 and this is really exciting. 569 00:25:59,010 --> 00:26:02,520 And it's low-hanging fruits that automate tasks that 570 00:26:02,520 --> 00:26:04,230 require basic language understanding. 571 00:26:04,230 --> 00:26:08,010 So examples for this are generate a unit test. 572 00:26:08,010 --> 00:26:10,632 I don't know how many of you have ever been writing tests 573 00:26:10,632 --> 00:26:12,090 and you're just like, "oh, come on. 574 00:26:12,090 --> 00:26:13,260 I can get through this. 575 00:26:13,260 --> 00:26:13,920 I can get through this." 576 00:26:13,920 --> 00:26:16,740 If you're a person who likes writing tests, you're a lucky individual. 577 00:26:16,740 --> 00:26:19,490 Looking up the documentation for a function, rewriting a function, 578 00:26:19,490 --> 00:26:23,460 making something conform to your company guidelines, doing a brand check, 579 00:26:23,460 --> 00:26:25,860 all of these things are things that are kind 580 00:26:25,860 --> 00:26:31,770 of relatively context-free operations, or scoped-context operations 581 00:26:31,770 --> 00:26:36,170 on a piece of information that requires linguistic understanding. 582 00:26:36,170 --> 00:26:39,530 And really, you can think of them as something 583 00:26:39,530 --> 00:26:42,320 that is now available to you as a software builder, 584 00:26:42,320 --> 00:26:45,980 as a weekend project builder, as a startup builder. 585 00:26:45,980 --> 00:26:48,590 And you just have to build the interface around it 586 00:26:48,590 --> 00:26:51,890 and present it to other people in a context in which it's 587 00:26:51,890 --> 00:26:54,150 meaningful for them to consume. 
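[Editor's aside: as one concrete instance of these utility functions, here is a hedged sketch of a "write me a unit test" helper: a single prompt wrapped around a function's source code, again assuming the pre-1.0 `openai` client. The prompt wording and the sample function are illustrative only, and the generated test is a draft to review, not to trust blindly.]

```python
import inspect
import os
import openai  # pre-1.0 client, as in the earlier sketches

openai.api_key = os.environ["OPENAI_API_KEY"]

def generate_unit_test(func):
    """Ask the model to draft a pytest-style test for a Python function."""
    source = inspect.getsource(func)
    messages = [
        {"role": "system",
         "content": "You write concise pytest unit tests for Python functions."},
        {"role": "user",
         "content": f"Write a unit test for this function:\n\n{source}"},
    ]
    resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    return resp.choices[0].message.content

def slugify(title):
    return "-".join(title.lower().split())

print(generate_unit_test(slugify))
```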
588 00:26:54,150 --> 00:26:57,300 And so the space of this is extraordinary. 589 00:26:57,300 --> 00:27:00,560 I mean, it's the space of all human endeavor, now with this new tool, 590 00:27:00,560 --> 00:27:02,150 I think is the way to think about it. 591 00:27:02,150 --> 00:27:04,910 People often joke about how, when you're building a company, when you're 592 00:27:04,910 --> 00:27:07,220 building a project, you don't want to start with the hammer 593 00:27:07,220 --> 00:27:09,450 because you want to start with the problem instead. 594 00:27:09,450 --> 00:27:12,170 And it's generally true, but my God, we've 595 00:27:12,170 --> 00:27:13,640 just got a really cool new hammer. 596 00:27:13,640 --> 00:27:16,518 And to a certain extent, I would encourage you to at least casually, 597 00:27:16,518 --> 00:27:18,560 on the weekends, run around and hit stuff with it 598 00:27:18,560 --> 00:27:21,320 and see what can happen from a builder's, from a tinkerer's, 599 00:27:21,320 --> 00:27:23,735 from an experimentalist's point of view. 600 00:27:23,735 --> 00:27:27,790 601 00:27:27,790 --> 00:27:30,970 And then the final one is creativity. 602 00:27:30,970 --> 00:27:32,740 This is another huge mega app. 603 00:27:32,740 --> 00:27:36,040 Now, I primarily live in the text world, and so I'm 604 00:27:36,040 --> 00:27:37,840 going to talk about text-based things. 605 00:27:37,840 --> 00:27:42,700 I think, so far, this has mostly been growing in the imagery world 606 00:27:42,700 --> 00:27:45,610 because we're such visual creatures and the images you can generate 607 00:27:45,610 --> 00:27:48,120 are just staggering with AI. 608 00:27:48,120 --> 00:27:49,870 It certainly brings up a lot of questions, 609 00:27:49,870 --> 00:27:53,050 too, around IP and artistic style. 610 00:27:53,050 --> 00:27:57,548 But the template for this, if you're a builder, that we're seeing in the wild 611 00:27:57,548 --> 00:27:58,840 is approximately the following. 612 00:27:58,840 --> 00:28:01,570 And the thing I want to point out is domain knowledge here. 613 00:28:01,570 --> 00:28:03,278 This is really the purpose of this slide, 614 00:28:03,278 --> 00:28:06,740 is to touch on the importance of the domain knowledge. 615 00:28:06,740 --> 00:28:12,960 So many people approximately find the creative process as follows. 616 00:28:12,960 --> 00:28:15,580 Come up with a big idea. 617 00:28:15,580 --> 00:28:18,370 Overgenerate possibilities. 618 00:28:18,370 --> 00:28:21,100 Edit down what you overgenerated. 619 00:28:21,100 --> 00:28:22,210 Repeat. 620 00:28:22,210 --> 00:28:22,870 Right? 621 00:28:22,870 --> 00:28:26,495 Anybody who's been a writer knows, when you write, you write way too much, 622 00:28:26,495 --> 00:28:28,120 and then you have to delete lots of it. 623 00:28:28,120 --> 00:28:30,120 And then you revise, and you write way too much, 624 00:28:30,120 --> 00:28:31,580 and you have to delete lots of it. 625 00:28:31,580 --> 00:28:34,540 This particular task is fantastic for AI. 626 00:28:34,540 --> 00:28:36,868 One of the reasons it's fantastic for AI is 627 00:28:36,868 --> 00:28:38,410 because it allows the AI to be wrong. 628 00:28:38,410 --> 00:28:40,730 You know, you've preagreed you're going to delete lots of it. 
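[Editor's aside: a minimal sketch of that overgenerate-then-edit loop, assuming the pre-1.0 `openai` client. The `n` parameter asks for several candidate completions in one call; the example task and temperature are made up, and the human stays in charge of deciding which candidates survive.]

```python
import os
import openai  # pre-1.0 client

openai.api_key = os.environ["OPENAI_API_KEY"]

def overgenerate(big_idea, n=5):
    """Generate n candidate headlines; a human edits the list down afterwards."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Write one advertising headline for: {big_idea}"}],
        n=n,              # ask for several independent completions
        temperature=1.0,  # more randomness -> more varied candidates
    )
    return [choice.message.content for choice in resp.choices]

for i, candidate in enumerate(overgenerate("a flashcard app for learning chengyu"), 1):
    print(f"{i}. {candidate}")   # the editing-down step is yours
```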
629 00:28:40,730 --> 00:28:42,730 And so if you preagree, "hey, I'm just going 630 00:28:42,730 --> 00:28:46,270 to generate five possibilities of the story I might tell, 631 00:28:46,270 --> 00:28:48,430 five possibilities of the advertising headline, 632 00:28:48,430 --> 00:28:52,512 five possibilities of what I might write my thesis on," 633 00:28:52,512 --> 00:28:54,970 you preagreed it's OK if it's a little long because you are 634 00:28:54,970 --> 00:28:56,860 going to be the editor that steps in. 635 00:28:56,860 --> 00:28:59,890 And here's the thing that you really should bring to the table, 636 00:28:59,890 --> 00:29:02,020 is don't think about this as a technical activity. 637 00:29:02,020 --> 00:29:06,700 Think about this as your opportunity not to put GPT in charge. 638 00:29:06,700 --> 00:29:09,910 Instead, for you to grasp the steering wheel tighter-- 639 00:29:09,910 --> 00:29:11,290 I think, at least-- 640 00:29:11,290 --> 00:29:14,350 in Python or the language you're using to program 641 00:29:14,350 --> 00:29:18,962 because you have the domain knowledge to wield GPT in the generation of those. 642 00:29:18,962 --> 00:29:21,170 So let me show you an example of what I mean by that. 643 00:29:21,170 --> 00:29:27,760 So this is a cool app that someone created for the Writing Atlas Project. 644 00:29:27,760 --> 00:29:31,800 So Writing Atlas is a set of short stories. 645 00:29:31,800 --> 00:29:35,160 And you can think of it as Good Reads for short stories. 646 00:29:35,160 --> 00:29:37,843 So you can go in here, you can browse different stories. 647 00:29:37,843 --> 00:29:40,260 And this was something somebody created where you can type 648 00:29:40,260 --> 00:29:42,252 in a story description that you like. 649 00:29:42,252 --> 00:29:44,460 And this is going to take about a minute to generate, 650 00:29:44,460 --> 00:29:46,252 so I'm going to talk while it's generating. 651 00:29:46,252 --> 00:29:51,253 And while it's working, what it's doing-- 652 00:29:51,253 --> 00:29:52,920 and I'll show you the code in a second-- 653 00:29:52,920 --> 00:29:56,273 is it's searching through the collection of stories for similar stories. 654 00:29:56,273 --> 00:29:58,440 And here's where the domain knowledge part comes in. 655 00:29:58,440 --> 00:30:02,370 Then it uses GPT to look at what it was that you wanted 656 00:30:02,370 --> 00:30:06,120 and use knowledge of how an editor, how a bookseller thinks 657 00:30:06,120 --> 00:30:09,060 to generate a set of suggestions specifically 658 00:30:09,060 --> 00:30:12,240 through the lens of that perspective, with the goal of writing 659 00:30:12,240 --> 00:30:16,200 that beautiful handwritten note that we sometimes see in a local bookstore 660 00:30:16,200 --> 00:30:19,590 tacked on underneath a book. 661 00:30:19,590 --> 00:30:21,940 And so it doesn't just say, "hey, you might like this, 662 00:30:21,940 --> 00:30:25,090 here's a general-purpose reason why you might like this," 663 00:30:25,090 --> 00:30:27,840 but specifically "here's why you might like this," 664 00:30:27,840 --> 00:30:29,550 with respect to what you gave it. 665 00:30:29,550 --> 00:30:32,670 It's either stalling out or it's taking a long time. 666 00:30:32,670 --> 00:30:34,600 Oh, there we go. 667 00:30:34,600 --> 00:30:36,750 So here's its suggestions. 668 00:30:36,750 --> 00:30:40,140 And in particular, these things, these are 669 00:30:40,140 --> 00:30:43,440 things that only a human could know, at least for now. 
670 00:30:43,440 --> 00:30:47,340 Two humans, specifically, the human who said they wanted to read a story-- 671 00:30:47,340 --> 00:30:49,620 that's the text that came in-- and then the human 672 00:30:49,620 --> 00:30:54,030 who added domain knowledge to script a sequence of interactions 673 00:30:54,030 --> 00:30:56,790 with the language model so that you could provide 674 00:30:56,790 --> 00:30:59,550 very targeted reasoning over something that 675 00:30:59,550 --> 00:31:01,510 was informed by that domain knowledge. 676 00:31:01,510 --> 00:31:05,460 So for these utility apps, bring your domain knowledge. 677 00:31:05,460 --> 00:31:09,590 678 00:31:09,590 --> 00:31:12,110 Let me actually show you how this looks in code 679 00:31:12,110 --> 00:31:15,740 because I think it's useful to see how simple and accessible this is. 680 00:31:15,740 --> 00:31:18,320 This is really a set of prompts. 681 00:31:18,320 --> 00:31:22,017 So why might they like a particular location? 682 00:31:22,017 --> 00:31:23,600 Well, here's the prompt that did that. 683 00:31:23,600 --> 00:31:25,780 This is an open source project. 684 00:31:25,780 --> 00:31:27,730 And it has a bunch of examples, and then it 685 00:31:27,730 --> 00:31:31,050 says, well, here's the one that we're interested in. 686 00:31:31,050 --> 00:31:31,980 Here's the audience. 687 00:31:31,980 --> 00:31:35,030 Here's a couple of examples of why might people like a particular thing, 688 00:31:35,030 --> 00:31:36,013 in terms of audience. 689 00:31:36,013 --> 00:31:37,055 It's just another prompt. 690 00:31:37,055 --> 00:31:42,240 691 00:31:42,240 --> 00:31:43,380 Same for topic. 692 00:31:43,380 --> 00:31:44,620 Same for explanation. 693 00:31:44,620 --> 00:31:50,770 And if you go down here and look at how it was done, suggesting the story is-- 694 00:31:50,770 --> 00:31:53,920 what is this, line 174 to line 203-- 695 00:31:53,920 --> 00:31:56,370 it really is-- and again, over and over again, 696 00:31:56,370 --> 00:31:59,120 I want to impress upon you-- this really is within reach. 697 00:31:59,120 --> 00:32:04,360 It's really just, what, 20 odd lines of step one, search 698 00:32:04,360 --> 00:32:06,430 in the database for similar stories. 699 00:32:06,430 --> 00:32:11,020 Step two, given that I have similar stories, pull out the data. 700 00:32:11,020 --> 00:32:16,780 Step three, with my domain knowledge in Python, now run these prompts. 701 00:32:16,780 --> 00:32:19,100 Step four, prepare that into an output. 702 00:32:19,100 --> 00:32:24,550 So the thing we're scripting itself is some approximation of human cognition, 703 00:32:24,550 --> 00:32:26,530 if you're willing to go there metaphorically. 704 00:32:26,530 --> 00:32:28,690 We're not sure-- I'm not going to weigh in 705 00:32:28,690 --> 00:32:36,100 on where we are in the is OpenAI a life form argument. 706 00:32:36,100 --> 00:32:36,970 All right. 707 00:32:36,970 --> 00:32:40,825 One kind of really far out there thing, and then I'll 708 00:32:40,825 --> 00:32:43,450 tie it up for questions, because I know there's probably a lot. 709 00:32:43,450 --> 00:32:47,410 And I also want to make sure you get great pizza in your bellies. 710 00:32:47,410 --> 00:32:52,383 And that is Baby AGI, AutoGPT is what you might 711 00:32:52,383 --> 00:32:53,800 have heard them called on Twitter. 712 00:32:53,800 --> 00:32:56,050 I think of them as multi-step planning bots. 
713 00:32:56,050 --> 00:33:00,940 So everything I showed you so far was approximately one-shot interactions 714 00:33:00,940 --> 00:33:02,510 with GPT. 715 00:33:02,510 --> 00:33:05,170 So this is the user says they want something, 716 00:33:05,170 --> 00:33:10,330 and then either Python mediates interactions with GPT or GPT 717 00:33:10,330 --> 00:33:13,600 itself does some things with the inflection of a personality 718 00:33:13,600 --> 00:33:16,240 that you've added from some prompt engineering. 719 00:33:16,240 --> 00:33:17,830 Really useful. 720 00:33:17,830 --> 00:33:19,150 Pretty easy to control. 721 00:33:19,150 --> 00:33:22,150 If you want to go to production, if you want to build a weekend project, 722 00:33:22,150 --> 00:33:26,370 if you want to build a company, that's a great way to do it right now. 723 00:33:26,370 --> 00:33:27,993 This is wild. 724 00:33:27,993 --> 00:33:29,910 And if you haven't seen this stuff on Twitter, 725 00:33:29,910 --> 00:33:32,077 I would definitely recommend going to search for it. 726 00:33:32,077 --> 00:33:33,720 This is what happens-- 727 00:33:33,720 --> 00:33:37,740 the simple way to put it is-- if you put GPT in a for loop. 728 00:33:37,740 --> 00:33:42,040 If you let GPT talk to itself and then tell itself what to do. 729 00:33:42,040 --> 00:33:46,530 So it's an emergent behavior. 730 00:33:46,530 --> 00:33:49,660 And like all emergent behaviors, it starts with a few simple steps. 731 00:33:49,660 --> 00:33:53,490 In Conway's Game of Life, many elements of reality 732 00:33:53,490 --> 00:33:56,340 turn out to be math equations that fit on a t-shirt, 733 00:33:56,340 --> 00:33:59,280 but then when you play them forward in time, they generate DNA. 734 00:33:59,280 --> 00:34:00,910 They generate human life. 735 00:34:00,910 --> 00:34:07,180 So this is approximately, step one, take a human objective. 736 00:34:07,180 --> 00:34:11,199 Step two, your first task is to write yourself a list of steps. 737 00:34:11,199 --> 00:34:12,639 And here's the critical part-- 738 00:34:12,639 --> 00:34:14,159 repeat. 739 00:34:14,159 --> 00:34:16,000 Now do the list of steps. 740 00:34:16,000 --> 00:34:20,080 Now, you have to embody your agent with the ability to do things. 741 00:34:20,080 --> 00:34:22,949 So it's really only limited to do what you give it the tools to do 742 00:34:22,949 --> 00:34:24,540 and what it has the skills to do. 743 00:34:24,540 --> 00:34:27,570 So obviously, this is still very much a set 744 00:34:27,570 --> 00:34:29,505 of experiments that are running right now. 745 00:34:29,505 --> 00:34:32,130 But it's something that we'll see unfold over the coming years. 746 00:34:32,130 --> 00:34:34,170 And this is the scenario in which Python stops 747 00:34:34,170 --> 00:34:37,139 becoming so important because we've given it the ability to actually 748 00:34:37,139 --> 00:34:39,210 self-direct what it's doing. 749 00:34:39,210 --> 00:34:41,094 And then it finally gives you a result. 750 00:34:41,094 --> 00:34:44,219 And I want to give you an example still of just, again, impressing upon you 751 00:34:44,219 --> 00:34:48,090 how much of this is prompt engineering, which is wild, how little code this is. 752 00:34:48,090 --> 00:34:53,670 Let me show you what Baby AGI looks like. 753 00:34:53,670 --> 00:34:57,560 So here is a Baby AGI that you can connect to Telegram. 754 00:34:57,560 --> 00:35:01,150 755 00:35:01,150 --> 00:35:03,512 And this is an agent that has two tools. 
756 00:35:03,512 --> 00:35:05,470 So I haven't explained to you what an agent is. 757 00:35:05,470 --> 00:35:07,150 I haven't explained to you what tools are. 758 00:35:07,150 --> 00:35:09,150 I'll give you a quick, one-sentence description. 759 00:35:09,150 --> 00:35:14,770 An agent is just a word to mean GPT plus some bigger body in which it's living. 760 00:35:14,770 --> 00:35:16,240 Maybe that body has a personality. 761 00:35:16,240 --> 00:35:17,080 Maybe it has tools. 762 00:35:17,080 --> 00:35:19,990 Maybe it has Python mediating its experience with other things. 763 00:35:19,990 --> 00:35:23,980 Tools are simply ways in which the agent can choose to do things. 764 00:35:23,980 --> 00:35:26,440 Like, imagine if GPT could say, "order a pizza." 765 00:35:26,440 --> 00:35:28,780 And instead of you seeing the text "order a pizza," 766 00:35:28,780 --> 00:35:30,250 that caused a pizza to be ordered. 767 00:35:30,250 --> 00:35:32,080 That's a tool. 768 00:35:32,080 --> 00:35:33,330 So these are two tools it has. 769 00:35:33,330 --> 00:35:35,190 One tool is generate a to-do list. 770 00:35:35,190 --> 00:35:38,025 One tool is do a search on the web. 771 00:35:38,025 --> 00:35:42,240 772 00:35:42,240 --> 00:35:46,570 And then down here, it has a prompt saying, "hey, 773 00:35:46,570 --> 00:35:50,010 your goal is to build a task list and then do that task list." 774 00:35:50,010 --> 00:35:53,300 And then this is just placed into a harness that does it over and over 775 00:35:53,300 --> 00:35:53,800 again. 776 00:35:53,800 --> 00:35:56,940 So after the next task, kind of on cue, the results of that task. 777 00:35:56,940 --> 00:35:58,810 And keep it going. 778 00:35:58,810 --> 00:36:01,830 And so in doing that, you get this kickstarted loop, 779 00:36:01,830 --> 00:36:05,790 where, essentially, you kickstart it and then the agent is talking to itself. 780 00:36:05,790 --> 00:36:07,270 Talking to itself. 781 00:36:07,270 --> 00:36:10,948 So this, unless I'm wrong, I don't think this has yet reached production, 782 00:36:10,948 --> 00:36:12,990 in terms of what we're seeing in the field of how 783 00:36:12,990 --> 00:36:14,440 people are deploying software. 784 00:36:14,440 --> 00:36:17,885 But if you want to dive into the wildest part of experimentation, 785 00:36:17,885 --> 00:36:20,010 this is definitely one of the places you can start. 786 00:36:20,010 --> 00:36:21,810 And it's really within reach. 787 00:36:21,810 --> 00:36:25,462 All you have to do is download one of the starter projects for it. 788 00:36:25,462 --> 00:36:27,420 And you can kind of see right in the prompting, 789 00:36:27,420 --> 00:36:31,180 here's how you kickstart that process of iteration. 790 00:36:31,180 --> 00:36:37,880 791 00:36:37,880 --> 00:36:40,310 All right, so I know that was super high-level. 792 00:36:40,310 --> 00:36:41,557 I hope it was useful. 793 00:36:41,557 --> 00:36:44,390 It's, I think, from the field, from the bottoms up what we're seeing 794 00:36:44,390 --> 00:36:48,740 and what people are building, kind of these high-level categories of apps 795 00:36:48,740 --> 00:36:51,050 that people are making, all of these apps 796 00:36:51,050 --> 00:36:52,980 are apps that are within reach to everybody, 797 00:36:52,980 --> 00:36:54,770 which is really, really exciting. 798 00:36:54,770 --> 00:36:59,750 And I suggest Twitter as a great place to hang out and build things. 799 00:36:59,750 --> 00:37:02,700 There's a lot of AI builders on Twitter publishing. 
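As a rough companion to the multi-step planning bots just described, here is a minimal Python sketch of the "GPT in a for loop" pattern: take a human objective, have the model write itself a task list, then repeatedly execute the tasks with a small set of tools. It is not the actual Baby AGI or AutoGPT code; the prompts are simplified, the web-search tool is a stub, and the pre-1.0 openai client call is an assumption.

```python
# Illustrative sketch only -- not the actual Baby AGI / AutoGPT code.
# Assumes the pre-1.0 openai Python client; the web-search tool is a stub.
import openai

def complete(prompt: str) -> str:
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp["choices"][0]["message"]["content"]

# Tool 1: have the model write itself a to-do list.
def make_todo_list(objective: str) -> list:
    plan = complete(
        f"Objective: {objective}\n"
        "Write a short numbered list of concrete steps to achieve it, one per line."
    )
    return [line.strip() for line in plan.splitlines() if line.strip()]

# Tool 2: a web search (stubbed out here; plug in a real search API).
def web_search(query: str) -> str:
    return f"(stub) top results for: {query}"

def run_agent(objective: str, max_steps: int = 5) -> str:
    results = []
    tasks = make_todo_list(objective)          # step 1: plan
    for task in tasks[:max_steps]:             # step 2: repeat -- do the steps
        action = complete(
            f"Objective: {objective}\nCurrent task: {task}\n"
            "If a web search would help, reply exactly 'SEARCH: <query>'. "
            "Otherwise, do the task and reply with the result."
        )
        if action.startswith("SEARCH:"):
            results.append(web_search(action[len("SEARCH:"):].strip()))
        else:
            results.append(action)
    return "\n".join(results)                  # finally, report back
```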
800 00:37:02,700 --> 00:37:05,810 And I think we've got a couple of minutes before pizza is arriving. 801 00:37:05,810 --> 00:37:06,660 Maybe 10 minutes. 802 00:37:06,660 --> 00:37:07,380 Keep on going? 803 00:37:07,380 --> 00:37:07,880 Oh. 804 00:37:07,880 --> 00:37:11,210 So if there's any questions, why don't we kick it to that, because I'm sure 805 00:37:11,210 --> 00:37:13,740 there's some questions that you all have. 806 00:37:13,740 --> 00:37:14,970 I guess we ended a little early. 807 00:37:14,970 --> 00:37:15,830 Yes. 808 00:37:15,830 --> 00:37:18,740 AUDIENCE: Yeah, so I have a question about hallucinations. 809 00:37:18,740 --> 00:37:22,800 And so when you're building these sorts of applications in the apps, 810 00:37:22,800 --> 00:37:24,420 for example, let's say-- 811 00:37:24,420 --> 00:37:27,200 I'm giving you, like, a physics problem, from a [? pset, ?] 812 00:37:27,200 --> 00:37:28,380 and we want to do that. 813 00:37:28,380 --> 00:37:29,180 TED BENSON: Yeah. 814 00:37:29,180 --> 00:37:32,702 AUDIENCE: And it's, 40% of the time, just wrong. 815 00:37:32,702 --> 00:37:33,410 TED BENSON: Yeah. 816 00:37:33,410 --> 00:37:35,600 AUDIENCE: Do you have any actionable recommendations 817 00:37:35,600 --> 00:37:38,750 that developers should be doing to make it hallucinate any less? 818 00:37:38,750 --> 00:37:41,870 Or maybe even things that OpenAI on the back end 819 00:37:41,870 --> 00:37:43,850 should be doing to reduce hallucinations? 820 00:37:43,850 --> 00:37:47,670 Would it be something where you use RLHF? 821 00:37:47,670 --> 00:37:49,170 Any thoughts there? 822 00:37:49,170 --> 00:37:52,310 TED BENSON: So the question was how-- approximately, how do you 823 00:37:52,310 --> 00:37:53,810 manage the hallucination problem? 824 00:37:53,810 --> 00:37:58,830 If you give it a physics lecture, and you ask it a question, on the one hand, 825 00:37:58,830 --> 00:38:00,680 it appears to be answering you correctly. 826 00:38:00,680 --> 00:38:05,060 On the other hand, it appears to be wrong to an expert's eye 40% 827 00:38:05,060 --> 00:38:07,280 of the time, 70% of the time, 10% of the time. 828 00:38:07,280 --> 00:38:08,490 It's a huge problem. 829 00:38:08,490 --> 00:38:10,580 And then what are some ways as developers, 830 00:38:10,580 --> 00:38:13,160 practically, you can use to mitigate that? 831 00:38:13,160 --> 00:38:14,040 I'll give an answer. 832 00:38:14,040 --> 00:38:15,832 Sil, you may have some specific things too. 833 00:38:15,832 --> 00:38:17,582 So one high-level answer is the same thing 834 00:38:17,582 --> 00:38:19,540 that makes these things capable of synthesizing 835 00:38:19,540 --> 00:38:22,130 information is part of the reason why it hallucinates for you. 836 00:38:22,130 --> 00:38:25,200 So it's hard to have your cake and eat it too, to a certain extent. 837 00:38:25,200 --> 00:38:26,700 So this is part of the game. 838 00:38:26,700 --> 00:38:28,340 In fact, humans do it too. 839 00:38:28,340 --> 00:38:32,942 People talk about just folks who are too aggressive in their assumptions 840 00:38:32,942 --> 00:38:33,650 about knowledge-- 841 00:38:33,650 --> 00:38:35,630 I can't remember the name for that phenomenon-- 842 00:38:35,630 --> 00:38:36,797 where you'll just say stuff. 843 00:38:36,797 --> 00:38:38,480 So we do it too.
844 00:38:38,480 --> 00:38:40,762 Some things you can do are-- 845 00:38:40,762 --> 00:38:43,220 kind of a range of activities-- depending on how much money 846 00:38:43,220 --> 00:38:45,845 you're willing to spend, how much technical expertise you have, 847 00:38:45,845 --> 00:38:48,620 it can range from fine-tuning a model, to practically-- 848 00:38:48,620 --> 00:38:51,710 I'm in the applied world, so I'm very much in a world of duct tape 849 00:38:51,710 --> 00:38:53,132 and how developers get stuff done. 850 00:38:53,132 --> 00:38:54,090 So some of the answers I'll give you 851 00:38:54,090 --> 00:38:56,173 are sort of very duct-tapey answers. 852 00:38:56,173 --> 00:38:59,060 Giving it examples tends to work for acute things. 853 00:38:59,060 --> 00:39:03,090 If it's behaving in wild ways, the more examples you give it, the better. 854 00:39:03,090 --> 00:39:05,635 That's not going to solve the domain of all of physics. 855 00:39:05,635 --> 00:39:07,760 So for the domain of all of physics, I'm going to-- 856 00:39:07,760 --> 00:39:09,140 I'm going to bail and give it to you because I 857 00:39:09,140 --> 00:39:11,515 think you are far more equipped than me to speak on that. 858 00:39:11,515 --> 00:39:14,730 SIL HAMILTON: Sure, so the model doesn't have a ground truth. 859 00:39:14,730 --> 00:39:16,010 It doesn't know anything. 860 00:39:16,010 --> 00:39:19,010 Any sense of meaning that is derived from the training process 861 00:39:19,010 --> 00:39:21,920 is purely out of differentiation. 862 00:39:21,920 --> 00:39:23,430 One word is not another word. 863 00:39:23,430 --> 00:39:27,905 Words are not used in the same contexts. It understands everything 864 00:39:27,905 --> 00:39:29,780 only through examples given through language. 865 00:39:29,780 --> 00:39:32,570 It's like someone who learned English or how to speak, 866 00:39:32,570 --> 00:39:34,670 but they grew up in a featureless, gray room. 867 00:39:34,670 --> 00:39:36,360 They've never seen the outside world. 868 00:39:36,360 --> 00:39:39,152 They have nothing to rest on that tells them that something is true 869 00:39:39,152 --> 00:39:40,800 and something is not true. 870 00:39:40,800 --> 00:39:43,850 So from the model's perspective, everything that it says is true. 871 00:39:43,850 --> 00:39:46,460 It's trying its best to give you the best answer possible. 872 00:39:46,460 --> 00:39:50,810 And if lying a little bit or conflating two different topics 873 00:39:50,810 --> 00:39:53,587 is the best way to achieve that, then it will decide to do so. 874 00:39:53,587 --> 00:39:54,920 It's a part of the architecture. 875 00:39:54,920 --> 00:39:56,120 We can't get around it. 876 00:39:56,120 --> 00:39:59,690 There are a number of cheap tricks that surprisingly get 877 00:39:59,690 --> 00:40:01,868 it to confabulate or hallucinate less. 878 00:40:01,868 --> 00:40:04,910 One of them includes-- recently, there was a paper that's a little funny. 879 00:40:04,910 --> 00:40:09,800 If you get it to prepend to its answer, "My best guess is," 880 00:40:09,800 --> 00:40:14,400 that will actually reduce hallucinations by about 80%. 881 00:40:14,400 --> 00:40:16,730 So clearly, it has some sense that some things are true 882 00:40:16,730 --> 00:40:19,397 and other things are not, but we're not quite sure what that is.
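A minimal sketch of the two cheap prompt-level tricks just mentioned -- a few worked examples plus a "My best guess is" prefix -- follows. It assumes the pre-1.0 openai Python client; the example questions and exact wording are illustrative, and this reduces hallucinations rather than eliminating them.

```python
# Illustrative sketch only: two cheap mitigations -- a few worked examples
# and a "My best guess is" prefix. Assumes the pre-1.0 openai Python client.
import openai

def careful_answer(question: str) -> str:
    prompt = (
        "Answer the question. If you are unsure, say so.\n"
        "Q: What is the SI unit of force?\n"
        "A: My best guess is: the newton.\n"
        "Q: In a vacuum, does a heavier object fall faster?\n"
        "A: My best guess is: no, both fall at the same rate.\n"
        f"Q: {question}\n"
        "A: My best guess is:"
    )
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return "My best guess is: " + resp["choices"][0]["message"]["content"].strip()
```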
883 00:40:19,397 --> 00:40:22,070 To add on to what Ted was saying, a few cheap things you can do, 884 00:40:22,070 --> 00:40:26,060 include letting it Google, or Bing-- as in Bing Chat, what they're 885 00:40:26,060 --> 00:40:28,250 doing-- it cites this information. 886 00:40:28,250 --> 00:40:31,430 Asking it to make sure its own response is good. 887 00:40:31,430 --> 00:40:34,700 If you've ever had ChatGPT generate a program-- 888 00:40:34,700 --> 00:40:37,130 there's some kind of problem, and you ask ChatGPT-- 889 00:40:37,130 --> 00:40:38,390 I think there's a mistake. 890 00:40:38,390 --> 00:40:40,820 Often, it'll locate the mistake itself. 891 00:40:40,820 --> 00:40:43,478 Why it didn't produce the right answer at the very beginning, 892 00:40:43,478 --> 00:40:45,770 we're still not sure, but we're moving in the direction 893 00:40:45,770 --> 00:40:46,895 of reducing hallucinations. 894 00:40:46,895 --> 00:40:49,130 Now with respect to physics, you're going 895 00:40:49,130 --> 00:40:53,810 to have to give it an external database to rest on because internally 896 00:40:53,810 --> 00:40:56,870 for really, really domain-specific knowledge, 897 00:40:56,870 --> 00:41:02,330 it's not going to be as deterministic as one would like. 898 00:41:02,330 --> 00:41:04,130 These things work in continuous spaces. 899 00:41:04,130 --> 00:41:07,650 These things, they don't know what is wrong, what is true. 900 00:41:07,650 --> 00:41:10,310 And as a result, we have to give it tools. 901 00:41:10,310 --> 00:41:14,427 So everything that Ted demoed today is really 902 00:41:14,427 --> 00:41:17,260 striving at reducing hallucinations, actually, really, and giving it 903 00:41:17,260 --> 00:41:18,160 more abilities. 904 00:41:18,160 --> 00:41:20,415 I hope that answers your question. 905 00:41:20,415 --> 00:41:21,790 TED BENSON: One of the ways too-- 906 00:41:21,790 --> 00:41:22,960 I'm a simple guy. 907 00:41:22,960 --> 00:41:26,680 I tend to think that all of the world tends to be just a few things 908 00:41:26,680 --> 00:41:28,000 repeated over and over again. 909 00:41:28,000 --> 00:41:29,710 And we have human systems for this. 910 00:41:29,710 --> 00:41:33,370 In a team, like companies work-- or a team playing sport, 911 00:41:33,370 --> 00:41:36,020 and we're not right all the time, even when we aspire to be. 912 00:41:36,020 --> 00:41:38,980 And so we have systems that we've developed as humans 913 00:41:38,980 --> 00:41:41,050 to deal with things that may be wrong. 914 00:41:41,050 --> 00:41:43,810 So human number one proposes an answer. 915 00:41:43,810 --> 00:41:45,700 Human number two checks their work. 916 00:41:45,700 --> 00:41:48,430 Human number three provides the final sign off. 917 00:41:48,430 --> 00:41:49,360 This is really common. 918 00:41:49,360 --> 00:41:51,860 Anybody who's worked in a company has seen this in practice. 919 00:41:51,860 --> 00:41:55,210 The interesting thing about the state of software right now, 920 00:41:55,210 --> 00:41:57,520 we tend to be in this mode in which we're just 921 00:41:57,520 --> 00:41:59,860 talking to GPT as one entity. 
922 00:41:59,860 --> 00:42:03,470 But once we start thinking in terms of teams, so to speak, 923 00:42:03,470 --> 00:42:06,760 where each team member is its own agent with its own set of objectives 924 00:42:06,760 --> 00:42:09,430 and skills, I suspect we're going to start 925 00:42:09,430 --> 00:42:11,860 seeing a programming model in which the way to solve this 926 00:42:11,860 --> 00:42:17,380 might not necessarily be "make a single brain smarter," but instead be "draw 927 00:42:17,380 --> 00:42:20,770 upon the collective intelligence of multiple software agents, 928 00:42:20,770 --> 00:42:21,980 each playing a role." 929 00:42:21,980 --> 00:42:25,750 And I think that that would certainly follow the human pattern of how 930 00:42:25,750 --> 00:42:26,530 we deal with this. 931 00:42:26,530 --> 00:42:29,140 SIL HAMILTON: To give it an analogy, space shuttles, 932 00:42:29,140 --> 00:42:33,087 things that go into space, spacecraft, they have to be good. 933 00:42:33,087 --> 00:42:34,420 If they're not good, people die. 934 00:42:34,420 --> 00:42:38,000 They have no margin for error at all. 935 00:42:38,000 --> 00:42:40,480 And as a result, we overengineer those systems. 936 00:42:40,480 --> 00:42:42,640 Most spacecraft have three computers. 937 00:42:42,640 --> 00:42:46,690 And they all have to agree in unison on a particular step to go forward. 938 00:42:46,690 --> 00:42:49,910 If one does not agree, then they recalculate, they recalculate, 939 00:42:49,910 --> 00:42:51,910 they recalculate until they arrive at something. 940 00:42:51,910 --> 00:42:54,850 But the good thing is that hallucinations are generally not 941 00:42:54,850 --> 00:42:57,220 a systemic problem in terms of its knowledge. 942 00:42:57,220 --> 00:42:58,720 It's often a one-off. 943 00:42:58,720 --> 00:43:02,025 The model or something tripped it up, and it just produced a hallucination 944 00:43:02,025 --> 00:43:02,900 in that one instance. 945 00:43:02,900 --> 00:43:06,070 So if there's three models working in unison, just as Ted is saying, 946 00:43:06,070 --> 00:43:10,850 that will, generally speaking, improve your success. 947 00:43:10,850 --> 00:43:11,665 Yes, sir. 948 00:43:11,665 --> 00:43:14,290 AUDIENCE: A number of the examples you showed have assertions like 949 00:43:14,290 --> 00:43:16,450 "you are an engineer," "you are an AI," 950 00:43:16,450 --> 00:43:17,242 "you are a teacher." 951 00:43:17,242 --> 00:43:17,950 TED BENSON: Yeah. 952 00:43:17,950 --> 00:43:20,800 AUDIENCE: What's the mechanism by which that influences 953 00:43:20,800 --> 00:43:22,952 this computation of probabilities? 954 00:43:22,952 --> 00:43:23,660 TED BENSON: Sure. 955 00:43:23,660 --> 00:43:26,830 I'm going to give you what might be an unsatisfying answer, which 956 00:43:26,830 --> 00:43:28,060 is it tends to work. 957 00:43:28,060 --> 00:43:30,160 But I think we know why it tends to work. 958 00:43:30,160 --> 00:43:32,170 And again, it's because these language models 959 00:43:32,170 --> 00:43:34,100 approximate how we talk to each other. 960 00:43:34,100 --> 00:43:36,400 So if I were to say to you-- hey, help me out. 961 00:43:36,400 --> 00:43:38,865 I need you to mock interview me. 962 00:43:38,865 --> 00:43:40,990 That's a direct statement I can make that kicks you 963 00:43:40,990 --> 00:43:42,520 into a certain mode of interaction. 964 00:43:42,520 --> 00:43:44,830 Or if I say to you-- help me out. 965 00:43:44,830 --> 00:43:46,720 I'm trying to apologize to my wife. 966 00:43:46,720 --> 00:43:47,800 She's really mad at me.
967 00:43:47,800 --> 00:43:49,100 Can you role play with me? 968 00:43:49,100 --> 00:43:51,100 That kicks you into another mode of interaction. 969 00:43:51,100 --> 00:43:53,590 And so it's really just a shorthand that people 970 00:43:53,590 --> 00:43:55,990 have found to kick the agent in-- to kick 971 00:43:55,990 --> 00:43:58,600 the LLM into a certain mode of interaction that 972 00:43:58,600 --> 00:44:01,450 tends to work in the way that I, as a software developer, 973 00:44:01,450 --> 00:44:03,510 am hoping it would work. 974 00:44:03,510 --> 00:44:06,270 SIL HAMILTON: And to really quickly add on to that. 975 00:44:06,270 --> 00:44:08,958 Being in the digital humanities as I am, 976 00:44:08,958 --> 00:44:10,500 I like to think of it as a narrative. 977 00:44:10,500 --> 00:44:13,470 A narrative will have a few different characters talking to each other. 978 00:44:13,470 --> 00:44:15,150 Their roles are clearly defined. 979 00:44:15,150 --> 00:44:17,620 Two people are not the same. 980 00:44:17,620 --> 00:44:20,340 This interaction with GPT, it assumes a personality. 981 00:44:20,340 --> 00:44:21,960 It can simulate personalities. 982 00:44:21,960 --> 00:44:26,080 It, itself, is not conscious in any way, but it can certainly 983 00:44:26,080 --> 00:44:29,660 predict what a conscious being would react like in a particular situation. 984 00:44:29,660 --> 00:44:34,150 So when we're going "you are X," it is drawing up that personality 985 00:44:34,150 --> 00:44:35,920 and talking as though it is that person. 986 00:44:35,920 --> 00:44:38,260 Because it is like completing a transcript 987 00:44:38,260 --> 00:44:42,610 or completing a story in which that character is present, and interacting, 988 00:44:42,610 --> 00:44:44,530 and is active. 989 00:44:44,530 --> 00:44:45,473 So, yeah. 990 00:44:45,473 --> 00:44:48,390 TED BENSON: I think we got about five minutes until the pizza outside. 991 00:44:48,390 --> 00:44:49,080 SPEAKER 1: Eight minutes. 992 00:44:49,080 --> 00:44:50,163 TED BENSON: Eight minutes. 993 00:44:50,163 --> 00:44:53,080 994 00:44:53,080 --> 00:44:55,150 Yes, sir. 995 00:44:55,150 --> 00:44:59,680 AUDIENCE: So I'm not a CS person, but it's been fun playing with this. 996 00:44:59,680 --> 00:45:04,370 And I understand the word-by-word generation and the vibe. 997 00:45:04,370 --> 00:45:07,385 The feeling of it in the narrative. 998 00:45:07,385 --> 00:45:09,260 Some of my friends and I have tried giving it 999 00:45:09,260 --> 00:45:15,040 logic problems, like things from the LSAT, for example, and it doesn't work. 1000 00:45:15,040 --> 00:45:16,870 And I'm just wondering why that would be. 1001 00:45:16,870 --> 00:45:20,710 So it will generate answers that sound very plausible 1002 00:45:20,710 --> 00:45:24,340 rhetorically-- like given this condition, x, given this, it will be y-- 1003 00:45:24,340 --> 00:45:28,420 but it will often even contradict itself in its answers. 1004 00:45:28,420 --> 00:45:30,890 But it's almost never correct. 1005 00:45:30,890 --> 00:45:33,850 So I was wondering why that would be? 1006 00:45:33,850 --> 00:45:35,820 Like, it just can't reason? 1007 00:45:35,820 --> 00:45:37,820 It can't think? 1008 00:45:37,820 --> 00:45:41,448 And can you-- would we get to a place where it can, so to speak? 1009 00:45:41,448 --> 00:45:43,990 I mean not-- you know what I mean, I don't mean to think like 1010 00:45:43,990 --> 00:45:46,160 it's conscious, I mean have thoughts-- 1011 00:45:46,160 --> 00:45:46,660 [INAUDIBLE]?
1012 00:45:46,660 --> 00:45:47,440 TED BENSON: You want to react to that? 1013 00:45:47,440 --> 00:45:49,273 AUDIENCE: I don't know how else to say that. 1014 00:45:49,273 --> 00:45:50,500 TED BENSON: So GPT-4-- 1015 00:45:50,500 --> 00:45:55,300 when GPT-4 was released back in March, I think it was, it was passing the LSAT. 1016 00:45:55,300 --> 00:45:56,020 AUDIENCE: It was? 1017 00:45:56,020 --> 00:45:56,860 TED BENSON: It was, yeah. 1018 00:45:56,860 --> 00:45:57,310 AUDIENCE: [INAUDIBLE] 1019 00:45:57,310 --> 00:45:57,730 TED BENSON: Yes. 1020 00:45:57,730 --> 00:45:58,330 AUDIENCE: [INAUDIBLE] 1021 00:45:58,330 --> 00:46:00,370 TED BENSON: Yes, it just passed, as I understand it. 1022 00:46:00,370 --> 00:46:02,570 AUDIENCE: Maybe it's because we're not using [INAUDIBLE]. 1023 00:46:02,570 --> 00:46:04,300 TED BENSON: That's one of the weird things, is that-- 1024 00:46:04,300 --> 00:46:04,990 AUDIENCE: ChatGPT. 1025 00:46:04,990 --> 00:46:05,740 TED BENSON: Yeah-- 1026 00:46:05,740 --> 00:46:06,730 AUDIENCE: [INAUDIBLE] 1027 00:46:06,730 --> 00:46:09,938 TED BENSON: If you pay for ChatGPT, they give you access to the better model. 1028 00:46:09,938 --> 00:46:13,660 And one of the interesting things with it is prompting. 1029 00:46:13,660 --> 00:46:15,340 It's so finicky. 1030 00:46:15,340 --> 00:46:18,310 It's very sensitive to the way that you prompt. 1031 00:46:18,310 --> 00:46:21,910 Earlier on, when GPT-3 came out, some people were going, look, 1032 00:46:21,910 --> 00:46:25,210 it can pass literacy tests, or no, it can't pass literacy tests. 1033 00:46:25,210 --> 00:46:28,060 And then people who were pro- or anti-GPT would be like, 1034 00:46:28,060 --> 00:46:31,300 I modified the prompt a little bit, suddenly it can or suddenly it can't. 1035 00:46:31,300 --> 00:46:33,430 These things are not conscious. 1036 00:46:33,430 --> 00:46:35,950 Their ability to reason is like an alien's. 1037 00:46:35,950 --> 00:46:36,602 They're not us. 1038 00:46:36,602 --> 00:46:37,810 They don't think like people. 1039 00:46:37,810 --> 00:46:38,800 They're not human. 1040 00:46:38,800 --> 00:46:43,210 But they certainly are capable of passing some things empirically, which 1041 00:46:43,210 --> 00:46:46,540 demonstrates some sort of rationale or logic within the model. 1042 00:46:46,540 --> 00:46:49,750 But we're still slowly figuring out, like a prompt whisperer, 1043 00:46:49,750 --> 00:46:51,340 what exactly the right approach is. 1044 00:46:51,340 --> 00:46:56,170 1045 00:46:56,170 --> 00:47:01,900 AUDIENCE: Obviously, having GPT running and prompting it continuously 1046 00:47:01,900 --> 00:47:04,670 is very expensive in terms of GPU. 1047 00:47:04,670 --> 00:47:10,726 How do you see instances where it creates some sort of business value 1048 00:47:10,726 --> 00:47:12,470 in a startup or a company? 1049 00:47:12,470 --> 00:47:18,184 Was there a real added value having these little AI apps 1050 00:47:18,184 --> 00:47:22,540 in terms of [INAUDIBLE]? 1051 00:47:22,540 --> 00:47:24,960 TED BENSON: Yeah, we host companies on top of us, 1052 00:47:24,960 --> 00:47:27,400 where that's their primary product. 1053 00:47:27,400 --> 00:47:31,480 The value that it adds is, like any company-- 1054 00:47:31,480 --> 00:47:33,422 I mean, what is the Y Combinator motto-- 1055 00:47:33,422 --> 00:47:34,630 "Make something people want." 1056 00:47:34,630 --> 00:47:39,430 I mean, I wouldn't think of this as GPT inherently provides value for you 1057 00:47:39,430 --> 00:47:40,205 as a builder.
1058 00:47:40,205 --> 00:47:41,080 That's their product. 1059 00:47:41,080 --> 00:47:42,160 That's OpenAI's product. 1060 00:47:42,160 --> 00:47:44,920 You pay ChatGPT for prioritized access. 1061 00:47:44,920 --> 00:47:48,670 Where your product might be is how you take that and combine it 1062 00:47:48,670 --> 00:47:53,410 with your data, somebody else's data, some domain knowledge, some interface 1063 00:47:53,410 --> 00:47:56,740 that then helps apply it to something. 1064 00:47:56,740 --> 00:47:58,100 Two things are both true. 1065 00:47:58,100 --> 00:48:00,910 There are a lot of experiments going on right now, 1066 00:48:00,910 --> 00:48:05,500 both for fun and people trying to figure out where the economic value is. 1067 00:48:05,500 --> 00:48:08,440 But folks are also spinning up companies that are 100% supported 1068 00:48:08,440 --> 00:48:10,063 by applying this to data. 1069 00:48:10,063 --> 00:48:10,605 AUDIENCE: OK. 1070 00:48:10,605 --> 00:48:14,150 For a company that wouldn't have-- 1071 00:48:14,150 --> 00:48:19,450 wouldn't be AI-focused [INAUDIBLE], just using or developing 1072 00:48:19,450 --> 00:48:24,770 in-house apps that use GPT for productivity. 1073 00:48:24,770 --> 00:48:29,990 TED BENSON: I think that it is likely that today we call this GPT, 1074 00:48:29,990 --> 00:48:32,000 and today we call these LLMs, and tomorrow it 1075 00:48:32,000 --> 00:48:33,485 will just slide into the ether. 1076 00:48:33,485 --> 00:48:36,110 Imagine what the-- imagine what the progression is going to be. 1077 00:48:36,110 --> 00:48:39,250 Today, there's one of these that people are primarily playing with. 1078 00:48:39,250 --> 00:48:42,500 There's many of them that exist, but one that people are primarily building on top of. 1079 00:48:42,500 --> 00:48:45,440 Tomorrow, we can expect that there will be many of them. 1080 00:48:45,440 --> 00:48:48,440 And the day after that, we can expect they're going to be on our phones, 1081 00:48:48,440 --> 00:48:50,898 and they're not even going to be connected to the internet. 1082 00:48:50,898 --> 00:48:54,380 And for that reason, I think that-- like today, 1083 00:48:54,380 --> 00:48:57,620 we don't call our software microprocessor tools or microprocessor 1084 00:48:57,620 --> 00:48:59,960 apps, like the processor just exists-- 1085 00:48:59,960 --> 00:49:03,720 I think that one useful model, five years out, 1086 00:49:03,720 --> 00:49:07,730 10 years out is to-- even if it's only metaphorically true and not literally 1087 00:49:07,730 --> 00:49:08,300 true-- 1088 00:49:08,300 --> 00:49:12,110 I think it's useful to think of this as a second processor. 1089 00:49:12,110 --> 00:49:15,710 We had this before with floating-point co-processors and graphics 1090 00:49:15,710 --> 00:49:19,400 co-processors already, as recently as the '90s, where 1091 00:49:19,400 --> 00:49:22,640 it's useful to think of the trajectory of this as just another thing 1092 00:49:22,640 --> 00:49:25,670 that computers do, can do, and it will be incorporated 1093 00:49:25,670 --> 00:49:27,050 into absolutely everything. 1094 00:49:27,050 --> 00:49:31,310 SIL HAMILTON: Hence, the term foundation model, which also crops up. 1095 00:49:31,310 --> 00:49:33,890 So the pizza's ready? 1096 00:49:33,890 --> 00:49:34,937 One more question. 1097 00:49:34,937 --> 00:49:37,520 TED BENSON: Maybe one more and then we'll break for some food. 1098 00:49:37,520 --> 00:49:40,530 1099 00:49:40,530 --> 00:49:41,820 In the glasses right there.
1100 00:49:41,820 --> 00:49:46,550 AUDIENCE: Do you have recommendations for the-- 1101 00:49:46,550 --> 00:49:49,968 TED BENSON: Sorry, I was just being told we need to get two more. 1102 00:49:49,968 --> 00:49:52,010 AUDIENCE: Do you have any recommendations for how 1103 00:49:52,010 --> 00:49:56,418 ChatGPT will [INAUDIBLE] structure of data, like JSON, for example, 1104 00:49:56,418 --> 00:49:57,903 [INAUDIBLE]. 1105 00:49:57,903 --> 00:50:00,070 TED BENSON: It's hard to get it to do that reliably. 1106 00:50:00,070 --> 00:50:02,380 It's incredibly useful to get it to do reliably. 1107 00:50:02,380 --> 00:50:05,680 So some tricks you can use are you can give it examples. 1108 00:50:05,680 --> 00:50:08,470 You can just ask it directly. 1109 00:50:08,470 --> 00:50:10,630 Those are two common tricks. 1110 00:50:10,630 --> 00:50:13,390 And look at the prompts that others have used to work. 1111 00:50:13,390 --> 00:50:16,540 There's a lot of art to finding the right prompt right now. 1112 00:50:16,540 --> 00:50:19,210 A lot of it is magic incantation. 1113 00:50:19,210 --> 00:50:23,800 Another thing you can do is post process it so that you can do some checking, 1114 00:50:23,800 --> 00:50:26,233 and you can have a happy path, in which it's a one shot, 1115 00:50:26,233 --> 00:50:28,900 and you get your answer, and then a sad path, in which maybe you 1116 00:50:28,900 --> 00:50:30,310 fall back on other prompts. 1117 00:50:30,310 --> 00:50:32,440 So then you're going for the diversity of approach, 1118 00:50:32,440 --> 00:50:36,160 where it's fast by default, it's slow, but ultimately 1119 00:50:36,160 --> 00:50:39,140 converging upon higher likelihood of success if it fails. 1120 00:50:39,140 --> 00:50:42,370 And then something that I'm sure we'll see and people do later on 1121 00:50:42,370 --> 00:50:45,310 is fine tune-- instruction tuning style models, 1122 00:50:45,310 --> 00:50:49,920 which are more likely to respond with a computer parsable output. 1123 00:50:49,920 --> 00:50:51,340 I guess one last question. 1124 00:50:51,340 --> 00:50:54,620 AUDIENCE: Sure, so one, you talked-- a couple of things. 1125 00:50:54,620 --> 00:50:58,176 One is this you talk about domain expertise here. 1126 00:50:58,176 --> 00:51:01,605 And you're encoding a bunch of domain expertise in terms of the prompts 1127 00:51:01,605 --> 00:51:02,940 that you're loading there. 1128 00:51:02,940 --> 00:51:05,110 What is that-- where do those prompts end up? 1129 00:51:05,110 --> 00:51:08,480 Do those prompts end up back in the ChatGPT model? 1130 00:51:08,480 --> 00:51:11,050 And is there a privacy issue associated with that? 1131 00:51:11,050 --> 00:51:12,550 TED BENSON: That's a great question. 1132 00:51:12,550 --> 00:51:13,860 So the question was-- and I apologize, I just 1133 00:51:13,860 --> 00:51:16,350 realized we haven't been repeating all the questions for the YouTube 1134 00:51:16,350 --> 00:51:18,090 listeners, so I'm sorry for the folks on YouTube 1135 00:51:18,090 --> 00:51:20,230 if you weren't able to hear some of the questions. 1136 00:51:20,230 --> 00:51:23,438 The question was, what are the privacy implications of some of these prompts? 1137 00:51:23,438 --> 00:51:26,040 If one of the messages is so much depends upon your prompt 1138 00:51:26,040 --> 00:51:29,760 and the fine tuning of this prompt, what does that mean with respect to my IP? 1139 00:51:29,760 --> 00:51:32,190 Maybe the prompt is my business. 
1140 00:51:32,190 --> 00:51:34,410 I can't offer you the exact answer, but I 1141 00:51:34,410 --> 00:51:37,510 can paint for you what approximately the landscape looks like. 1142 00:51:37,510 --> 00:51:41,440 So in all of software, and so too with AI, what we see is there 1143 00:51:41,440 --> 00:51:44,850 are the SaaS companies, where you're using somebody else's API. 1144 00:51:44,850 --> 00:51:48,270 And you're trusting that their terms of service will be upheld. 1145 00:51:48,270 --> 00:51:52,200 There's the set of companies in which they provide a model for hosting 1146 00:51:52,200 --> 00:51:54,010 on one of the big cloud providers. 1147 00:51:54,010 --> 00:51:56,243 And this is a version of the same thing, but I think 1148 00:51:56,243 --> 00:51:57,660 with slightly different mechanics. 1149 00:51:57,660 --> 00:52:00,410 This tends to be thought of as the enterprise version of software. 1150 00:52:00,410 --> 00:52:03,240 And by and large, the industry has moved over the past 20 years 1151 00:52:03,240 --> 00:52:06,480 from running my own servers to trusting that Microsoft, or Amazon, or Google 1152 00:52:06,480 --> 00:52:07,805 can run servers for me. 1153 00:52:07,805 --> 00:52:10,930 And they say it's my private server, even though I know they're running it. 1154 00:52:10,930 --> 00:52:12,030 And I'm OK with that. 1155 00:52:12,030 --> 00:52:14,160 And you've already started to see that-- 1156 00:52:14,160 --> 00:52:16,620 Amazon with Hugging Face, Microsoft with OpenAI, 1157 00:52:16,620 --> 00:52:19,770 Google, too, with their own version of Bard, are going to do these, 1158 00:52:19,770 --> 00:52:23,280 where you'll have the SaaS version, and then you'll also have the private VPC 1159 00:52:23,280 --> 00:52:23,790 version. 1160 00:52:23,790 --> 00:52:26,998 And then there's a third version that I think we haven't yet seen practically 1161 00:52:26,998 --> 00:52:29,070 emerge, but this would be the maximalist, 1162 00:52:29,070 --> 00:52:32,430 "I want to make sure my IP is maximally safe" version of events, 1163 00:52:32,430 --> 00:52:34,530 in which you are running your own machines, 1164 00:52:34,530 --> 00:52:36,180 you are running your own models. 1165 00:52:36,180 --> 00:52:38,760 And then the question is, is the open source 1166 00:52:38,760 --> 00:52:41,700 and/or privately available version of the model as good 1167 00:52:41,700 --> 00:52:43,320 as the publicly hosted one? 1168 00:52:43,320 --> 00:52:44,580 And does that matter to me? 1169 00:52:44,580 --> 00:52:46,560 And the answer is, right now, realistically, it 1170 00:52:46,560 --> 00:52:48,030 probably matters a lot. 1171 00:52:48,030 --> 00:52:51,630 In the fullness of time, you can think of any one particular task 1172 00:52:51,630 --> 00:52:55,840 you need to achieve as requiring some fixed point of intelligence to achieve. 1173 00:52:55,840 --> 00:52:59,370 And so over time, what we'll see is the privately obtainable versions 1174 00:52:59,370 --> 00:53:01,320 of these models will cross that threshold. 1175 00:53:01,320 --> 00:53:05,250 And with respect to that one task, yeah, sure, use the open source version, 1176 00:53:05,250 --> 00:53:06,490 run it on your own machine. 1177 00:53:06,490 --> 00:53:09,570 But we'll also see the SaaS intelligence get smarter. 1178 00:53:09,570 --> 00:53:10,803 It'll probably stay ahead. 1179 00:53:10,803 --> 00:53:13,470 And then your question is, well, which one do I care more about?
1180 00:53:13,470 --> 00:53:15,690 Do I want like the better aggregate intelligence 1181 00:53:15,690 --> 00:53:18,540 or is my task somewhat fixed point and I can just 1182 00:53:18,540 --> 00:53:20,460 use the open source available one for which 1183 00:53:20,460 --> 00:53:23,293 I know it'll perform well enough because it's crossed the threshold? 1184 00:53:23,293 --> 00:53:26,580 SIL HAMILTON: So to answer your question specifically, yes. 1185 00:53:26,580 --> 00:53:29,550 You might be glad to know that ChatGPT recently updated their privacy 1186 00:53:29,550 --> 00:53:32,860 policy to not use prompts for the training process. 1187 00:53:32,860 --> 00:53:37,560 But up until now, everything went back into the bin to be trained on again. 1188 00:53:37,560 --> 00:53:39,370 And that's just a fact. 1189 00:53:39,370 --> 00:53:40,680 So I think pizza-- 1190 00:53:40,680 --> 00:53:42,510 it's now pizza time. 1191 00:53:42,510 --> 00:53:43,833 Yay, OK. 1192 00:53:43,833 --> 00:53:44,720 [APPLAUSE] 1193 00:53:44,720 --> 00:53:47,210 [INTERPOSING VOICES] 1194 00:53:47,210 --> 00:53:51,000