1 00:00:00,000 --> 00:00:01,250 DAVID J. MALAN: Hello, world. 2 00:00:01,250 --> 00:00:03,920 This is CS50, and this is an introduction 3 00:00:03,920 --> 00:00:08,660 to how CS50 has been teaching using Artificial Intelligence, or AI, 4 00:00:08,660 --> 00:00:10,070 over the past several months. 5 00:00:10,070 --> 00:00:12,830 My name is David Malan, and I teach this here course called 6 00:00:12,830 --> 00:00:16,850 CS50, which is Harvard's introduction to the intellectual enterprises of computer 7 00:00:16,850 --> 00:00:19,010 science and the art of programming. 8 00:00:19,010 --> 00:00:21,725 If you would like to follow along with these same slides 9 00:00:21,725 --> 00:00:23,600 that you'll see over my shoulder, please feel 10 00:00:23,600 --> 00:00:26,240 free to take a photo or screenshot or pause 11 00:00:26,240 --> 00:00:29,030 at this point to use this QR code, which will lead you 12 00:00:29,030 --> 00:00:32,150 to a copy of today's slides. 13 00:00:32,150 --> 00:00:35,900 Now, CS50 is among Harvard's largest classes here on campus. 14 00:00:35,900 --> 00:00:39,740 We have some 600 undergraduates over the course of the fall, the spring, 15 00:00:39,740 --> 00:00:41,810 and the summer semester nowadays. 16 00:00:41,810 --> 00:00:45,140 We also have some 200 students each year through Harvard's Extension School, 17 00:00:45,140 --> 00:00:46,890 which is our continuing education program. 18 00:00:46,890 --> 00:00:49,807 And for the past nine years, we've been collaborating with our friends 19 00:00:49,807 --> 00:00:52,340 down the road at Yale University in New Haven, Connecticut, 20 00:00:52,340 --> 00:00:55,790 where we have some 250 students taking the course as well. 21 00:00:55,790 --> 00:00:58,280 They primarily watch the course's lectures online, 22 00:00:58,280 --> 00:01:01,740 but with our own teaching assistants in New Haven and faculty 23 00:01:01,740 --> 00:01:06,420 do they have sections or recitations, office hours, CS50 events, and more. 24 00:01:06,420 --> 00:01:10,230 For the past several years, we've also offered variations of the class 25 00:01:10,230 --> 00:01:15,000 through Harvard's Business School and Law School, CS50 for MBAs, so to speak, 26 00:01:15,000 --> 00:01:18,600 CS50 for lawyers, so to speak, which are focused more specifically 27 00:01:18,600 --> 00:01:20,370 on those particular demographics. 28 00:01:20,370 --> 00:01:23,850 And then, of course, as many of you know, the course since 2007 29 00:01:23,850 --> 00:01:26,400 has been freely available as OpenCourseWare, 30 00:01:26,400 --> 00:01:30,900 which is to say anyone is free to take or to teach the class, even 31 00:01:30,900 --> 00:01:34,800 using the freely available materials as well as technology. 32 00:01:34,800 --> 00:01:36,690 And indeed through platforms like YouTube 33 00:01:36,690 --> 00:01:39,180 do we have some 1.8 million subscribers. 34 00:01:39,180 --> 00:01:43,830 Through platforms like EdX do we have some 5.7 million registrants now 35 00:01:43,830 --> 00:01:44,550 to date. 36 00:01:44,550 --> 00:01:47,130 Now, CS50 itself, as many of you know, is 37 00:01:47,130 --> 00:01:49,470 quite substantive in terms of how much content 38 00:01:49,470 --> 00:01:51,240 and how much technology it covers. 39 00:01:51,240 --> 00:01:53,730 We begin the semester with a language called Scratch, which 40 00:01:53,730 --> 00:01:55,440 is a very friendly graphical language. 41 00:01:55,440 --> 00:01:57,930 We transition thereafter to roughly half the class 42 00:01:57,930 --> 00:02:00,722 in C, a more traditional text-based language. 43 00:02:00,722 --> 00:02:02,430 And then in the latter half of the course 44 00:02:02,430 --> 00:02:05,880 do we transition to languages like Python and SQL 45 00:02:05,880 --> 00:02:10,410 and JavaScript in the context of HTML and CSS so that at the end of the course 46 00:02:10,410 --> 00:02:14,010 students are not only prepared for further studies in computer science 47 00:02:14,010 --> 00:02:17,970 foundationally, but also, if they never again take another computer science 48 00:02:17,970 --> 00:02:19,890 course, we hope that they're well equipped 49 00:02:19,890 --> 00:02:22,320 to go back to their own fields of interest, 50 00:02:22,320 --> 00:02:25,740 be it in the arts, humanities, social sciences, natural sciences, 51 00:02:25,740 --> 00:02:30,000 physical sciences, or beyond, to apply lessons learned from computer science 52 00:02:30,000 --> 00:02:33,270 and programming to problems of interest in their own domain. 53 00:02:33,270 --> 00:02:35,940 But with this many students and with this many teachers, 54 00:02:35,940 --> 00:02:38,970 suffice it to say it's been a challenge and a goal to provide as much 55 00:02:38,970 --> 00:02:43,987 of a support structure as we can, not only for CS50 itself, which many of you 56 00:02:43,987 --> 00:02:48,060 know by its OpenCourseWare name, CS50x, but over the past few years 57 00:02:48,060 --> 00:02:51,510 has CS50 really evolved into a whole ecosystem of courses, 58 00:02:51,510 --> 00:02:55,140 an entire curriculum, particularly focused on introductions 59 00:02:55,140 --> 00:02:57,030 to various technologies and tools. 60 00:02:57,030 --> 00:03:00,990 But here are some of our screenshots of some of CS50's most recent courses, 61 00:03:00,990 --> 00:03:03,240 all of which, too, are freely available. 62 00:03:03,240 --> 00:03:07,920 In fact, if you're new to the community, feel free to go to edx.org/cs50, 63 00:03:07,920 --> 00:03:10,890 where all those courses and eventually more await. 64 00:03:10,890 --> 00:03:13,950 The course we'll focus most of our attention on today in this talk 65 00:03:13,950 --> 00:03:18,880 is CS50 itself, otherwise known as CS50x, which is available at this URL 66 00:03:18,880 --> 00:03:19,380 here. 67 00:03:19,380 --> 00:03:22,950 And for students and teachers in K-12, so to speak, 68 00:03:22,950 --> 00:03:24,990 middle schools, high schools, and so forth 69 00:03:24,990 --> 00:03:28,200 do we also offer a version of the class called CS50 AP, 70 00:03:28,200 --> 00:03:31,920 which high school and middle school teachers are welcome to adopt or adapt 71 00:03:31,920 --> 00:03:34,410 whether or not participating in an advanced placement 72 00:03:34,410 --> 00:03:37,920 program for their own classes and students around the world. 73 00:03:37,920 --> 00:03:40,710 In fact, here's one of our favorite photos from a few years 74 00:03:40,710 --> 00:03:44,460 ago in New York City, where we brought together some three public and two 75 00:03:44,460 --> 00:03:48,000 private schools to have a CS50 AP hackathon, where 76 00:03:48,000 --> 00:03:51,540 students worked on their homework assignments or final projects. 77 00:03:51,540 --> 00:03:55,560 Pictured here is a visitation we had just a few weeks ago in Jakarta, 78 00:03:55,560 --> 00:03:59,790 Indonesia, where we worked with nearly 300 middle school and high school 79 00:03:59,790 --> 00:04:03,390 teachers there who spent the past six months taking CS50, 80 00:04:03,390 --> 00:04:07,470 perhaps like you, online, then to culminate in a professional development 81 00:04:07,470 --> 00:04:10,915 workshop where CS50's team went to Jakarta to meet all of these teachers, 82 00:04:10,915 --> 00:04:14,040 and in the coming weeks and months will they return to their own classrooms 83 00:04:14,040 --> 00:04:16,350 to teach CS50 in some form itself. 84 00:04:16,350 --> 00:04:19,470 And pictured here, lastly, are some of our students in Nicaragua holding 85 00:04:19,470 --> 00:04:23,910 proudly some of their CS50 certificates and Harvard pennants, which 86 00:04:23,910 --> 00:04:28,050 is a community too that we've gotten to know all too well additionally. 87 00:04:28,050 --> 00:04:30,780 So indeed, if you'd like to join those and more communities, 88 00:04:30,780 --> 00:04:33,090 please feel free to reach out after today 89 00:04:33,090 --> 00:04:37,920 at any time via outreach@cs50.harvard.edu. 90 00:04:37,920 --> 00:04:40,830 So the premise for today's talk and really 91 00:04:40,830 --> 00:04:43,950 the motivation behind a lot of CS50's work over the past year 92 00:04:43,950 --> 00:04:47,790 has been this premise that ChatGPT and tools like it, 93 00:04:47,790 --> 00:04:51,660 Bing Chat, GitHub Copilot, and the like, are just too helpful. 94 00:04:51,660 --> 00:04:54,810 They are all too willing and all too deliberately designed 95 00:04:54,810 --> 00:04:57,060 to try to answer your questions outright. 96 00:04:57,060 --> 00:05:01,140 Now, in terms of the real world and in terms of solving real-world problems 97 00:05:01,140 --> 00:05:04,590 and producing code, for instance, in industry, that's a great thing. 98 00:05:04,590 --> 00:05:06,990 I think we've seen evidence already between tools 99 00:05:06,990 --> 00:05:12,090 like GitHub Copilot and Bing and ChatGPT that there is potentially and invariably 100 00:05:12,090 --> 00:05:14,280 is going to be even more of a productivity boost. 101 00:05:14,280 --> 00:05:19,140 It will save humans time, and it will amplify the impact of individual humans, 102 00:05:19,140 --> 00:05:21,720 much more so, for instance, than replacing them outright. 103 00:05:21,720 --> 00:05:23,310 It will amplify one's impact. 104 00:05:23,310 --> 00:05:27,660 But in the context of education, as most of you might know or remember, 105 00:05:27,660 --> 00:05:31,080 it might be nice to have someone hand you the answers to all of the questions 106 00:05:31,080 --> 00:05:33,580 you have on some homework assignment, exam, or the like, 107 00:05:33,580 --> 00:05:35,790 but that's certainly contrary to the whole point of learning 108 00:05:35,790 --> 00:05:36,600 in the first place. 109 00:05:36,600 --> 00:05:41,670 And so our premise and presupposition is that tools like ChatGPT out of the box, 110 00:05:41,670 --> 00:05:45,060 because they're either on or off, there's no settings 111 00:05:45,060 --> 00:05:47,700 really for these tools right now, are all too willing 112 00:05:47,700 --> 00:05:49,860 to just answer any and all of your questions 113 00:05:49,860 --> 00:05:54,360 instead of, like a good teacher or tutor, leading you to an answer. 114 00:05:54,360 --> 00:05:56,610 Rather, they tend to just hand it to you outright. 115 00:05:56,610 --> 00:05:59,550 And so what CS50's team set out to do over the past year 116 00:05:59,550 --> 00:06:03,370 was try to implement our own version of ChatGPT, if you will, 117 00:06:03,370 --> 00:06:07,690 building on top of the shoulders of these other giants, OpenAI, Microsoft, 118 00:06:07,690 --> 00:06:12,190 Google, and others, but imposing some pedagogical guardrails, so to speak, 119 00:06:12,190 --> 00:06:15,490 ironically trying to make these tools less helpful, 120 00:06:15,490 --> 00:06:17,710 putting a little bit of downward pressure 121 00:06:17,710 --> 00:06:21,460 on their default behavior of handing you all of the answers, all of the code 122 00:06:21,460 --> 00:06:22,660 that you might ask for. 123 00:06:22,660 --> 00:06:25,240 What we instead wanted to really do is implement 124 00:06:25,240 --> 00:06:27,430 more of a tutor, a good teacher. 125 00:06:27,430 --> 00:06:30,820 And so, at least with our on-campus students initially 126 00:06:30,820 --> 00:06:34,480 and now all of our online students as well, in CS50 syllabi, 127 00:06:34,480 --> 00:06:37,990 it is not reasonable, that is, it is not allowed 128 00:06:37,990 --> 00:06:42,880 to use right now AI-based software like ChatGPT, GitHub Copilot, Bing 129 00:06:42,880 --> 00:06:45,820 Chat, or the like that suggests or completes answers 130 00:06:45,820 --> 00:06:48,230 to questions or lines of code. 131 00:06:48,230 --> 00:06:52,000 So through policy, we simply disallow usage thereof. 132 00:06:52,000 --> 00:06:53,890 Technologically, there's nothing stopping, 133 00:06:53,890 --> 00:06:56,600 of course, students and teachers alike from using these tools. 134 00:06:56,600 --> 00:07:00,130 So we've tried to walk this line sort of ethically and educationally 135 00:07:00,130 --> 00:07:02,650 within the class to message where these lines are 136 00:07:02,650 --> 00:07:04,390 and which lines should not be crossed. 137 00:07:04,390 --> 00:07:08,590 But rather than just take away this very nascent technology, which is undoubtedly 138 00:07:08,590 --> 00:07:12,010 going to be useful and probably with us here on out, 139 00:07:12,010 --> 00:07:16,090 we do deem it reasonable by contrast for CS50 students 140 00:07:16,090 --> 00:07:20,860 to use CS50's own AI-based software, including the so-called CS50 141 00:07:20,860 --> 00:07:25,600 Duck or Duck Debugger, DDB, which is a riff on GDB, 142 00:07:25,600 --> 00:07:29,680 the GNU Debugger for short, which is now built into two of CS50's tools 143 00:07:29,680 --> 00:07:34,150 called CS50.ai and CS50.dev, which is to say students 144 00:07:34,150 --> 00:07:38,920 are asked not to use off-the-shelf tools like ChatGPT, GitHub Copilot, Bing 145 00:07:38,920 --> 00:07:39,820 Chat, and the like. 146 00:07:39,820 --> 00:07:45,010 But they may and are encouraged to use CS50's own tools that ideally do have 147 00:07:45,010 --> 00:07:47,680 those pedagogical guardrails in place. 148 00:07:47,680 --> 00:07:51,520 So how do we go about implementing this so-called CS50 Duck and why? 149 00:07:51,520 --> 00:07:55,757 Well, in computing circles and in the world of programming, as some of you 150 00:07:55,757 --> 00:07:58,150 know, it's been a thing for quite some time 151 00:07:58,150 --> 00:08:02,620 to use something called rubber duck debugging, or rubberducking for short, 152 00:08:02,620 --> 00:08:07,450 whereby, if you don't have a colleague, a teacher, a friend, a family member who 153 00:08:07,450 --> 00:08:12,040 knows more about programming than you, you should at least keep a rubber 154 00:08:12,040 --> 00:08:15,490 duck on your desk near your laptop or desktop or phone 155 00:08:15,490 --> 00:08:18,430 so that when you do have a question or confusion 156 00:08:18,430 --> 00:08:20,950 or you have some bug or mistake in your code, 157 00:08:20,950 --> 00:08:24,940 you can at least hold up this rubber duck and talk to it, 158 00:08:24,940 --> 00:08:28,600 walking through verbally whatever confusion or problem you're having. 159 00:08:28,600 --> 00:08:31,000 And even though in the real world, the duck really 160 00:08:31,000 --> 00:08:35,350 shouldn't be quacking back at all, squeaking maybe, but not quacking back, 161 00:08:35,350 --> 00:08:40,090 at least in that process of airing your confusion verbally, for so many of us, 162 00:08:40,090 --> 00:08:43,870 myself included, that proverbial light bulb eventually goes off atop your head, 163 00:08:43,870 --> 00:08:46,570 and you realize, oh, that's what I'm doing wrong, 164 00:08:46,570 --> 00:08:50,680 because you realize in verbalizing the problem where you've perhaps 165 00:08:50,680 --> 00:08:51,760 gone astray. 166 00:08:51,760 --> 00:08:55,840 So years ago, we in CS50 actually implemented a virtual version 167 00:08:55,840 --> 00:08:59,140 of this rubber duck because even though we hand the ducks out here on campus 168 00:08:59,140 --> 00:09:02,800 to students residentially, we have so many more students and teachers online. 169 00:09:02,800 --> 00:09:05,920 So we implemented rubber duck debugging or rubberducking 170 00:09:05,920 --> 00:09:09,010 in the context of not only the real world-- 171 00:09:09,010 --> 00:09:13,060 and pictured here is an unnecessarily large eight-foot duck that is often 172 00:09:13,060 --> 00:09:15,610 beside me on stage, in addition to the tiny little rubber 173 00:09:15,610 --> 00:09:19,810 ducks that we do give out-- but we also implemented a virtual version thereof. 174 00:09:19,810 --> 00:09:21,940 Within CS50, for the past few years, we've 175 00:09:21,940 --> 00:09:25,810 used a free open-source tool called Visual Studio Code for Microsoft, 176 00:09:25,810 --> 00:09:29,620 very popular in industry and also usable in educational contexts. 177 00:09:29,620 --> 00:09:33,460 And we implemented what's called an extension, like a plug-in using 178 00:09:33,460 --> 00:09:39,130 some JavaScript code that CS50 wrote to create a virtual chatbot that 179 00:09:39,130 --> 00:09:41,320 takes on the same personality of a rubber duck. 180 00:09:41,320 --> 00:09:43,450 And so, for instance, for the past few years, 181 00:09:43,450 --> 00:09:47,560 if a student were to ask a question like this textually at their keyboard, 182 00:09:47,560 --> 00:09:51,400 "I'm hoping you can help me solve some problem," well, up until recently, 183 00:09:51,400 --> 00:09:55,720 as many of you might recall, all this virtual rubber duck would do would be 184 00:09:55,720 --> 00:09:59,170 to respond with one, two, or three quacks. 185 00:09:59,170 --> 00:10:00,020 And that's it. 186 00:10:00,020 --> 00:10:02,570 And the extent of the complexity here was 187 00:10:02,570 --> 00:10:05,090 that we had a little bit of randomness involved 188 00:10:05,090 --> 00:10:08,510 so that we would programmatically generate one, two, or three quacks. 189 00:10:08,510 --> 00:10:10,550 Now, this is not a scientific measure, but we 190 00:10:10,550 --> 00:10:14,330 have anecdotal evidence that just having the duck quack back 191 00:10:14,330 --> 00:10:19,100 at you textually actually helped solve a nonzero number of problems in the world. 192 00:10:19,100 --> 00:10:22,160 And a nonzero number of students were appreciative that just 193 00:10:22,160 --> 00:10:25,922 by going through the process of typing out their confusion, even 194 00:10:25,922 --> 00:10:28,130 though the duck was only going to quack back at them, 195 00:10:28,130 --> 00:10:32,060 they realized through that process of verbalizing or textualizing 196 00:10:32,060 --> 00:10:34,133 their thoughts where they had gone astray. 197 00:10:34,133 --> 00:10:36,050 And honestly, this actually resonates with me. 198 00:10:36,050 --> 00:10:39,500 On occasion, I've posted on websites like Stack Overflow, 199 00:10:39,500 --> 00:10:41,360 or Stack Exchange more generally, which has 200 00:10:41,360 --> 00:10:44,600 been a very popular tool for questions and answers online. 201 00:10:44,600 --> 00:10:48,500 And I have probably not posted on Stack Overflow 202 00:10:48,500 --> 00:10:51,920 far more frequently than I have posted on Stack Overflow 203 00:10:51,920 --> 00:10:55,850 because I'm so worried about looking like a dummy to everyone on the internet 204 00:10:55,850 --> 00:10:59,790 by asking a question that maybe I shouldn't in retrospect that I carefully 205 00:10:59,790 --> 00:11:02,220 write down all of my thoughts, I try to express 206 00:11:02,220 --> 00:11:05,400 what I have tried, what is not working, what the symptoms are. 207 00:11:05,400 --> 00:11:09,640 And so darn often does that proverbial light bulb go off for me as well. 208 00:11:09,640 --> 00:11:13,730 So if you, too, have ever sat down to write a post or maybe an email or a text 209 00:11:13,730 --> 00:11:16,230 message but not sent it because you realized, wait a minute, 210 00:11:16,230 --> 00:11:20,670 I don't need to, that then is the beauty of something like a rubber duck. 211 00:11:20,670 --> 00:11:24,210 Now, some of our students online just a few months ago 212 00:11:24,210 --> 00:11:28,560 were sort of shocked to discover, literally overnight, 213 00:11:28,560 --> 00:11:31,170 that when they began to ask their questions in English 214 00:11:31,170 --> 00:11:34,260 or some other human language, literally overnight, 215 00:11:34,260 --> 00:11:37,680 a few months ago did the CS50 Duck start responding 216 00:11:37,680 --> 00:11:41,680 to them in English and, in some cases, other human languages as well. 217 00:11:41,680 --> 00:11:43,980 And this is all thanks to the work of CS50's team 218 00:11:43,980 --> 00:11:46,320 over the past several months, and certainly the work 219 00:11:46,320 --> 00:11:49,530 over the past many years of the OpenAIs, Microsofts, Googles, 220 00:11:49,530 --> 00:11:55,500 GitHubs of the world, to make this kind of chatbot, so to speak, now possible. 221 00:11:55,500 --> 00:11:59,100 And so thanks to one of our teaching fellow's daughters in New Zealand 222 00:11:59,100 --> 00:12:04,170 has the CS50 Duck really been brought to life here, not only graphically 223 00:12:04,170 --> 00:12:06,340 but textually as well. 224 00:12:06,340 --> 00:12:09,990 And so pedagogically what our goals have been over these past few months 225 00:12:09,990 --> 00:12:15,450 and moving forward is really to provide with AI students with virtual office 226 00:12:15,450 --> 00:12:17,970 hours 24/7, so to speak. 227 00:12:17,970 --> 00:12:20,970 Office hours, for those unfamiliar, at least in college campuses, 228 00:12:20,970 --> 00:12:24,450 is an opportunity for a student to go meet with a professor or a teaching 229 00:12:24,450 --> 00:12:27,252 assistant or a TA one-on-one or in small groups 230 00:12:27,252 --> 00:12:30,210 to just ask questions about the week's material or homework assignments 231 00:12:30,210 --> 00:12:30,810 or the like. 232 00:12:30,810 --> 00:12:34,420 24/7, of course, is referring to the number of hours in the day and days 233 00:12:34,420 --> 00:12:34,920 in the week. 234 00:12:34,920 --> 00:12:38,430 And the goal then through AI for us has been to provide really 235 00:12:38,430 --> 00:12:42,690 an approximation of a teacher being available to students 236 00:12:42,690 --> 00:12:46,710 throughout the day, throughout the week, to ideally 237 00:12:46,710 --> 00:12:48,900 help get them over certain hurdles. 238 00:12:48,900 --> 00:12:52,080 And we're fortunate, frankly, in places like Harvard and Yale 239 00:12:52,080 --> 00:12:55,230 and college campuses and university campuses more generally, 240 00:12:55,230 --> 00:12:58,950 to have a lot of human resources and lots of teaching assistants, 241 00:12:58,950 --> 00:13:00,840 in some cases, lots of humans. 242 00:13:00,840 --> 00:13:04,410 But so many places around the world and so many students around the world 243 00:13:04,410 --> 00:13:07,080 don't have access to those same resources potentially. 244 00:13:07,080 --> 00:13:08,970 And so the goal here, too, has really been 245 00:13:08,970 --> 00:13:13,260 to try to uplift as much of CS50's community as possible and ideally level 246 00:13:13,260 --> 00:13:17,220 the playing field at least in terms of the support that is available to someone 247 00:13:17,220 --> 00:13:20,370 whether it's on campus or now very much off. 248 00:13:20,370 --> 00:13:22,980 And here is sort of the holy grail for us, so to speak, 249 00:13:22,980 --> 00:13:26,310 whereby the goal technologically and pedagogically for us 250 00:13:26,310 --> 00:13:31,260 has been to ideally approximate a one-to-one teacher-to-student ratio. 251 00:13:31,260 --> 00:13:33,360 Because even in places like here on campus, 252 00:13:33,360 --> 00:13:37,260 we might have 10, 20, 30 students assigned 253 00:13:37,260 --> 00:13:39,210 to one teaching assistant or TA. 254 00:13:39,210 --> 00:13:43,260 We have weekly sections or recitations where those relatively smaller groups 255 00:13:43,260 --> 00:13:45,480 of students get together, not just for CS50, 256 00:13:45,480 --> 00:13:47,530 but to discuss other courses as well. 257 00:13:47,530 --> 00:13:52,140 But if you start to do the math and you think about an office hour, 60 minutes, 258 00:13:52,140 --> 00:13:57,150 if you have only six students attending an office hour, be it the professor 259 00:13:57,150 --> 00:14:02,250 or with the TA, that's only 10 minutes per student among those six. 260 00:14:02,250 --> 00:14:07,230 And for so many students for whom CS, programming, even STEM studies, Science, 261 00:14:07,230 --> 00:14:09,450 Technology, Engineering, and Math, are new, 262 00:14:09,450 --> 00:14:11,700 that's just not very much time at all. 263 00:14:11,700 --> 00:14:15,900 And so being able with software to approximate the idea of one of me 264 00:14:15,900 --> 00:14:19,380 for every of you, one TA for every of you, 265 00:14:19,380 --> 00:14:22,140 is a very exciting thing educationally. 266 00:14:22,140 --> 00:14:25,830 And even though admittedly there's a lot of potential downsides 267 00:14:25,830 --> 00:14:29,760 on the horizon for AI, the applications we are so excited about 268 00:14:29,760 --> 00:14:32,042 are indeed those within education. 269 00:14:32,042 --> 00:14:33,750 All of this work that you're about to see 270 00:14:33,750 --> 00:14:36,042 would not have been possible indeed without our friends 271 00:14:36,042 --> 00:14:39,660 at OpenAI, Microsoft, GitHub, and beyond, 272 00:14:39,660 --> 00:14:43,170 and particularly CS50's own team here in Cambridge and beyond, 273 00:14:43,170 --> 00:14:48,990 including Rongxin, Andrew, Patrick, Charlie, Carter, and more, 274 00:14:48,990 --> 00:14:51,810 over these past several months in particular. 275 00:14:51,810 --> 00:14:55,680 So allow me to share on the whole team's behalf what it is we've been up to. 276 00:14:55,680 --> 00:15:01,570 So underneath the hood is this website, really this web service, called CS50.ai. 277 00:15:01,570 --> 00:15:05,920 It's a domain name unto itself, but it describes really this architecture here 278 00:15:05,920 --> 00:15:09,250 whereby over the past several months we've been building out this technology 279 00:15:09,250 --> 00:15:10,480 stack, if you will. 280 00:15:10,480 --> 00:15:13,810 Now, at the end of the day, a lot of this stack is built on top of services 281 00:15:13,810 --> 00:15:18,070 like Microsoft Azure, which is their cloud service, or OpenAI Zone. 282 00:15:18,070 --> 00:15:20,605 Indeed, companies like these, as some of you know, 283 00:15:20,605 --> 00:15:25,390 offer generally what are called APIs, Application Programming Interfaces, 284 00:15:25,390 --> 00:15:28,870 which is like a tool that you can sign up for, sometimes for free, 285 00:15:28,870 --> 00:15:34,240 sometimes for money, to actually use that company's technology or services 286 00:15:34,240 --> 00:15:36,530 or data in your own software. 287 00:15:36,530 --> 00:15:40,870 So really, OpenAI, Microsoft, and others have done a lot of the hard work here, 288 00:15:40,870 --> 00:15:45,610 and we have done our best to layer on top of their work a pedagogical layer, 289 00:15:45,610 --> 00:15:47,537 if you will, including those guardrails. 290 00:15:47,537 --> 00:15:49,870 Architecturally, then, our system looks a bit like this. 291 00:15:49,870 --> 00:15:53,560 But for today, we'll focus really on the human component, the user 292 00:15:53,560 --> 00:15:57,520 interfaces that have really been now presented to students and teachers 293 00:15:57,520 --> 00:15:58,150 alike. 294 00:15:58,150 --> 00:16:02,690 So here for the unfamiliar is a link to CS50's programming environment nowadays. 295 00:16:02,690 --> 00:16:06,890 It's a web-based integrated development environment, or really a text editor 296 00:16:06,890 --> 00:16:10,160 with lots of extensions built in, called Visual Studio Code. 297 00:16:10,160 --> 00:16:13,040 It's built on top of something called GitHub Codespaces, which 298 00:16:13,040 --> 00:16:16,850 is a cloud version of what are called Docker containers, which 299 00:16:16,850 --> 00:16:21,950 is to say when a CS50 student or teacher logs into a website, CS50.dev, 300 00:16:21,950 --> 00:16:25,280 we use GitHub's APIs to automatically create 301 00:16:25,280 --> 00:16:29,690 for you something called a repository, something called a codespace, a.k.a. 302 00:16:29,690 --> 00:16:33,290 container, in the cloud on GitHub's infrastructure so that 303 00:16:33,290 --> 00:16:36,950 what you see in your browser is a real-world programming environment. 304 00:16:36,950 --> 00:16:41,750 But what you don't have to do is install any software yet on your own Mac or PC. 305 00:16:41,750 --> 00:16:44,345 It just works in terms of day one. 306 00:16:44,345 --> 00:16:46,220 At the end of the semester, what's nice about 307 00:16:46,220 --> 00:16:51,620 VS Code nowadays is that you can, if you want, install it on your own Mac or PC. 308 00:16:51,620 --> 00:16:54,800 It sometimes requires a bit of technical difficulty 309 00:16:54,800 --> 00:16:57,620 and solving thereof but much better to do that, 310 00:16:57,620 --> 00:16:59,870 we think, at the end of the course than, for instance, 311 00:16:59,870 --> 00:17:01,370 on day one at the beginning. 312 00:17:01,370 --> 00:17:04,880 So once you visit this website, you see a landing page not unlike this. 313 00:17:04,880 --> 00:17:08,089 And built into this same tool is this CS50 Duck. 314 00:17:08,089 --> 00:17:11,960 And what we set out to do some months ago was try to put our toes in the water 315 00:17:11,960 --> 00:17:15,440 with Artificial Intelligence, or AI, trying to think about, all right, what 316 00:17:15,440 --> 00:17:19,430 would be a helpful but also relatively easy feature 317 00:17:19,430 --> 00:17:23,030 to implement using AI by writing software of our own 318 00:17:23,030 --> 00:17:25,640 on top of OpenAI's APIs? 319 00:17:25,640 --> 00:17:29,270 And again, OpenAI is the company behind today's ChatGPT. 320 00:17:29,270 --> 00:17:33,050 And so the first thing we did to put our toes in the water, so to speak, was, 321 00:17:33,050 --> 00:17:37,820 could we write code to explain lines of code to students? 322 00:17:37,820 --> 00:17:40,760 So whether they've written some code or they've downloaded some code 323 00:17:40,760 --> 00:17:45,530 or copied some code from class, could we explain it to them line by line 324 00:17:45,530 --> 00:17:50,270 as any good human could if they were to raise their hand or ask someone online? 325 00:17:50,270 --> 00:17:51,230 So we did this. 326 00:17:51,230 --> 00:17:53,300 Here is a screenshot of VS Code. 327 00:17:53,300 --> 00:17:57,230 Here is a screenshot of some code written in that language called C. 328 00:17:57,230 --> 00:18:01,700 And this code relatively simply prompts the human for their name 329 00:18:01,700 --> 00:18:05,510 and then prints out hello so-and-so, based on what they've typed in. 330 00:18:05,510 --> 00:18:08,120 For a student new to programming and certainly C, 331 00:18:08,120 --> 00:18:11,030 there's a lot of non-obviousness going on here. 332 00:18:11,030 --> 00:18:13,670 There's a lot of complexity and a lot of syntax. 333 00:18:13,670 --> 00:18:18,110 But what students can now do, as of the introduction of AI to CS50, 334 00:18:18,110 --> 00:18:20,390 they can highlight one or more lines of code. 335 00:18:20,390 --> 00:18:22,610 They can right-click or control-click. 336 00:18:22,610 --> 00:18:26,120 They can then select this option here, Explain Highlighted 337 00:18:26,120 --> 00:18:28,220 Code, which doesn't come with VS Code. 338 00:18:28,220 --> 00:18:33,170 In fact, Rongxin wrote an extension to contribute this menu option to this here 339 00:18:33,170 --> 00:18:33,680 menu. 340 00:18:33,680 --> 00:18:37,490 And as soon as the student clicks that, within three seconds or so 341 00:18:37,490 --> 00:18:43,370 do they have a ChatGPT-like explanation of exactly those lines of code. 342 00:18:43,370 --> 00:18:46,430 Now, to be fair, this example itself is not very complicated. 343 00:18:46,430 --> 00:18:49,700 A human, a teacher could certainly answer this same question 344 00:18:49,700 --> 00:18:52,640 and explain these eight lines of code line by line. 345 00:18:52,640 --> 00:18:54,830 But that would take a few seconds, a few minutes. 346 00:18:54,830 --> 00:18:58,130 They might not be awake at the hour the student has this same question. 347 00:18:58,130 --> 00:19:00,530 And so the power here is that all of this 348 00:19:00,530 --> 00:19:05,400 was automated and customized for this particular context or code. 349 00:19:05,400 --> 00:19:07,820 So that was actually relatively straightforward. 350 00:19:07,820 --> 00:19:11,330 And so the next feature we set out to do was a little more sophisticated. 351 00:19:11,330 --> 00:19:15,770 Could we advise students with AI on how to improve their code style, 352 00:19:15,770 --> 00:19:18,080 the aesthetics, the formatting thereof? 353 00:19:18,080 --> 00:19:20,490 Within CS50, because we're an introductory course, 354 00:19:20,490 --> 00:19:22,670 we have actually consciously disabled a lot 355 00:19:22,670 --> 00:19:26,330 of features that normally come with IDEs, or Integrated Development 356 00:19:26,330 --> 00:19:28,580 Environments, or text editors like VS Code. 357 00:19:28,580 --> 00:19:30,500 So we disable autocomplete. 358 00:19:30,500 --> 00:19:34,110 We disable autoformatting, things that in the real world 359 00:19:34,110 --> 00:19:37,280 and once you've taken one or more courses, yes, totally reasonable 360 00:19:37,280 --> 00:19:38,060 to turn on. 361 00:19:38,060 --> 00:19:40,700 But in CS50 as an intro class, we really want 362 00:19:40,700 --> 00:19:43,850 students to develop some muscle memory and actually understand 363 00:19:43,850 --> 00:19:48,540 why their code should look this way, how to make their code look this way. 364 00:19:48,540 --> 00:19:51,530 And then, once that becomes boring and uninteresting, 365 00:19:51,530 --> 00:19:55,700 then they can automate the process, for instance, later in the course or after. 366 00:19:55,700 --> 00:19:59,510 So when it comes to style, we implemented a new-and-improved version 367 00:19:59,510 --> 00:20:03,100 of a command line tool that many of you might know as style50, 368 00:20:03,100 --> 00:20:04,690 but we made it more graphical. 369 00:20:04,690 --> 00:20:07,390 And so pictured here is another screenshot of VS Code. 370 00:20:07,390 --> 00:20:09,880 Here is some more C code at top left. 371 00:20:09,880 --> 00:20:12,340 And for today's purposes, I'll stipulate it's pretty messy. 372 00:20:12,340 --> 00:20:16,660 It's all left-aligned, and it's not very pretty printed or well formatted. 373 00:20:16,660 --> 00:20:20,500 So what students now can do, in the top-right corner of VS Code, 374 00:20:20,500 --> 00:20:23,350 is they can click a button called style50. 375 00:20:23,350 --> 00:20:25,900 And what this is going to show them now on left to right 376 00:20:25,900 --> 00:20:29,470 is on left, what their code currently looks like, to right, what 377 00:20:29,470 --> 00:20:31,300 their code should look like. 378 00:20:31,300 --> 00:20:33,610 And this built-in diff editor, so to speak, 379 00:20:33,610 --> 00:20:37,300 uses some red and green color coding to make clear what you should remove 380 00:20:37,300 --> 00:20:38,410 or what you should add. 381 00:20:38,410 --> 00:20:42,340 But even this, frankly, to new programmers might be nonobvious. 382 00:20:42,340 --> 00:20:43,450 What am I supposed to do? 383 00:20:43,450 --> 00:20:44,080 Why? 384 00:20:44,080 --> 00:20:48,220 And so also at top right now, there's this Explain Changes button. 385 00:20:48,220 --> 00:20:50,860 And if students click that, they similarly 386 00:20:50,860 --> 00:20:54,940 get a ChatGPT-like explanation of how and/or 387 00:20:54,940 --> 00:21:00,110 why to make their code look from left to right as we've proposed. 388 00:21:00,110 --> 00:21:04,160 So again, something that any teacher who's available or awake could do, 389 00:21:04,160 --> 00:21:08,150 but here again is this approximation of a one-to-one teacher/student ratio 390 00:21:08,150 --> 00:21:10,430 available 24/7. 391 00:21:10,430 --> 00:21:15,200 But the next feature that we set out to implement, which is now 392 00:21:15,200 --> 00:21:19,370 generalizable away from CS50, away from programming code, 393 00:21:19,370 --> 00:21:22,250 and really I do think representative educationally 394 00:21:22,250 --> 00:21:24,740 of what teachers and administrators in schools 395 00:21:24,740 --> 00:21:26,630 will be able to do in the arts, humanities, 396 00:21:26,630 --> 00:21:30,590 social sciences, physical sciences, and beyond, not just in CS soon, 397 00:21:30,590 --> 00:21:33,980 is can we answer at least most of the questions 398 00:21:33,980 --> 00:21:36,500 that students are asking, for instance, online? 399 00:21:36,500 --> 00:21:40,610 Now, for many years, CS50 has used various tools and various social media 400 00:21:40,610 --> 00:21:43,100 to enable students to ask questions and get answers, 401 00:21:43,100 --> 00:21:46,580 either from classmates or from teaching assistants or from myself, 402 00:21:46,580 --> 00:21:47,690 throughout the day. 403 00:21:47,690 --> 00:21:50,240 Of course, we are not always available. 404 00:21:50,240 --> 00:21:52,970 Of course, not all of those answers from classmates, 405 00:21:52,970 --> 00:21:56,180 who themselves are only learning the material, are necessarily correct. 406 00:21:56,180 --> 00:21:58,040 And so there's really been an opportunity 407 00:21:58,040 --> 00:22:02,060 here to try to answer all the more at scale and all the more 408 00:22:02,060 --> 00:22:06,660 correctly those students' questions online, both on campus and off. 409 00:22:06,660 --> 00:22:10,610 So what we set out to do here was implement an existing third-party tool 410 00:22:10,610 --> 00:22:14,090 called Ed, which some of you might have used, and we use it very heavily here 411 00:22:14,090 --> 00:22:14,810 on campus. 412 00:22:14,810 --> 00:22:16,640 It's a question-and-answer tool that allows 413 00:22:16,640 --> 00:22:18,440 you to post questions asynchronously. 414 00:22:18,440 --> 00:22:22,880 You ask now, and some number of seconds, minutes, hours, days later, hopefully 415 00:22:22,880 --> 00:22:25,010 a TA or a professor will reply. 416 00:22:25,010 --> 00:22:28,250 So here, for instance, is a screenshot of a question 417 00:22:28,250 --> 00:22:30,140 that a CS50 student might ask. 418 00:22:30,140 --> 00:22:32,480 And we've anonymized them here as John Harvard, 419 00:22:32,480 --> 00:22:34,400 but it's representative of a student question. 420 00:22:34,400 --> 00:22:36,067 And this question is kind of a softball. 421 00:22:36,067 --> 00:22:38,990 It's kind of an easy question because it's very definitional. 422 00:22:38,990 --> 00:22:41,360 Quote, unquote, "What is flask exactly?" 423 00:22:41,360 --> 00:22:44,750 I mean, honestly, Google could answer this, Bing could answer this. 424 00:22:44,750 --> 00:22:47,930 So this itself was not hard, but it was our first attempt 425 00:22:47,930 --> 00:22:49,730 to see how well AI could do. 426 00:22:49,730 --> 00:22:53,330 Here now is a screenshot of how the duck actually 427 00:22:53,330 --> 00:22:55,890 responded to that specific question. 428 00:22:55,890 --> 00:22:58,520 So Flask is a micro web framework written in Python. 429 00:22:58,520 --> 00:23:02,240 It is classified as a microframework because dot dot dot. 430 00:23:02,240 --> 00:23:05,630 And let me stipulate, especially if you're new to CS50, 431 00:23:05,630 --> 00:23:08,780 this is a topic we introduce at the end of the class itself. 432 00:23:08,780 --> 00:23:10,610 This is a pretty darn good answer. 433 00:23:10,610 --> 00:23:14,360 Indeed, this is something that I as a human would ideally have written myself. 434 00:23:14,360 --> 00:23:18,110 But of course, the AI, the duck, was able to implement-- 435 00:23:18,110 --> 00:23:21,310 was able to answer this question within seconds itself. 436 00:23:21,310 --> 00:23:23,810 And now for the technically curious, what's really happening 437 00:23:23,810 --> 00:23:27,380 is when a student asks a question via this tool, called Ed, 438 00:23:27,380 --> 00:23:31,670 and they hit submit to post the question to the website, what 439 00:23:31,670 --> 00:23:35,480 we have done over the past few months as CS50 is we wrote some code, 440 00:23:35,480 --> 00:23:38,000 also coincidentally in a language called JavaScript, 441 00:23:38,000 --> 00:23:42,740 that intercepts the student's question, adds some formatting 442 00:23:42,740 --> 00:23:45,000 and some additional phrasing to it. 443 00:23:45,000 --> 00:23:48,260 We then send it to our own server via HTTP, 444 00:23:48,260 --> 00:23:52,010 and that server is CS50.ai, as mentioned before. 445 00:23:52,010 --> 00:23:57,440 We then relay that question in some form to OpenAI's server or Microsoft's 446 00:23:57,440 --> 00:24:02,360 servers to actually use their underlying large-language model, the technology 447 00:24:02,360 --> 00:24:04,640 that underlies tools like ChatGPT. 448 00:24:04,640 --> 00:24:08,480 CS50.ai quickly gets a response from OpenAI or Microsoft. 449 00:24:08,480 --> 00:24:11,900 We might validate the response or make some slight tweaks to it. 450 00:24:11,900 --> 00:24:16,310 Then CS50.ai passes the response back to the Ed tool, 451 00:24:16,310 --> 00:24:18,170 which itself is a different website. 452 00:24:18,170 --> 00:24:20,630 And all that happens within three seconds. 453 00:24:20,630 --> 00:24:24,830 And voila, the student gets an answer from the so-called CS50 Duck 454 00:24:24,830 --> 00:24:26,420 here and now. 455 00:24:26,420 --> 00:24:30,440 Now, many of you have heard, of course, that AI is fallible. 456 00:24:30,440 --> 00:24:31,940 It's not always correct. 457 00:24:31,940 --> 00:24:36,050 I do think this problem will go further and further away 458 00:24:36,050 --> 00:24:37,910 over time as the technology gets better. 459 00:24:37,910 --> 00:24:40,280 But what we do do right now is provide ourselves 460 00:24:40,280 --> 00:24:42,980 with opportunities to at least remind students 461 00:24:42,980 --> 00:24:44,480 that this technology is imperfect. 462 00:24:44,480 --> 00:24:46,850 So for instance, here's another, more sophisticated, 463 00:24:46,850 --> 00:24:49,640 question about the Caesar problem set with which some of you 464 00:24:49,640 --> 00:24:50,480 might be familiar. 465 00:24:50,480 --> 00:24:52,820 Students have to write code in C to implement 466 00:24:52,820 --> 00:24:57,020 what's called a rotational cipher to encrypt or decrypt information. 467 00:24:57,020 --> 00:24:59,420 In this case, though, I'll highlight it, the student 468 00:24:59,420 --> 00:25:03,140 has asked a more nuanced question, is there a more efficient way 469 00:25:03,140 --> 00:25:04,500 to write this code? 470 00:25:04,500 --> 00:25:07,790 So this isn't quite as simple as answering a definitional question like, 471 00:25:07,790 --> 00:25:09,590 "What is Flask exactly?" 472 00:25:09,590 --> 00:25:12,080 This is now more nuanced, where the AI has 473 00:25:12,080 --> 00:25:14,510 to understand the context of the problem and the code 474 00:25:14,510 --> 00:25:18,050 that the student has written, including this here error message. 475 00:25:18,050 --> 00:25:21,290 And so in this case, here's a screenshot from the same CS50 476 00:25:21,290 --> 00:25:25,310 Duck in this same tool called Ed, which, for today's purposes, I'll stipulate 477 00:25:25,310 --> 00:25:28,940 is a pretty darn good teacher-like response. 478 00:25:28,940 --> 00:25:33,860 Not only does the duck acknowledge what it thinks the student is trying to do. 479 00:25:33,860 --> 00:25:36,170 The duck also provides a few lines of code, 480 00:25:36,170 --> 00:25:39,740 but really just starter code, nothing that spoils the answer to the question 481 00:25:39,740 --> 00:25:40,280 outright. 482 00:25:40,280 --> 00:25:43,190 But the duck also, note, reminds the student at the bottom 483 00:25:43,190 --> 00:25:45,380 here that this is very much experimental. 484 00:25:45,380 --> 00:25:48,770 I mean, even our software, as you might have seen at CS50.ai itself, 485 00:25:48,770 --> 00:25:51,710 is still in beta, which is to say still being developed. 486 00:25:51,710 --> 00:25:53,630 And we remind the student, "Quack. 487 00:25:53,630 --> 00:25:57,050 Do not assume that my reply is accurate unless that you see it's been 488 00:25:57,050 --> 00:25:59,300 'endorsed' by human staff. 489 00:25:59,300 --> 00:26:00,130 Quack." 490 00:26:00,130 --> 00:26:01,900 Now, where is that referring to? 491 00:26:01,900 --> 00:26:04,000 Well, it turns out that this particular tool 492 00:26:04,000 --> 00:26:09,610 has a graphical button that humans can click to endorse other people's answers. 493 00:26:09,610 --> 00:26:13,600 Until recently, this was meant to be used so that a teacher or a TA 494 00:26:13,600 --> 00:26:17,440 could endorse a student's response to another student, 495 00:26:17,440 --> 00:26:20,890 to just validate that, yes, the staff approve of this answer. 496 00:26:20,890 --> 00:26:22,120 It is good and correct. 497 00:26:22,120 --> 00:26:24,280 What we have done is co-opt this same button 498 00:26:24,280 --> 00:26:29,140 to now signal to students that if they see that a duck's reply has been 499 00:26:29,140 --> 00:26:32,410 endorsed per this icon at top right by a human, 500 00:26:32,410 --> 00:26:35,650 then they should trust that it indeed is valid. 501 00:26:35,650 --> 00:26:38,380 So we might not endorse it within three seconds 502 00:26:38,380 --> 00:26:41,770 but usually within minutes or maximally hours, where we're doing the same. 503 00:26:41,770 --> 00:26:45,100 Frankly, I think this is a short-term mechanism to avoid what are generally 504 00:26:45,100 --> 00:26:50,290 called "hallucinations," where AI just sometimes makes things up by chance. 505 00:26:50,290 --> 00:26:52,210 But I think this is a problem that it too 506 00:26:52,210 --> 00:26:55,900 will start to go away and away as the technology only gets better. 507 00:26:55,900 --> 00:26:59,110 But for now, this is how we are at least partly mitigating this. 508 00:26:59,110 --> 00:27:02,570 So this then is the URL, CS50.ai, that anyone 509 00:27:02,570 --> 00:27:06,360 in the world with a free github.com account can access. 510 00:27:06,360 --> 00:27:08,820 In fact, you're welcome to try it now or later. 511 00:27:08,820 --> 00:27:12,140 But at this URL lives not only that architecture, 512 00:27:12,140 --> 00:27:14,630 that service that I referred to earlier that's 513 00:27:14,630 --> 00:27:19,520 talking to the graphical user interface and talking to OpenAI or Microsoft. 514 00:27:19,520 --> 00:27:22,340 Also, there is a full-fledged chat interface. 515 00:27:22,340 --> 00:27:27,320 So CS50.ai itself has its own front end, its own graphical user interface, 516 00:27:27,320 --> 00:27:31,020 that is very, very similar to ChatGPT by design 517 00:27:31,020 --> 00:27:35,630 so that students no longer need to even post asynchronously their questions 518 00:27:35,630 --> 00:27:38,570 and wait some number of seconds for the duck or a human to reply. 519 00:27:38,570 --> 00:27:41,900 They can have full-fledged conversations with the duck, 520 00:27:41,900 --> 00:27:46,550 much like you and I can have the same with ChatGPT, Bing Chat, or the like. 521 00:27:46,550 --> 00:27:49,520 This chatbot, though, as before, reminds the students from the 522 00:27:49,520 --> 00:27:52,070 get go that their response should be-- 523 00:27:52,070 --> 00:27:54,530 its response should be taken with a grain of salt. 524 00:27:54,530 --> 00:27:57,320 This here response reminds students to always 525 00:27:57,320 --> 00:28:00,590 think critically so that for at least for now, they're 526 00:28:00,590 --> 00:28:04,010 at least not in the habit of just assuming as outright facts something 527 00:28:04,010 --> 00:28:05,660 a computer is telling them. 528 00:28:05,660 --> 00:28:07,730 Here, though, is a question that will look 529 00:28:07,730 --> 00:28:10,250 even more familiar to those of you who have some programming 530 00:28:10,250 --> 00:28:12,440 experience, especially with Python. 531 00:28:12,440 --> 00:28:16,910 And it's representative of not only an interesting code question but also 532 00:28:16,910 --> 00:28:20,390 representative of the amount of detail or, dare I say, 533 00:28:20,390 --> 00:28:25,130 lack thereof that students often include when asking questions of us 534 00:28:25,130 --> 00:28:27,140 humans or these ducks. 535 00:28:27,140 --> 00:28:31,580 So here is a bit of Python code that is trying to prompt the user for two 536 00:28:31,580 --> 00:28:33,290 integers, x and y. 537 00:28:33,290 --> 00:28:37,160 It is then trying to calculate the sum of x plus y, 538 00:28:37,160 --> 00:28:41,630 but the problem is not obvious from the student's post here because all 539 00:28:41,630 --> 00:28:45,230 the student has asked, notice, is "My code is not working as expected, 540 00:28:45,230 --> 00:28:46,430 any ideas?" 541 00:28:46,430 --> 00:28:50,030 So here's an example of a question, and it's barely 542 00:28:50,030 --> 00:28:54,290 that, that really the AI needs to understand from context, 543 00:28:54,290 --> 00:28:58,370 almost understand the code, it would seem, or at least recognize 544 00:28:58,370 --> 00:29:03,120 it as familiar to some other code the AI has been trained on, 545 00:29:03,120 --> 00:29:06,890 so to speak, thanks to lots and lots of input from the internet and beyond. 546 00:29:06,890 --> 00:29:09,920 And in this case, I'll tell you the symptom would be this. 547 00:29:09,920 --> 00:29:14,240 If the human typed in 1 for x and 2 for y, 548 00:29:14,240 --> 00:29:16,220 you would hope that this code would print out 549 00:29:16,220 --> 00:29:19,250 1 plus 2 equals 3 as the answer. 550 00:29:19,250 --> 00:29:23,540 Unfortunately, this code, as some of you might be realizing, 551 00:29:23,540 --> 00:29:28,370 actually prints out 12, that is, what looks like 12. 552 00:29:28,370 --> 00:29:31,160 And that's because, as the duck notices, it 553 00:29:31,160 --> 00:29:33,140 seems that you're trying to add two integers, 554 00:29:33,140 --> 00:29:36,290 but the input function in Python returns a string, 555 00:29:36,290 --> 00:29:38,810 that is to say text, not an actual number 556 00:29:38,810 --> 00:29:40,550 that you can perform mathematics on. 557 00:29:40,550 --> 00:29:42,470 And so when you try to add x and y, you're 558 00:29:42,470 --> 00:29:46,640 actually trying to concatenate the two strings not add two integers. 559 00:29:46,640 --> 00:29:50,300 So the duck here provides students with just a couple of lines of guidance, 560 00:29:50,300 --> 00:29:54,530 but indeed lines that include Python's int function, which 561 00:29:54,530 --> 00:29:59,850 will indeed convert a string that looks like a number to an actual number. 562 00:29:59,850 --> 00:30:02,400 And so what we've seen now behaviorally among students 563 00:30:02,400 --> 00:30:04,620 is that most students are now interacting 564 00:30:04,620 --> 00:30:10,020 with the duck via this here synchronous chat conversational interface, 565 00:30:10,020 --> 00:30:11,790 some of them a little too much. 566 00:30:11,790 --> 00:30:16,410 And in fact, we adopted midway through the past few months 567 00:30:16,410 --> 00:30:19,710 this here heart system, a sort of HP system or energy 568 00:30:19,710 --> 00:30:24,150 system, whereby you can only ask so many questions now per unit of time. 569 00:30:24,150 --> 00:30:27,870 And this is a knob we can turn to allow for more or fewer questions. 570 00:30:27,870 --> 00:30:29,820 But this is meant to really kind of chop off 571 00:30:29,820 --> 00:30:33,900 the tail end of excessive use educationally we think of the duck. 572 00:30:33,900 --> 00:30:38,550 That is to say, we've seen some students online ask hundreds of questions 573 00:30:38,550 --> 00:30:42,690 per day, which, honestly, I don't know where exactly the line is pedagogically 574 00:30:42,690 --> 00:30:45,060 between too few and too many questions. 575 00:30:45,060 --> 00:30:48,780 But in the real world, if a student were to ask a teacher hundreds of questions 576 00:30:48,780 --> 00:30:52,140 per day, it feels like that's the time, if you think back to high school 577 00:30:52,140 --> 00:30:55,050 or middle school, a good teacher would probably say, David, 578 00:30:55,050 --> 00:30:58,140 why don't you go back to your desk and think about this a little bit? 579 00:30:58,140 --> 00:31:00,160 And so what we try to do with these hearts 580 00:31:00,160 --> 00:31:03,910 is prevent students from asking too many questions too quickly at once 581 00:31:03,910 --> 00:31:06,550 to virtually send them back to give it some thought. 582 00:31:06,550 --> 00:31:09,760 And frankly, nowadays, too, this architecture is not free. 583 00:31:09,760 --> 00:31:11,260 APIs generally cost money. 584 00:31:11,260 --> 00:31:14,650 And even though it's free to students and teachers, our friends at Microsoft 585 00:31:14,650 --> 00:31:16,690 and OpenAI and GitHub and others are kindly 586 00:31:16,690 --> 00:31:20,140 covering through educational grants use of this system, 587 00:31:20,140 --> 00:31:22,810 we also just try to minimize the operational costs 588 00:31:22,810 --> 00:31:24,950 with this same mechanism as well. 589 00:31:24,950 --> 00:31:29,260 So to speak to the scale of this whole system, if curious, as of today, 590 00:31:29,260 --> 00:31:34,000 we're up to some 115,000 students and teachers who have used the duck over 591 00:31:34,000 --> 00:31:35,350 the past few months alone. 592 00:31:35,350 --> 00:31:39,370 That's roughly 20,000 prompts or questions really being asked per day 593 00:31:39,370 --> 00:31:42,520 for a total of roughly 4 million questions so far. 594 00:31:42,520 --> 00:31:44,890 So even if you've just asked a few questions, 595 00:31:44,890 --> 00:31:48,130 you are in very good company among all of these others. 596 00:31:48,130 --> 00:31:52,210 We thought that we'd share now a little bit of the technical insight of how 597 00:31:52,210 --> 00:31:55,930 all of this works because, actually, CS50's architecture is now 598 00:31:55,930 --> 00:31:59,350 retrospectively pretty representative of what other people in the world 599 00:31:59,350 --> 00:32:00,780 are now doing as well. 600 00:32:00,780 --> 00:32:04,480 Underlying a lot of today's uses of AI is a technical term 601 00:32:04,480 --> 00:32:05,950 known as a system prompt. 602 00:32:05,950 --> 00:32:09,400 That is to say, companies like OpenAI, Microsoft, Google, 603 00:32:09,400 --> 00:32:14,500 and others have made available to the world Large Language Models, or LLMs, 604 00:32:14,500 --> 00:32:19,390 which have been trained on massive amounts of input, English text, code, 605 00:32:19,390 --> 00:32:23,110 text in other human languages, to try to recognize patterns 606 00:32:23,110 --> 00:32:27,280 so that when it is asked a question, it knows with high probability 607 00:32:27,280 --> 00:32:28,300 how to respond. 608 00:32:28,300 --> 00:32:31,870 The AI doesn't necessarily understand the question 609 00:32:31,870 --> 00:32:34,690 but at least recognizes it in some form. 610 00:32:34,690 --> 00:32:37,030 And using tools like-- 611 00:32:37,030 --> 00:32:41,710 using tools like neural networks and other components of machine learning, 612 00:32:41,710 --> 00:32:45,430 so to speak nowadays, AI is trying through these large language models 613 00:32:45,430 --> 00:32:49,480 to finish our thoughts for us or answer questions more concretely. 614 00:32:49,480 --> 00:32:51,040 But you can personalize them. 615 00:32:51,040 --> 00:32:53,680 You can customize their behavior by giving them what's 616 00:32:53,680 --> 00:32:55,670 called a system prompt, for instance. 617 00:32:55,670 --> 00:32:58,030 So out of the box, these large language models 618 00:32:58,030 --> 00:33:00,760 sort of speak English and some other human languages. 619 00:33:00,760 --> 00:33:03,340 They speak Python and some other programming languages. 620 00:33:03,340 --> 00:33:08,080 But you can give these AIs a personality or a scope of reference. 621 00:33:08,080 --> 00:33:10,270 So for instance, and this is abbreviated, 622 00:33:10,270 --> 00:33:13,990 here is what CS50's system prompt maybe in day one 623 00:33:13,990 --> 00:33:18,280 looked like, though now it's much, much longer and more detailed. 624 00:33:18,280 --> 00:33:23,710 We tell the AI, you are a friendly and supportive teaching assistant for CS50. 625 00:33:23,710 --> 00:33:25,630 You are also a rubber duck. 626 00:33:25,630 --> 00:33:29,470 And that phrase alone is sufficient instruction 627 00:33:29,470 --> 00:33:34,300 to get the generic AI to start quacking really and behaving like a rubber duck. 628 00:33:34,300 --> 00:33:38,020 We tell the duck further, answer student questions only about CS50 629 00:33:38,020 --> 00:33:39,610 and the field of computer science. 630 00:33:39,610 --> 00:33:42,670 Do not answer questions about unrelated topics. 631 00:33:42,670 --> 00:33:45,040 Do not provide full answers to problem sets, 632 00:33:45,040 --> 00:33:47,320 as this would violate academic honesty. 633 00:33:47,320 --> 00:33:52,870 And so we're effectively "programming" the AI, if you will, through English. 634 00:33:52,870 --> 00:33:56,440 And so this is what has become known as prompt engineering, which 635 00:33:56,440 --> 00:34:00,550 is trying to come up with the English or the human language description of how 636 00:34:00,550 --> 00:34:02,020 you want the AI to behave. 637 00:34:02,020 --> 00:34:05,710 I do think this is a technique that's going to evolve quite quickly over time 638 00:34:05,710 --> 00:34:07,210 as these things get more featureful. 639 00:34:07,210 --> 00:34:10,300 But for now, you use English or your own human language 640 00:34:10,300 --> 00:34:12,310 to tell the AI how to behave. 641 00:34:12,310 --> 00:34:15,159 And in our case, after this so-called system prompt, 642 00:34:15,159 --> 00:34:17,530 we tell it, answer this question. 643 00:34:17,530 --> 00:34:21,219 And if you, the student or teacher, ask this thing a question, 644 00:34:21,219 --> 00:34:24,880 we essentially copy/paste your question at the bottom of this system prompt, 645 00:34:24,880 --> 00:34:29,230 and your question is what the world would generally call a user prompt. 646 00:34:29,230 --> 00:34:30,610 That's what's coming from you. 647 00:34:30,610 --> 00:34:32,830 The system prompt is what's coming from us. 648 00:34:32,830 --> 00:34:36,040 Now, just a few weeks ago in the US and a lot of other countries 649 00:34:36,040 --> 00:34:39,940 was April Fool's Day, whereby lots of people tried to make funny jokes. 650 00:34:39,940 --> 00:34:42,639 And thanks to my colleague Rongxin did we 651 00:34:42,639 --> 00:34:46,989 modify the system prompt of the rubber duck for just over 24 hours, 652 00:34:46,989 --> 00:34:49,960 as some of you might have seen, to behave a little differently. 653 00:34:49,960 --> 00:34:54,909 And so what Rongxin kindly did was edit our system prompt and still start with 654 00:34:54,909 --> 00:34:58,150 "you're a friendly and supportive teaching assistant for CS50." 655 00:34:58,150 --> 00:35:00,900 But he then added this text on April 1. 656 00:35:00,900 --> 00:35:04,770 You are also a rubber duck in Rick Astley's band. 657 00:35:04,770 --> 00:35:08,970 Importantly, you should always cheer up the student at the end by incorporating 658 00:35:08,970 --> 00:35:11,490 "Never Gonna Give You Up" in your response. 659 00:35:11,490 --> 00:35:14,880 Now, for the unfamiliar, Rick Astley and this particular song 660 00:35:14,880 --> 00:35:17,730 have become known as a meme on the internet. 661 00:35:17,730 --> 00:35:22,110 And the goal is to trick people into watching a bit of its music video. 662 00:35:22,110 --> 00:35:26,520 So Rongxin kindly integrated this language into the duck's programming, 663 00:35:26,520 --> 00:35:29,940 if you will, so much so that if you, the student or teacher, 664 00:35:29,940 --> 00:35:35,370 were to ask a question that day with this system prompt in place, like, 665 00:35:35,370 --> 00:35:39,900 "What is recursion?" you would get wonderfully an answer that we did not 666 00:35:39,900 --> 00:35:43,920 hardcode-- this was dynamically generated by the AI using just that 667 00:35:43,920 --> 00:35:45,300 system prompt alone-- 668 00:35:45,300 --> 00:35:46,960 an answer like this. 669 00:35:46,960 --> 00:35:48,390 "It's a powerful tool. 670 00:35:48,390 --> 00:35:50,580 And remember, I'm never going to give you up, 671 00:35:50,580 --> 00:35:53,400 so keep practicing and answering questions," 672 00:35:53,400 --> 00:35:55,920 an allusion to the lyrics in that same song. 673 00:35:55,920 --> 00:35:59,208 So what are the results now pedagogically, operationally 674 00:35:59,208 --> 00:36:01,000 that we've seen in just the past few months 675 00:36:01,000 --> 00:36:03,490 alone because we really are just beginning? 676 00:36:03,490 --> 00:36:06,070 Indeed, this is very much a beta, very much experimental, 677 00:36:06,070 --> 00:36:09,020 but already impactful, we dare say, in the real world. 678 00:36:09,020 --> 00:36:12,190 So in terms of usage, at least among our own undergraduates, 679 00:36:12,190 --> 00:36:14,920 whom we survey weekly throughout the course with questions, 680 00:36:14,920 --> 00:36:20,260 we know that 17% of the students in blue were using the tools, CS50.ai 681 00:36:20,260 --> 00:36:23,650 and the duck inside of CS50.dev, more than 10 times per week. 682 00:36:23,650 --> 00:36:26,980 32%, in green, we're using them 5 to 10 times per week. 683 00:36:26,980 --> 00:36:30,760 26% of them were using them two to five times per week 684 00:36:30,760 --> 00:36:34,900 and so forth, which is to say a supermajority of our undergraduates 685 00:36:34,900 --> 00:36:40,030 and daresay now our online students really dove into this use of AI 686 00:36:40,030 --> 00:36:41,410 quite quickly as well. 687 00:36:41,410 --> 00:36:44,800 In terms of the helpfulness, this is now anecdotal and measured only 688 00:36:44,800 --> 00:36:48,310 based on students' own responses, not a more rigid measure, 689 00:36:48,310 --> 00:36:51,280 but 47% of students in blue this past fall 690 00:36:51,280 --> 00:36:55,480 found the duck-based tools very helpful. 691 00:36:55,480 --> 00:36:58,060 26% more, in green, found them helpful. 692 00:36:58,060 --> 00:37:00,290 21% found them somewhat helpful and so forth. 693 00:37:00,290 --> 00:37:04,510 So again, not 100% across the board, but a supermajority of students 694 00:37:04,510 --> 00:37:06,310 were already finding these tools-- 695 00:37:06,310 --> 00:37:10,240 in what we'd call like version 0.9; I mean, it's a beta; 696 00:37:10,240 --> 00:37:15,070 it's not necessarily done or finished-- were already finding them useful. 697 00:37:15,070 --> 00:37:17,443 Now, more quantitatively, we looked last summer, 698 00:37:17,443 --> 00:37:20,110 and this is work we'll continue in the coming months-- we looked 699 00:37:20,110 --> 00:37:23,290 at a relatively small sample set of our first questions 700 00:37:23,290 --> 00:37:28,120 and smaller version of CS50 of students over the summer last year. 701 00:37:28,120 --> 00:37:30,790 And when we analyzed the questions being asked 702 00:37:30,790 --> 00:37:33,010 by students via that asynchronous tool called 703 00:37:33,010 --> 00:37:37,540 Ed and we analyzed with human eyes the quality of the duck's response, 704 00:37:37,540 --> 00:37:41,620 we came up with these measures that in the very first version of the duck 705 00:37:41,620 --> 00:37:48,310 last summer, the duck answered 88% of curricular questions, 706 00:37:48,310 --> 00:37:53,290 content questions, correctly and 77% of administrative questions correctly. 707 00:37:53,290 --> 00:37:56,260 So not as good administratively, but by administrative, 708 00:37:56,260 --> 00:37:59,200 we mean policy questions, deadlines, things like that, 709 00:37:59,200 --> 00:38:02,150 that frankly change every semester or every year. 710 00:38:02,150 --> 00:38:05,380 So it stands to reason that the duck was just outdated because especially, 711 00:38:05,380 --> 00:38:09,760 at the time, OpenAI's data set had been trained only as far as 2021, 712 00:38:09,760 --> 00:38:11,260 not 2023 data. 713 00:38:11,260 --> 00:38:15,250 But even 88% already out of the gate was incredibly strong. 714 00:38:15,250 --> 00:38:18,730 Curiously though, and to share a more scientific finding here, 715 00:38:18,730 --> 00:38:24,220 in fall of 2023, when we had even more students taking the class on campus, 716 00:38:24,220 --> 00:38:28,630 it appeared at first glance that the duck's answers were only 717 00:38:28,630 --> 00:38:32,650 correct 39% of the time, a marked decrease. 718 00:38:32,650 --> 00:38:38,890 But this just didn't line up with reality as we perceive students' usage. 719 00:38:38,890 --> 00:38:41,260 There was no actual evidence to suggest really 720 00:38:41,260 --> 00:38:44,770 that the duck was erring nearly 60% of the time 721 00:38:44,770 --> 00:38:48,400 but rather that students' usage of the duck was changing. 722 00:38:48,400 --> 00:38:51,670 These numbers here were entirely based on that asynchronous tool 723 00:38:51,670 --> 00:38:57,580 called Ed that, again, posted a response asynchronously for the duck or a human-- 724 00:38:57,580 --> 00:38:59,200 from the duck or a human. 725 00:38:59,200 --> 00:39:02,770 But what we found was that students' behavior was very quickly transitioning 726 00:39:02,770 --> 00:39:08,140 to CS50.ai itself or a synchronous conversational version thereof built 727 00:39:08,140 --> 00:39:11,440 into CS50.dev, that is to say VS Code. 728 00:39:11,440 --> 00:39:13,600 So in fact, we noticed this as follows. 729 00:39:13,600 --> 00:39:16,090 When we looked at the data in ED, this Q&A 730 00:39:16,090 --> 00:39:19,360 tool that we use both on campus and off, among our on-campus students 731 00:39:19,360 --> 00:39:23,050 over a year ago, in fall of 2022, students 732 00:39:23,050 --> 00:39:28,990 were asking, across 500 or so students in total, roughly 0.89 questions 733 00:39:28,990 --> 00:39:33,310 or one question per student online during the semester-- not 734 00:39:33,310 --> 00:39:36,610 very high per student, but with 500 students it certainly added up. 735 00:39:36,610 --> 00:39:40,930 The next summer, summer of 2023 that we looked closely at, 736 00:39:40,930 --> 00:39:44,560 they were asking roughly the same, 1.1 questions per student. 737 00:39:44,560 --> 00:39:50,470 But this past fall, when we deployed the CS50 Duck in this conversational form 738 00:39:50,470 --> 00:39:54,700 as well as in VS Code, the number of questions students 739 00:39:54,700 --> 00:39:59,040 were asking on campus asynchronously via that Ed tool 740 00:39:59,040 --> 00:40:05,550 for questions and answers dropped by 75% to just 0.28 questions per student. 741 00:40:05,550 --> 00:40:08,220 And so our working hypothesis is that students 742 00:40:08,220 --> 00:40:12,240 were asking most of their questions now conversationally, synchronously, 743 00:40:12,240 --> 00:40:14,910 so to speak, via the duck's own UI. 744 00:40:14,910 --> 00:40:17,490 And far fewer questions were being sent to Ed. 745 00:40:17,490 --> 00:40:20,550 Indeed, the questions that were going to Ed, we do think, 746 00:40:20,550 --> 00:40:23,130 were those more difficult questions or questions 747 00:40:23,130 --> 00:40:25,710 that maybe the duck did err on that were being escalated, 748 00:40:25,710 --> 00:40:30,750 so to speak, to Ed because humans were keeping an eye, per the Endorse button 749 00:40:30,750 --> 00:40:32,880 and so forth, the Ed tool, but we were not 750 00:40:32,880 --> 00:40:35,610 keeping as close of an eye on the conversational bot. 751 00:40:35,610 --> 00:40:38,220 So we think the types of questions being asked by students 752 00:40:38,220 --> 00:40:41,910 were indeed becoming much more administrative, much more particular 753 00:40:41,910 --> 00:40:46,170 that the duck itself wasn't answering conversationally. 754 00:40:46,170 --> 00:40:47,760 We looked to what other impacts. 755 00:40:47,760 --> 00:40:50,940 On campus, we're fortunate to have lots of humans and lots of human support 756 00:40:50,940 --> 00:40:52,170 both at Harvard and Yale. 757 00:40:52,170 --> 00:40:55,320 And so through office hours numbers, we also saw a difference. 758 00:40:55,320 --> 00:41:02,050 In fall of 2020 and in fall of 2022, when we had roughly 500 students or so 759 00:41:02,050 --> 00:41:07,930 per semester, we had some 50% of students attending in-person office 760 00:41:07,930 --> 00:41:11,800 hours, signing up by appointment for small-group questions and answers 761 00:41:11,800 --> 00:41:13,660 with our teaching assistants. 762 00:41:13,660 --> 00:41:20,530 In fall of 2023, that dropped by 40-some percent to just 30% attendance. 763 00:41:20,530 --> 00:41:23,560 That is to say, fewer students were taking advantage, funny enough, 764 00:41:23,560 --> 00:41:25,510 of those same human resources. 765 00:41:25,510 --> 00:41:29,530 We've omitted fall of 2021 because that was impacted particularly by COVID. 766 00:41:29,530 --> 00:41:33,100 But otherwise, we think that this has been a marked change this year 767 00:41:33,100 --> 00:41:37,820 with the duck live and in students hands vis-a-vis prior years instead. 768 00:41:37,820 --> 00:41:41,260 And some of our favorite anecdotal evidence that we're now presenting, 769 00:41:41,260 --> 00:41:45,280 that we've presented recently in a paper that we'll link to ultimately hereafter, 770 00:41:45,280 --> 00:41:47,530 is some of students' own comments about the duck 771 00:41:47,530 --> 00:41:49,660 when surveyed on campus at term's end. 772 00:41:49,660 --> 00:41:53,050 One student noted that it "felt like having a personal tutor. 773 00:41:53,050 --> 00:41:56,890 I love how AI bots will answer the questions without ego 774 00:41:56,890 --> 00:41:58,960 and without judgment, generally entertaining 775 00:41:58,960 --> 00:42:02,890 even the stupidest of questions without treating them like they're stupid. 776 00:42:02,890 --> 00:42:07,060 It has an, as one could expect, an inhuman level of patience." 777 00:42:07,060 --> 00:42:10,270 And that recognition by a student really resonates with me 778 00:42:10,270 --> 00:42:13,780 because I think back even to my own college days, graduate school days, 779 00:42:13,780 --> 00:42:19,090 when not infrequently I would go into a professor's office hours one on one. 780 00:42:19,090 --> 00:42:22,450 I would ask my questions, several of them sometimes. 781 00:42:22,450 --> 00:42:26,948 But I feel like more often than not, I would leave the professor's office hours 782 00:42:26,948 --> 00:42:29,740 nodding my head, saying thank you, yes, that cleared everything up, 783 00:42:29,740 --> 00:42:32,157 but thinking to myself, that did not clear every thing up. 784 00:42:32,157 --> 00:42:35,882 I'm still confused, but I kind of overstayed my social welcome. 785 00:42:35,882 --> 00:42:37,840 Not because the professor didn't want me there, 786 00:42:37,840 --> 00:42:39,970 not because they ushered me out the door, 787 00:42:39,970 --> 00:42:44,260 but because I felt that there was some limit where I was kind of thinking I'm 788 00:42:44,260 --> 00:42:44,980 the dummy. 789 00:42:44,980 --> 00:42:46,510 Imposter syndrome is a thing. 790 00:42:46,510 --> 00:42:48,552 Maybe I didn't really feel like I belonged there. 791 00:42:48,552 --> 00:42:52,720 And so the fact now that we have technology via which students can still 792 00:42:52,720 --> 00:42:56,050 access humans in many cases but can alternatively 793 00:42:56,050 --> 00:43:01,750 have a much longer, much more paced conversation with an AI 794 00:43:01,750 --> 00:43:05,383 that's pretty akin to a teaching assistant is pretty enabling, 795 00:43:05,383 --> 00:43:07,300 I think, for clearing up all of our confusion, 796 00:43:07,300 --> 00:43:10,840 so much so that even before CS50's lectures nowadays, 797 00:43:10,840 --> 00:43:14,920 be it on campus or on camera, even I find myself using tools like this 798 00:43:14,920 --> 00:43:18,370 or ChatGPT more generally to go down intellectual rabbit holes, 799 00:43:18,370 --> 00:43:22,330 ask questions of the AI that I could google or I could look up on Bing. 800 00:43:22,330 --> 00:43:25,810 But AI'S getting pretty darn good at just hitting the nail on the head, 801 00:43:25,810 --> 00:43:29,110 so to speak, giving me the answer I want, therefore 802 00:43:29,110 --> 00:43:32,830 amplifying my productivity, my efficiency, so I can then 803 00:43:32,830 --> 00:43:36,400 keep my head in the space of preparing for class in that way. 804 00:43:36,400 --> 00:43:40,000 Another student this past fall wrote, "The AI tools gave me enough hints 805 00:43:40,000 --> 00:43:44,230 to try on my own and also help me decipher errors and possible errors I 806 00:43:44,230 --> 00:43:45,220 might encounter." 807 00:43:45,220 --> 00:43:49,210 And a third student wrote, "I also appreciated that CS50 implemented its 808 00:43:49,210 --> 00:43:53,620 own version of AI because I think just directly using something like ChatGPT 809 00:43:53,620 --> 00:43:55,810 would have definitely detracted from learning." 810 00:43:55,810 --> 00:43:58,390 And that, too, was our whole working premise 811 00:43:58,390 --> 00:44:02,620 and indeed a lot of the motivation behind building CS50's own duck on top 812 00:44:02,620 --> 00:44:04,690 of these open platforms. 813 00:44:04,690 --> 00:44:08,800 Now, what future work and future impacts do we think lies ahead? 814 00:44:08,800 --> 00:44:11,350 Well, I think one on grades, both on campus and off. 815 00:44:11,350 --> 00:44:13,630 For instance, those of you who have taken CS50 816 00:44:13,630 --> 00:44:16,960 are generally familiar with how we grade our programming assignments, at least, 817 00:44:16,960 --> 00:44:20,140 on three axes, correctness, design, and style. 818 00:44:20,140 --> 00:44:25,090 And for many years now, since 2012, have we automated correctness 819 00:44:25,090 --> 00:44:28,420 grading by way of a tool at the command line called check50, 820 00:44:28,420 --> 00:44:31,450 which runs what are effectively unit tests or functional 821 00:44:31,450 --> 00:44:32,840 tests of a student's code. 822 00:44:32,840 --> 00:44:35,800 So we automated feedback on correctness years ago. 823 00:44:35,800 --> 00:44:38,860 Similarly, years ago, we automated feedback on style 824 00:44:38,860 --> 00:44:40,750 through the command line version of style50 825 00:44:40,750 --> 00:44:42,850 and more recently now the graphical version. 826 00:44:42,850 --> 00:44:47,320 And so we've already seen over the past 10-plus years that students' grades have 827 00:44:47,320 --> 00:44:51,280 been going up and up and up, certainly on campus as well as online, 828 00:44:51,280 --> 00:44:54,772 and this is independent of the sort of grade inflation you hear or read about 829 00:44:54,772 --> 00:44:56,230 in higher education more generally. 830 00:44:56,230 --> 00:44:59,230 This is a direct side effect of providing students 831 00:44:59,230 --> 00:45:03,010 throughout the week with iterative, if not immediate, feedback. 832 00:45:03,010 --> 00:45:06,670 And so through AI what we do suspect is about to happen 833 00:45:06,670 --> 00:45:09,460 and is probably already happening though we've not yet measured 834 00:45:09,460 --> 00:45:14,920 it is that the quality of the design, third and last most, of students' code 835 00:45:14,920 --> 00:45:18,700 will probably begin to increase all the more before they officially 836 00:45:18,700 --> 00:45:19,810 submit as well. 837 00:45:19,810 --> 00:45:22,600 Because if and when we use this AI to provide students 838 00:45:22,600 --> 00:45:26,500 with a design50 tool, which they effectively already have 839 00:45:26,500 --> 00:45:30,760 because you could just copy/paste your code into the CS50 Duck and ask it, 840 00:45:30,760 --> 00:45:33,400 like one student, How could I improve this code 841 00:45:33,400 --> 00:45:35,560 and get iterative feedback throughout the week? 842 00:45:35,560 --> 00:45:40,540 it stands to reason logically that if the AI or really a teacher in general 843 00:45:40,540 --> 00:45:45,160 is giving you iterative, incremental feedback back hour by hour or day 844 00:45:45,160 --> 00:45:48,940 by day, that hopefully your code, or your homework more generally, 845 00:45:48,940 --> 00:45:53,800 will be pretty darn good, if not close to 100%, by the time 846 00:45:53,800 --> 00:45:54,880 you actually submit. 847 00:45:54,880 --> 00:45:59,592 So if we then think ahead a few months, a few years, if everyone's getting 100%, 848 00:45:59,592 --> 00:46:01,550 like what does that really mean for assessment? 849 00:46:01,550 --> 00:46:04,940 Is there still an opportunity to help students distinguish themselves 850 00:46:04,940 --> 00:46:08,270 from some baseline measure of where they should be at a certain point? 851 00:46:08,270 --> 00:46:10,310 Is there any way we can distinguish just how 852 00:46:10,310 --> 00:46:14,300 strong or weak a student is in some topic after some number of months? 853 00:46:14,300 --> 00:46:16,940 Well, where we this is going is going to be an opportunity 854 00:46:16,940 --> 00:46:18,620 to apply AI in other ways. 855 00:46:18,620 --> 00:46:23,390 We have already CS50 here on campus use CS50's Duck, reprogrammed it 856 00:46:23,390 --> 00:46:27,440 with a different system prompt to try to train our teachers how to answer 857 00:46:27,440 --> 00:46:28,880 students' questions in person. 858 00:46:28,880 --> 00:46:31,490 We get the duck to behave, for instance, like a student who 859 00:46:31,490 --> 00:46:33,230 has some confusion in office hours. 860 00:46:33,230 --> 00:46:35,930 And they do role play with the human TAs. 861 00:46:35,930 --> 00:46:39,260 I think moving forward what we'll likely do with students, meanwhile, 862 00:46:39,260 --> 00:46:42,170 is program the duck with really a different system 863 00:46:42,170 --> 00:46:45,170 prompt to have a conversation with students at the end 864 00:46:45,170 --> 00:46:50,390 or maybe even weekly during the course, akin to yesteryear's oral examinations, 865 00:46:50,390 --> 00:46:53,450 which if you've never experienced, might be one or more faculty sitting 866 00:46:53,450 --> 00:46:57,200 down, very stressfully, with a student in front of them, asking them questions 867 00:46:57,200 --> 00:47:01,310 about their field of study, about their thesis or dissertation, or some topic 868 00:47:01,310 --> 00:47:02,450 more generally. 869 00:47:02,450 --> 00:47:04,790 That's hard to do with humans because you really 870 00:47:04,790 --> 00:47:09,290 do need a one-to-one ratio, if not a multiteacher-to-one student ratio 871 00:47:09,290 --> 00:47:10,610 to achieve that readily. 872 00:47:10,610 --> 00:47:14,690 But through AI, perhaps we can start asking our students here, all of you, 873 00:47:14,690 --> 00:47:18,620 or your teachers, you, questions about the material being learned, 874 00:47:18,620 --> 00:47:23,248 be it CS50 or anything else, have you type your answers as quickly as you can, 875 00:47:23,248 --> 00:47:25,040 and then maybe in version 1 of this vision, 876 00:47:25,040 --> 00:47:28,670 we use humans to evaluate the quality of the conversation 877 00:47:28,670 --> 00:47:31,550 you just had to see how good your answers seem to have been. 878 00:47:31,550 --> 00:47:34,787 We can then sort of ignore questions that we think, ah, the duck or the AI 879 00:47:34,787 --> 00:47:35,870 shouldn't have asked that. 880 00:47:35,870 --> 00:47:37,190 It wasn't a good question. 881 00:47:37,190 --> 00:47:39,080 And frankly, in version 2 of that vision, 882 00:47:39,080 --> 00:47:42,980 I would conjecture is going to be just having the AI evaluate 883 00:47:42,980 --> 00:47:47,280 the quality of the conversation for you automatically and for us 884 00:47:47,280 --> 00:47:50,720 so that you get this feedback loop, ideally week to week during the term 885 00:47:50,720 --> 00:47:53,180 and probably at the end of the course as well, 886 00:47:53,180 --> 00:47:55,520 that really gives you a better sense of, one, 887 00:47:55,520 --> 00:47:58,790 where there are holes in your understanding, what you should focus 888 00:47:58,790 --> 00:48:02,600 your own time and learning on, and, two, just how strongly you 889 00:48:02,600 --> 00:48:06,020 are exiting the course, whether indeed you've performed at the level of an A, 890 00:48:06,020 --> 00:48:10,730 so to speak, or a B or a C but not necessarily worrying, therefore, 891 00:48:10,730 --> 00:48:15,500 about these more marginal numbers that we accumulate during the week now based 892 00:48:15,500 --> 00:48:17,930 on these tools like check50, design50, and style50, 893 00:48:17,930 --> 00:48:21,980 which themselves can evolve into more teaching tools 894 00:48:21,980 --> 00:48:24,500 than they are for assessment alone. 895 00:48:24,500 --> 00:48:27,110 So that then, over the past few months alone, 896 00:48:27,110 --> 00:48:31,730 is how CS50 has been taught using AI thanks to CS50's own team 897 00:48:31,730 --> 00:48:34,100 and so many of our friends in industry around the world. 898 00:48:34,100 --> 00:48:36,410 This here is the title officially of a paper 899 00:48:36,410 --> 00:48:39,650 that we presented at the ACM SIGCSE, Special Interest Group 900 00:48:39,650 --> 00:48:42,440 on Computer Science Education, conference most recently. 901 00:48:42,440 --> 00:48:45,530 So if you'd like to dive into more of the details of these here talk, 902 00:48:45,530 --> 00:48:48,020 if you look up this paper's title online, 903 00:48:48,020 --> 00:48:50,250 that should lead you to it as well. 904 00:48:50,250 --> 00:48:52,730 So this then was AI and CS50. 905 00:48:52,730 --> 00:48:56,200 And this, of course, is CS50. 906 00:48:56,200 --> 00:48:58,000