1 00:00:00,000 --> 00:00:01,083 DAVID MALAN: Hello, world. 2 00:00:01,083 --> 00:00:04,620 My name is David Malan, and I'm here in Sanders Theatre at Harvard University 3 00:00:04,620 --> 00:00:06,720 where I teach CS50, Harvard's introduction 4 00:00:06,720 --> 00:00:08,940 to the intellectual enterprises of computer science 5 00:00:08,940 --> 00:00:11,940 and the arts of programming, an introductory course for majors and non 6 00:00:11,940 --> 00:00:13,020 majors alike. 7 00:00:13,020 --> 00:00:16,010 CS50 is among Harvard's largest courses, and indeed in healthier times, 8 00:00:16,010 --> 00:00:18,510 the seats behind me would look a little something like this. 9 00:00:18,510 --> 00:00:21,030 But the course is also freely available as OpenCourseWare 10 00:00:21,030 --> 00:00:23,970 as CS50x via platforms like YouTube, and edX, and beyond. 11 00:00:23,970 --> 00:00:26,970 And the course is additionally available to teachers in the high schools 12 00:00:26,970 --> 00:00:29,520 and middle schools as CS50 AP, a curriculum 13 00:00:29,520 --> 00:00:32,130 that they can adopt or adapt for their own classroom. 14 00:00:32,130 --> 00:00:34,800 Students meanwhile can take both before and after CS50 15 00:00:34,800 --> 00:00:38,050 itself, a whole suite of courses nowadays by CS50's team, 16 00:00:38,050 --> 00:00:41,010 including courses on artificial intelligence, game development, web 17 00:00:41,010 --> 00:00:42,990 programming, and more. 18 00:00:42,990 --> 00:00:47,760 All of these courses meanwhile are available at edx.org/cs50 for free. 
19 00:00:47,760 --> 00:00:50,190 Now, over the years have we had to provide students 20 00:00:50,190 --> 00:00:51,982 with quite a bit of infrastructure in order 21 00:00:51,982 --> 00:00:54,440 to make this and now other courses possible, particularly 22 00:00:54,440 --> 00:00:56,940 toward an end to onboarding them to the world of programming 23 00:00:56,940 --> 00:01:00,420 without having to configure and install software on their own Macs and PCs, 24 00:01:00,420 --> 00:01:03,000 invariably then having technical support difficulties 25 00:01:03,000 --> 00:01:06,930 as opposed to focusing on the course's actual curriculum and the principles 26 00:01:06,930 --> 00:01:08,670 of programming itself at term's start. 27 00:01:08,670 --> 00:01:11,730 And in fact, if we roll back to 2007 at the time were students 28 00:01:11,730 --> 00:01:15,060 here only on campus being provided with shell accounts on a Unix, 29 00:01:15,060 --> 00:01:18,180 later, Linux cluster on campus for which they had a username and password. 30 00:01:18,180 --> 00:01:22,673 They had a home directory as well as access to commands like GCC and GDB 31 00:01:22,673 --> 00:01:25,590 because indeed the course introduces students to programming initially 32 00:01:25,590 --> 00:01:28,990 by way of C, followed by Python and other languages as well. 33 00:01:28,990 --> 00:01:31,950 But in 2008, we began to experiment with the cloud, 34 00:01:31,950 --> 00:01:33,540 which was in its earliest form. 35 00:01:33,540 --> 00:01:35,970 And by using AWS EC2 and other services did 36 00:01:35,970 --> 00:01:38,430 we replicate that same topology in the cloud so as 37 00:01:38,430 --> 00:01:40,740 to actually have more administrative and educational 38 00:01:40,740 --> 00:01:45,840 control over that same environment for just our students in the course itself. 
39 00:01:45,840 --> 00:01:48,450 In 2012 meanwhile, just a few years later, 40 00:01:48,450 --> 00:01:51,090 did we begin to experiment with downloadable virtual machines, 41 00:01:51,090 --> 00:01:54,918 in our case, dubbed the CS50 Appliance, a Fedora and later Ubuntu image 42 00:01:54,918 --> 00:01:57,210 that students could download on their own Macs and PCs. 43 00:01:57,210 --> 00:02:00,000 They could download something like VirtualBox or VMware 44 00:02:00,000 --> 00:02:03,630 in order to run their own CS50 environment on their own computer. 45 00:02:03,630 --> 00:02:06,690 Particularly enabling was this for the OpenCourseWare audience 46 00:02:06,690 --> 00:02:09,755 who, to date, could watch content, could read PDFs and the like, 47 00:02:09,755 --> 00:02:12,630 but they couldn't necessarily do the course's programming assignments 48 00:02:12,630 --> 00:02:16,260 or problem sets not having access of course to Harvard's shell accounts here 49 00:02:16,260 --> 00:02:16,980 on campus. 50 00:02:16,980 --> 00:02:20,188 But with the Appliance were they able to work not only on their own computers 51 00:02:20,188 --> 00:02:25,470 but even offline, engaging actively in the course's OpenCourseWare. 52 00:02:25,470 --> 00:02:29,520 But a few years later in 2015 did we begin to explore web-based alternatives 53 00:02:29,520 --> 00:02:32,520 as well to just get away from the process of needing students 54 00:02:32,520 --> 00:02:34,650 to install anything on their Macs or PCs, 55 00:02:34,650 --> 00:02:38,010 particularly in the context of schools, where teachers might themselves not 56 00:02:38,010 --> 00:02:40,560 have administrative access over the school's own labs. 
57 00:02:40,560 --> 00:02:43,290 And so we began to experiment with an open source tool 58 00:02:43,290 --> 00:02:46,890 at the time, now AWS Cloud9, on top of which we developed 59 00:02:46,890 --> 00:02:50,220 a suite of pedagogical plug-ins so that students could use just a browser 60 00:02:50,220 --> 00:02:53,100 to access still a text editor, and a terminal window, 61 00:02:53,100 --> 00:02:56,610 and all of those same tools but this time requiring only a browser. 62 00:02:56,610 --> 00:03:00,120 And most recently now in 2021 have we begun to experiment as well 63 00:03:00,120 --> 00:03:02,550 with Visual Studio Code and its suite of extensions 64 00:03:02,550 --> 00:03:04,090 replicating that same environment. 65 00:03:04,090 --> 00:03:07,710 But we're also now aspiring to provide students with an offboarding experience 66 00:03:07,710 --> 00:03:11,190 as well, whereby students at term's end will now be able to download-- 67 00:03:11,190 --> 00:03:14,490 once they are comfortable and have their footing in programming itself-- 68 00:03:14,490 --> 00:03:16,920 VS Code and any requisite extensions in order 69 00:03:16,920 --> 00:03:19,170 to have their own independent programming environment. 70 00:03:19,170 --> 00:03:21,940 Because indeed if CS50 is the only proper programming class 71 00:03:21,940 --> 00:03:23,940 they ultimately take, they can at least continue 72 00:03:23,940 --> 00:03:27,670 developing even without any of the course's infrastructure thereafter. 73 00:03:27,670 --> 00:03:29,598 So in order to achieve all of these tools, 74 00:03:29,598 --> 00:03:32,640 we were motivated by trying to solve just these three problems-- enabling 75 00:03:32,640 --> 00:03:34,620 students to write code, enabling students 76 00:03:34,620 --> 00:03:39,030 to submit code, and enabling teachers like ourselves to review that same code. 
77 00:03:39,030 --> 00:03:41,400 And underlying then all of those problems 78 00:03:41,400 --> 00:03:44,490 were a number of problems already solved, for instance, authentication, 79 00:03:44,490 --> 00:03:47,820 authorization, code reviews, container orchestration, testing, 80 00:03:47,820 --> 00:03:49,420 version control, and more. 81 00:03:49,420 --> 00:03:52,560 And even though we have had for years, myself 82 00:03:52,560 --> 00:03:56,070 included, this tendency to try to create solutions 83 00:03:56,070 --> 00:03:59,010 ourselves and implement our own homegrown solutions so as 84 00:03:59,010 --> 00:04:01,618 to get these tools just right, that's of course in tension 85 00:04:01,618 --> 00:04:04,410 with using something off the shelf, which, while already developed, 86 00:04:04,410 --> 00:04:06,700 might not do exactly what you want. 87 00:04:06,700 --> 00:04:09,780 But we've over time tried to find the sweet spot, whereby 88 00:04:09,780 --> 00:04:13,800 we use as much open source software or SaaS type tools as we can 89 00:04:13,800 --> 00:04:16,829 and then stitch them together using our own extensions, and plug-ins, 90 00:04:16,829 --> 00:04:20,422 and the like to create, ultimately, the UX that we want our students to have. 91 00:04:20,422 --> 00:04:22,630 And so in the case of these primitives, for instance, 92 00:04:22,630 --> 00:04:25,770 have we in recent years leveraged GitHub itself for all of these and more. 93 00:04:25,770 --> 00:04:29,100 For authentication, we have students log in via OAuth with their own GitHub 94 00:04:29,100 --> 00:04:30,332 usernames and passwords. 95 00:04:30,332 --> 00:04:32,040 For authorization, we've leveraged GitHub's 96 00:04:32,040 --> 00:04:34,890 support for teams so that TAs and students can have read 97 00:04:34,890 --> 00:04:36,840 and/or write access to the same repos. 
98 00:04:36,840 --> 00:04:40,260 For code reviews, we've leveraged the web UI's interface for pull requests 99 00:04:40,260 --> 00:04:43,440 or commits on which you can type per-line comments. 100 00:04:43,440 --> 00:04:46,080 Container orchestration now comes in the form of Codespaces 101 00:04:46,080 --> 00:04:49,560 online, testing via GitHub Actions, version control via Git 102 00:04:49,560 --> 00:04:51,030 itself, and more. 103 00:04:51,030 --> 00:04:53,910 And so within each of these environments over the years 104 00:04:53,910 --> 00:04:57,090 have we also developed a suite of command line tools and later web tools 105 00:04:57,090 --> 00:05:00,740 as well that leverage these primitives to solve more problems for students 106 00:05:00,740 --> 00:05:04,430 in order to uplift them in the experience of writing code 107 00:05:04,430 --> 00:05:05,210 on their own. 108 00:05:05,210 --> 00:05:08,570 Submit50, for instance, was one of the earliest of these tools, a command line 109 00:05:08,570 --> 00:05:11,870 tool that, at the end of the day, just let students submit their code using 110 00:05:11,870 --> 00:05:14,780 Git underneath the hood but without having to understand or wrestle 111 00:05:14,780 --> 00:05:18,262 with any of the mechanics of Git itself in that first week of the class, 112 00:05:18,262 --> 00:05:19,970 where they're still just learning, hello, 113 00:05:19,970 --> 00:05:22,520 world, in C or some other language. 114 00:05:22,520 --> 00:05:26,180 But for the backend then of submit50 are students really, at the end of the day, 115 00:05:26,180 --> 00:05:28,910 pushing to a repo in our own GitHub organization. 116 00:05:28,910 --> 00:05:33,560 And they reference it by way of a path, like owner/repo/branch/path that 117 00:05:33,560 --> 00:05:37,760 submit50 then uses to infer where to push the code for students. 
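The owner/repo/branch/path reference described above can be pictured with a small sketch. It is only an illustration of splitting such a string into its four parts, not submit50's actual code: the names `Slug` and `parse_slug` are hypothetical, and it assumes the branch name contains no slash.

```python
from typing import NamedTuple


class Slug(NamedTuple):
    """The four parts of an owner/repo/branch/path reference (hypothetical type)."""
    owner: str
    repo: str
    branch: str
    path: str


def parse_slug(slug: str) -> Slug:
    """Split 'owner/repo/branch/path' into its four parts.

    Assumption: the branch name has no slash, so everything after the
    third slash is treated as the path inside the repository.
    """
    parts = slug.strip("/").split("/", 3)
    if len(parts) < 4 or not all(parts):
        raise ValueError(f"expected owner/repo/branch/path, got {slug!r}")
    return Slug(*parts)
```

Limiting the split to three separators lets the path itself contain further slashes, which is one plausible way a single string could identify both a repository and a location within it.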
118 00:05:37,760 --> 00:05:40,580 In that repo meanwhile is there just a simple configuration 119 00:05:40,580 --> 00:05:45,200 file that flags that path as a valid submission path that students can use. 120 00:05:45,200 --> 00:05:48,890 And within that YAML file might there be the equivalent in YAML of something 121 00:05:48,890 --> 00:05:51,440 like a .gitignore that prescribes exactly what files 122 00:05:51,440 --> 00:05:53,390 we do and don't want them to submit. 123 00:05:53,390 --> 00:05:55,520 Now, underneath the hood all submit50 is doing 124 00:05:55,520 --> 00:05:58,920 is essentially git add, git commit, git push, and the like. 125 00:05:58,920 --> 00:06:00,740 And even for resubmitting do we even avoid 126 00:06:00,740 --> 00:06:04,280 merge conflicts or the potential thereof altogether by doing a bit of trickery 127 00:06:04,280 --> 00:06:07,490 as well so that, again, for students early on these get completely 128 00:06:07,490 --> 00:06:08,510 abstracted away. 129 00:06:08,510 --> 00:06:12,110 But towards term's end, particularly when they start to collaborate potentially 130 00:06:12,110 --> 00:06:15,350 on final projects, can we begin to take away some of these training wheels 131 00:06:15,350 --> 00:06:17,880 and have them use the native tools themselves. 132 00:06:17,880 --> 00:06:21,500 Meanwhile, to distribute students' own repositories and within it 133 00:06:21,500 --> 00:06:24,650 starter code, have we been using GitHub Classroom most recently, whereby 134 00:06:24,650 --> 00:06:27,830 students accept an assignment which has the result of copying 135 00:06:27,830 --> 00:06:31,850 a template repository for them into our own org that they and perhaps their TA 136 00:06:31,850 --> 00:06:34,280 then have read and/or write access to. 
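The ignore-style rule mentioned earlier in this passage, which prescribes what files students do and don't submit, might be sketched as an ordered list of include and exclude patterns. This is an assumption made for illustration rather than the format of the actual YAML file; the function `select_files` and the rule tuples are hypothetical.

```python
from fnmatch import fnmatch


def select_files(candidates, rules):
    """Return the candidates that survive an ordered list of rules.

    Each rule is a pair: ("include" or "exclude", glob pattern).
    Every file starts out included, and a later matching rule
    overrides an earlier one (an assumed convention for this sketch).
    """
    selected = []
    for name in candidates:
        keep = True
        for action, pattern in rules:
            if fnmatch(name, pattern):
                keep = action == "include"
        if keep:
            selected.append(name)
    return selected


# Hypothetical rules: exclude everything, then re-include C source files.
rules = [("exclude", "*"), ("include", "*.c")]
chosen = select_files(["hello.c", "hello", "notes.txt"], rules)
```

Here only `hello.c` survives, since the compiled binary and the notes match the blanket exclusion and nothing re-includes them.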
137 00:06:34,280 --> 00:06:37,760 Underneath the hood meanwhile do we then leverage webhooks or the equivalent 138 00:06:37,760 --> 00:06:39,860 thereof to use GitHub Actions, whereby we 139 00:06:39,860 --> 00:06:42,500 export a prescribed format for GitHub Classroom 140 00:06:42,500 --> 00:06:45,650 in the form of an autograding.json file that actually automates 141 00:06:45,650 --> 00:06:48,657 the process of running a suite of correctness tests, which, in our case, 142 00:06:48,657 --> 00:06:50,990 happen to use our own command line tool but could really 143 00:06:50,990 --> 00:06:53,010 be any unit testing tool or the like. 144 00:06:53,010 --> 00:06:55,010 And for more qualitative feedback do we leverage 145 00:06:55,010 --> 00:06:57,050 the web UI to provide line-by-line comments 146 00:06:57,050 --> 00:06:59,300 as might be the case with a teaching assistant working 147 00:06:59,300 --> 00:07:01,355 more closely with their own student. 148 00:07:01,355 --> 00:07:03,230 And now in our case for those automated tests 149 00:07:03,230 --> 00:07:07,640 do we use our own tool, check50, which essentially just wraps a unit testing 150 00:07:07,640 --> 00:07:09,600 suite for C, and Python, and the like. 151 00:07:09,600 --> 00:07:11,900 But that too ultimately is backed by a GitHub repo. 152 00:07:11,900 --> 00:07:16,430 So a student would run a command like check50 owner/repo/branch/path that 153 00:07:16,430 --> 00:07:20,990 simply indicates where in the web to go get those correctness tests from 154 00:07:20,990 --> 00:07:24,110 in order to run them locally on the student's own code. 
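A correctness test of the kind described above usually comes down to running the student's program on known input and comparing what it prints. The sketch below shows that idea with the Python standard library only. It does not use check50's own API, and the helper `run_check` and the stand-in "student program" are hypothetical.

```python
import subprocess
import sys


def run_check(command, stdin_text, expected_stdout):
    """Run a command, feed it input, and compare its standard output."""
    result = subprocess.run(
        command,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=10,  # guard against programs that never terminate
    )
    # Pass only on a clean exit with matching output (whitespace-trimmed).
    return result.returncode == 0 and result.stdout.strip() == expected_stdout.strip()


# Hypothetical stand-in for a student's program: greets whoever is named on stdin.
student_program = [sys.executable, "-c", "print('hello, ' + input())"]
passed = run_check(student_program, "David\n", "hello, David")
```

Because the check is just a function over a command line, the same shape would apply whether the program under test was compiled from C or written in Python.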
155 00:07:24,110 --> 00:07:27,393 Now, in the graphical context, we also have provided students and built 156 00:07:27,393 --> 00:07:30,560 on top of these primitives something we call CS50 Lab, which essentially 157 00:07:30,560 --> 00:07:34,940 embeds a text editor and terminal window alongside a rendered Markdown file, 158 00:07:34,940 --> 00:07:37,640 thereby enabling us and any teacher online 159 00:07:37,640 --> 00:07:41,105 to create in their own GitHub repository a learning experience for students. 160 00:07:41,105 --> 00:07:43,730 That might have some starter code, might have some correctness tests, 161 00:07:43,730 --> 00:07:46,910 and might also have a narrative alongside it that ultimately lives 162 00:07:46,910 --> 00:07:52,250 in a repository indicated by, again, owner/repo/branch/path inside of which 163 00:07:52,250 --> 00:07:54,890 too might be a YAML file for some minimal configuration, 164 00:07:54,890 --> 00:07:57,170 the README file itself, and the starter code. 165 00:07:57,170 --> 00:08:00,200 And within that YAML file would just be a bunch of key-value pairs 166 00:08:00,200 --> 00:08:03,380 that tell our tool how to configure the web environment exactly 167 00:08:03,380 --> 00:08:06,380 as that teacher wants for their own students. 
168 00:08:06,380 --> 00:08:09,350 Also, graphically as well within these most recent web tools, 169 00:08:09,350 --> 00:08:11,510 like the IDE and now VS Code, have we wrapped 170 00:08:11,510 --> 00:08:14,750 the process of just starting the graphical debugger by a tool called 171 00:08:14,750 --> 00:08:19,220 debug50 so that when students compile and then run their code through debug50 172 00:08:19,220 --> 00:08:22,910 does it trigger essentially a launch.json file in VS Code 173 00:08:22,910 --> 00:08:26,033 to get dynamically generated and configured so that the debugger is 174 00:08:26,033 --> 00:08:28,700 right there up and running without students manually configuring 175 00:08:28,700 --> 00:08:29,870 anything themselves. 176 00:08:29,870 --> 00:08:33,020 If you're familiar with rubber duck debugging, the process of talking 177 00:08:33,020 --> 00:08:36,500 through your logical problems in hopes of hearing any illogic with a rubber 178 00:08:36,500 --> 00:08:38,870 duck or an inanimate object, similarly, have 179 00:08:38,870 --> 00:08:41,090 we built a VS Code extension that allows students 180 00:08:41,090 --> 00:08:43,909 to talk via chat with a virtual rubber duck 181 00:08:43,909 --> 00:08:47,090 giving them hopefully a bit of a prompt for finding their problems. 182 00:08:47,090 --> 00:08:49,700 And then underlying all of these tools is not only Docker 183 00:08:49,700 --> 00:08:52,580 but now a devcontainer.json file, which allows 184 00:08:52,580 --> 00:08:54,650 us to specify a whole suite of configurations 185 00:08:54,650 --> 00:08:58,070 for students, like the extensions to preinstall, the Docker image to use, 186 00:08:58,070 --> 00:09:00,830 and the settings that can customize the graphical user 187 00:09:00,830 --> 00:09:02,780 interface they're ultimately using. 
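The debug50 step described above, in which a launch.json file gets generated dynamically, can be pictured roughly as follows. This sketch is not debug50's actual output. It assumes a compiled C program debugged with gdb through VS Code's C/C++ extension, and the function `make_launch_config` and the program name `hello` are hypothetical.

```python
import json


def make_launch_config(program, args):
    """Build a launch.json-style dictionary for debugging one program."""
    return {
        "version": "0.2.0",
        "configurations": [
            {
                "name": f"debug {program}",
                "type": "cppdbg",          # assumes the C/C++ extension's debugger
                "request": "launch",
                "program": "${workspaceFolder}/" + program,
                "args": list(args),        # command-line arguments for the program
                "cwd": "${workspaceFolder}",
                "MIMode": "gdb",           # assumes gdb is the underlying debugger
            }
        ],
    }


# Serialize the configuration as it might be written to .vscode/launch.json.
launch_json = json.dumps(make_launch_config("hello", ["--verbose"]), indent=4)
```

Generating the file from the command the student typed is what would let the debugger start without any manual configuration on the student's part.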
188 00:09:02,780 --> 00:09:05,510 Behind the scenes, meanwhile, do we use a suite of REST APIs 189 00:09:05,510 --> 00:09:07,350 in order to automate a lot of the processes, 190 00:09:07,350 --> 00:09:09,380 especially with regard to authorization. 191 00:09:09,380 --> 00:09:11,690 But ultimately, these and more tools are all 192 00:09:11,690 --> 00:09:14,660 freely available to students and teachers alike and documented 193 00:09:14,660 --> 00:09:16,730 at cs50.readthedocs.io. 194 00:09:16,730 --> 00:09:20,510 The courses themselves are all freely available at edx.org/cs50. 195 00:09:20,510 --> 00:09:24,070 And indeed, this was CS50. 196 00:09:24,070 --> 00:09:25,000