DAVID MALAN: Hello, world. My name is David Malan, and I'm here in Sanders Theatre at Harvard University where I teach CS50, Harvard's introduction to the intellectual enterprises of computer science and the art of programming, an introductory course for majors and non-majors alike. CS50 is among Harvard's largest courses, and indeed in healthier times, the seats behind me would look a little something like this. But the course is also freely available as OpenCourseWare as CS50x via platforms like YouTube, and edX, and beyond. And the course is additionally available to teachers in the high schools and middle schools as CS50 AP, a curriculum that they can adopt or adapt for their own classroom. Students meanwhile can take, both before and after CS50 itself, a whole suite of courses nowadays by CS50's team, including courses on artificial intelligence, game development, web programming, and more. All of these courses meanwhile are available at edx.org/cs50 for free. Now, over the years have we had to provide students with quite a bit of infrastructure in order to make this and now other courses possible, particularly toward the end of onboarding them to the world of programming without their having to configure and install software on their own Macs and PCs, invariably then having technical support difficulties at term's start as opposed to focusing on the course's actual curriculum and the principles of programming itself. And in fact, if we roll back to 2007, students at the time were here only on campus and were provided with shell accounts on a Unix, later Linux, cluster on campus for which they had a username and password. They had a home directory as well as access to commands like GCC and GDB because indeed the course introduces students to programming initially by way of C, followed by Python and other languages as well. But in 2008, we began to experiment with the cloud, which was in its earliest form.
And by using AWS EC2 and other services did we replicate that same topology in the cloud so as to actually have more administrative and educational control over that same environment for just our students in the course itself. In 2012 meanwhile, just a few years later, did we begin to experiment with downloadable virtual machines, in our case, dubbed the CS50 Appliance, a Fedora and later Ubuntu image that students could download on their own Macs and PCs. They could download something like VirtualBox or VMware in order to run their own CS50 environment on their own computer. Particularly enabling was this for the OpenCourseWare audience who, until then, could watch content, could read PDFs and the like, but couldn't necessarily do the course's programming assignments or problem sets, not having access of course to Harvard's shell accounts here on campus. But with the Appliance were they able to work not only on their own computers but even offline, engaging actively in the course's OpenCourseWare. But a few years later in 2015 did we begin to explore web-based alternatives as well to just get away from the process of needing students to install anything on their Macs or PCs, particularly in the context of schools, where teachers might themselves not have administrative access over the school's own labs. And so we began to experiment with an open source tool at the time, now AWS Cloud9, on top of which we developed a suite of pedagogical plug-ins so that students could still access a text editor, and a terminal window, and all of those same tools but this time requiring only a browser. And most recently now in 2021 have we begun to experiment as well with Visual Studio Code and its suite of extensions replicating that same environment.
But we're also now aspiring to provide students with an offboarding experience as well, whereby students at term's end will now be able to download-- once they are comfortable and have their footing in programming itself-- VS Code and any requisite extensions in order to have their own independent programming environment. Because indeed if CS50 is the only proper programming class they ultimately take, they can at least continue developing even without any of the course's infrastructure thereafter. So in building all of these tools, we were motivated by trying to solve just these three problems-- enabling students to write code, enabling students to submit code, and enabling teachers like ourselves to review that same code. And underlying then all of those problems were a number of problems already solved, for instance, authentication, authorization, code reviews, container orchestration, testing, version control, and more. And even though we have had for years, myself included, this tendency to try to create solutions ourselves and implement our own homegrown solutions so as to get these tools just right, that's of course in tension with using something off the shelf, which, while already developed, might not do exactly what you want. But we've over time tried to find the sweet spot, whereby we use as much open source software or SaaS-type tools as we can and then stitch them together using our own extensions, and plug-ins, and the like to create, ultimately, the UX that we want our students to have. And so in the case of these primitives, for instance, have we in recent years leveraged GitHub itself for all of these and more. For authentication, we have students log in via OAuth with their own GitHub usernames and passwords. For authorization, we've leveraged GitHub's support for teams so that TAs and students can have read and/or write access to the same repos.
For code reviews, we've leveraged the web UI for pull requests or commits, on which you can type per-line comments. Container orchestration now comes in the form of Codespaces online, testing via GitHub Actions, version control via Git itself, and more. And so within each of these environments over the years have we also developed a suite of command line tools and later web tools as well that leverage these primitives to solve more problems for students in order to uplift them in the experience of writing code on their own. Submit50, for instance, was one of the earliest of these tools, a command line tool that, at the end of the day, just let students submit their code using Git underneath the hood but without having to understand or wrestle with any of the mechanics of Git itself in that first week of the class, where they're still just learning "hello, world" in C or some other language. But for the backend then of submit50 are students really, at the end of the day, pushing to a repo in our own GitHub organization. And they reference it by way of a path, like owner/repo/branch/path, that submit50 then uses to infer where to push the code for students. In that repo meanwhile is there just a simple configuration file that flags that path as a valid submission path that students can use. And within that YAML file might there be the equivalent in YAML of something like a .gitignore that prescribes exactly what files we do and don't want them to submit. Now, underneath the hood all submit50 is doing is essentially git add, git commit, git push, and the like. And even for resubmitting do we avoid merge conflicts or the potential thereof altogether by doing a bit of trickery as well so that, again, for students early on these get completely abstracted away.
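As a rough illustration of the owner/repo/branch/path idea and the add, commit, push sequence just described, here is a minimal Python sketch. It is not submit50's actual implementation: the names `Slug`, `parse_slug`, and `plan_submission`, the `remote_url` and `dest_branch` parameters, and the commit message are all hypothetical, and it assumes the branch name contains no slash. It only plans the Git commands and does not run them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Slug:
    """The four parts of an owner/repo/branch/path identifier."""
    owner: str
    repo: str
    branch: str
    path: str


def parse_slug(slug: str) -> Slug:
    # Split on the first three slashes only, so the path itself may
    # contain slashes. Assumption: the branch name has no slash in it.
    parts = slug.strip("/").split("/", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected owner/repo/branch/path, got {slug!r}")
    return Slug(*parts)


def plan_submission(slug: str, remote_url: str, dest_branch: str) -> List[List[str]]:
    """Return the Git commands a submit-style wrapper might run.

    remote_url and dest_branch are hypothetical inputs; how a real tool
    derives them from the slug is not covered here.
    """
    parse_slug(slug)  # validate the identifier before planning anything
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", f"submit {slug}"],
        ["git", "push", remote_url, f"HEAD:refs/heads/{dest_branch}"],
    ]


if __name__ == "__main__":
    for cmd in plan_submission(
        "owner/repo/branch/path",
        "https://example.com/org/student.git",
        "path",
    ):
        print(" ".join(cmd))
```

Returning the commands as lists rather than executing them keeps the sketch safe to run anywhere; a real wrapper would pass each list to something like `subprocess.run` and handle failures.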
But towards term's end, particularly when they start to collaborate potentially on final projects, can we begin to take away some of these training wheels and have them use the native tools themselves. Meanwhile, to distribute students' own repositories and, within them, starter code, have we been using GitHub Classroom most recently, whereby students accept an assignment, which has the result of copying a template repository for them into our own org that they and perhaps their TA then have read and/or write access to. Underneath the hood meanwhile do we then leverage webhooks or the equivalent thereof to use GitHub Actions, whereby we export a prescribed format for GitHub Classroom in the form of an autograding.json file that actually automates the process of running a suite of correctness tests, which, in our case, happen to use our own command line tool but could really be any unit testing tool or the like. And for more qualitative feedback do we leverage the web UI to provide line-by-line comments as might be the case with a teaching assistant working more closely with their own student. And now in our case for those automated tests do we use our own tool, check50, which essentially just wraps a unit testing suite for C, and Python, and the like. But that too ultimately is backed by a GitHub repo. So a student would run a command like check50 owner/repo/branch/path that simply indicates where on the web to go get those correctness tests from in order to run them locally on the student's own code. Now, in the graphical context, we also have provided students and built on top of these primitives something we call CS50 Lab, which essentially embeds a text editor and terminal window alongside a rendered Markdown file, thereby enabling us and any teacher online to create in their own GitHub repository a learning experience for students.
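To make the autograding.json step mentioned above a little more concrete, here is a small Python sketch that builds such a file with one test that shells out to check50. The field names (`tests`, `name`, `setup`, `run`, `input`, `output`, `comparison`, `timeout`, `points`) follow GitHub Classroom's autograding format as I understand it and should be checked against its documentation; the function name `autograding_config`, the point value, and the timeout are assumptions for illustration, and the slug is the placeholder from the talk, not a real repository.

```python
import json
from typing import Any, Dict


def autograding_config(slug: str, points: int = 10, timeout: int = 10) -> Dict[str, Any]:
    """Build a one-test autograding configuration as a plain dict.

    The test simply runs check50 on the given owner/repo/branch/path slug;
    any other test command could be substituted in the "run" field.
    """
    return {
        "tests": [
            {
                "name": f"check50 {slug}",
                "setup": "",
                "run": f"check50 {slug}",
                "input": "",
                "output": "",
                "comparison": "included",
                "timeout": timeout,  # assumed value, for illustration
                "points": points,    # assumed value, for illustration
            }
        ]
    }


if __name__ == "__main__":
    # Print the JSON that would be saved as autograding.json.
    print(json.dumps(autograding_config("owner/repo/branch/path"), indent=2))
```

Generating the file from code rather than writing it by hand makes it easy to stamp out one configuration per assignment from a list of slugs.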
Such an experience might have some starter code, might have some correctness tests, and might also have a narrative alongside it that ultimately lives in a repository indicated by, again, owner/repo/branch/path, inside of which too might be a YAML file for some minimal configuration, the README file itself, and the starter code. And within that YAML file would just be a bunch of key-value pairs that tell our tool how to configure the web environment exactly as that teacher wants for their own students. Also, graphically as well within these most recent web tools, like the IDE and now VS Code, have we wrapped the process of just starting the graphical debugger by a tool called debug50 so that when students compile and then run their code through debug50 does it trigger essentially a launch.json file in VS Code to get dynamically generated and configured so that the debugger is right there up and running without students manually configuring anything themselves. If you're familiar with rubber duck debugging, the process of talking through your logical problems with a rubber duck or an inanimate object in hopes of hearing any illogic, similarly, have we built a VS Code extension that allows students to talk via chat with a virtual rubber duck, giving them hopefully a bit of a prompt for finding their problems. And then underlying all of these tools is not only Docker but now a devcontainer.json file, which allows us to specify a whole suite of configurations for students, like the extensions to preinstall, the Docker image to use, and the settings that can customize the graphical user interface they're ultimately using. Behind the scenes, meanwhile, do we use a suite of REST APIs in order to automate a lot of the processes, especially with regard to authorization. But ultimately, these and more tools are all freely available to students and teachers alike and documented at cs50.readthedocs.io. The courses themselves are all freely available at edx.org/cs50. And indeed, this was CS50.
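As a footnote to the launch.json and devcontainer.json files mentioned in the talk, the following Python sketch shows roughly what generating each might look like. It is not debug50's or CS50's actual code: the function names, the image name `example/course-image:latest`, and the chosen settings are hypothetical, and the debugger type `cppdbg` assumes Microsoft's C/C++ extension for VS Code is installed.

```python
import json
from typing import Any, Dict, List


def devcontainer_config(image: str, extensions: List[str],
                        settings: Dict[str, Any]) -> Dict[str, Any]:
    """Describe the Docker image, preinstalled extensions, and editor settings."""
    return {
        "image": image,
        "customizations": {
            "vscode": {"extensions": extensions, "settings": settings}
        },
    }


def launch_config(program: str, args: List[str]) -> Dict[str, Any]:
    """Describe one debugging session for a compiled C program."""
    return {
        "version": "0.2.0",
        "configurations": [
            {
                "name": f"Debug {program}",
                "type": "cppdbg",  # assumes the C/C++ extension
                "request": "launch",
                "program": program,
                "args": args,
                "cwd": "${workspaceFolder}",
                "MIMode": "gdb",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical image name and settings, for illustration only.
    print(json.dumps(devcontainer_config(
        "example/course-image:latest",
        ["ms-vscode.cpptools"],
        {"files.autoSave": "afterDelay"},
    ), indent=2))
    print(json.dumps(launch_config("${workspaceFolder}/hello", []), indent=2))
```

A wrapper in the spirit of debug50 could write the second dictionary to .vscode/launch.json just before starting the debugger, so that students never edit the file themselves.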