BERNIE LONGBOY: Good afternoon, everyone. Welcome from Cambridge, Massachusetts. Welcome to today's seminar on collaboration and version control with Git. Today we have Tarun Prasad, a junior and computer science major and one of our CS50 teaching fellows. I'm Bernie Longboy, one of the staff members and your moderator for today. Welcome, and thank you, to Tarun. 

TARUN PRASAD: Thanks so much, Bernie, and thank you all for joining. Very good afternoon to all of you. So like Bernie said, the topic that we'll be talking about today is using this command-line tool called Git for version control and collaboration. 

So before we start talking about what Git is or what we can do with it, consider the following two scenarios. The first scenario is that you're working on a CS50 problem set, say Finance. And you're able to get some portions of the P set working perfectly, so things like register, and quote, and buy. But you're confused as to how to complete index and display a table of stocks that the user owns. 

And while attempting to make more progress, you accidentally mess up the working code that you already had. And so now all that you see are internal server errors everywhere, no matter what you try to do. So in hindsight, what are some things that you could have done to prevent this? And feel free to use the chat to send in your answers, and we can talk about some of them. 

What are some things that you could have done to prevent this from happening? How could you have? You could have used some sort of version control, the version control. Yes, use Git. You could have made some backups. Some of you are suggesting using some sort of backup, so that could be a good idea. 

You could have copied over your entire codebase, zipped it into a ZIP file, and stored it on your desktop. And full disclosure-- I've definitely done this in the past. You could absolutely do that, and maybe you would do that once every day. At the end of every day, you just create a ZIP backup of your entire codebase. The main problem with doing something like that is, especially if your codebase is really large-- imagine you have a really large number of images and videos within your codebase-- you would have to make multiple copies of that. And so this can very quickly become very large and take up a lot of space unnecessarily. 

Now, one of you is suggesting that you can take a note of the changes that we made. So that could be another really good idea. Maybe we just open a Google Doc. And maybe every single day, we write down, OK, I added lines 23 to 30 as these specific lines of code. I deleted line 15, and then I modified line 14 by adding a semicolon at the end. And that could be another really useful way of managing this sort of a situation, where we don't actually use up as much space. So thank you for the ideas, and let's go on to scenario two. 

So let's say you decide to team up with a friend for your CS50 final project, and you want to build a web app. The two of you are trying to decide how to split up tasks. You really want to design the home page, using HTML and CSS, including the fonts, the styles, the colors, the text, et cetera. But your friend really wants to implement the sign-up and log-in workflow, so the HTML form, as well as some of the JavaScript, as well as the Flask backend. 

So given that you both want to edit the same files, but in different ways, what are some ways to deal with this? And again, feel free to send messages in the chat, and we can talk through them. 

You can use an IDE that has real-time collaboration. Yeah, that absolutely works. Communication before doing any changes. Some of you are suggesting Git branches, which is a really good idea, which we'll come to later today. Any other thoughts? Git branches? 

Using Replit. Yeah, using some sort of a collaborative code writer. Maybe even in the worst case, Google Docs, maybe copying over your entire file into Google Docs, and manually spacing everything and typing everything until everything looks OK. That might work. 

But I mean, some other ideas might also be using something like-- or just taking turns sitting at the same device. And maybe one person types code first, and then the other person takes over. And again, you can see that many of these have disadvantages. Either these ideas are somewhat slow, or maybe two people work on the two different things simultaneously. And then one person sends over their changes to the other person via email, and the other person then merges them in by going through it line by line and seeing what changed. 

Yeah, so thanks again for the great ideas. A lot of these involve a lot of manual work and can often be slow or time-consuming. And that's where this command-line tool called Git comes in. 

Git is essentially a command-line version control system, which means it automates a lot of the ideas that we just spoke about, storing differences between different versions of your codebase, or allowing collaboration, allowing different people to work on the same files in different locations, and enabling you to merge them together in a very easy and straightforward way. 

So in Git, each project is stored as a repository. And each saved version of the repository is called a commit. A local repository on your device can be associated with other repositories hosted elsewhere, like on GitHub and GitLab. And these repositories hosted on other servers are called remotes. 

So your repository consists of both a history of the commits that you made, so all of the previous versions of your codebase, as well as the latest version. Each commit stores the changes made since the previous commit, and is identified through a commit hash. And so I'm sure most of you have heard of hash functions, and so that's essentially what's used over here. Each commit stores what changes you've made since the previous commit. And then that information is somehow hashed to give you a hexadecimal number. And any time you want to refer to a commit, you can identify it, using this commit hash. 

The normal directory structure on your local system, along with the files it contains, as opposed to previous versions, is called a working copy. The set of tracked changes that will be saved in the next commit is called the staging area. 

And so this is a diagram from another class called CS61. And this gives you an example of what the repository contains. It contains what's called the version repository, which consists of all of the commits that you made previously, so all of the previous versions of your codebase, along with the version that you are working on right now, called the working copy. 

So I know these were a lot of terms just forced onto you, but hopefully, this will be useful in understanding how Git works. And we'll go through the details of how exactly to use Git next. 

A quick note on the installation. If you're using Codespaces, like you do for your CS50 problem sets, then you don't actually have to install anything. Git should come pre-installed. If you do want to develop locally, then you can install it, using this link on Windows. Or on macOS, it should probably be installed, but if it's not, then you can run git --version and it should prompt the installation. These links are in the slides, so you can definitely download the slides from the CS50 website, and you can download Git if you want to. 
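As a quick sketch of that check at the command line (the version number printed will differ from machine to machine):

```shell
# Prints the installed Git version, for example "git version 2.x.y".
git --version
```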

Now let's go on to the Git commands. The first set of commands that are useful to know are the getting-started commands. So how do you get started with a Git repository? There are a couple of ways. And which one you use depends on whether you're working with an existing repository that someone's already published, or if you're creating a new codebase completely from scratch. 

For your final project, you probably will use the latter because you will be starting a new project from scratch. And in such cases, you will use the command, git init, which means just initialize the folder that you're in right now as a Git repository. 

On the other hand, if you want to download an existing remote repository-- for example, something that you found on GitHub for whatever reason-- then you can use git clone. I'll show you a demo of all of this at the end. But the idea is to just copy the URL. You run git clone, followed by the URL, and then it'll download the entire repository and all of the files and directories into a new folder inside the one that you're in right now. 
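A minimal sketch of the two getting-started paths, assuming a scratch directory made with mktemp. The folder name final-project and the placeholder clone URL are made up for illustration and are not from the seminar:

```shell
set -eu
work="$(mktemp -d)"     # hypothetical scratch directory for this sketch
cd "$work"

# Starting from scratch: turn a new folder into a Git repository.
mkdir final-project
cd final-project
git init --quiet

# Git keeps its bookkeeping in a hidden .git directory inside the project.
test -d .git && echo "final-project is now a Git repository"

# Downloading an existing repository would instead look like this
# (placeholder URL, not run here):
#   git clone git@github.com:<username>/<repository>.git
```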

The next set of commands that we're going to go over are for saving changes. And these three commands, in my opinion, are by far the most important commands in today's seminar. So even if you don't get anything else out of this, I want you to take back this one slide. 

The first one is git add, which specifies which files to track. And so it tracks these files by adding them to what we call the staging area. And like we mentioned before, the staging area is what controls the files and changes that will be saved in the next commit that you make. 

So you can specify, for example, git add hello.py, or git add mario.c. And you can list out the files, separated by spaces, and this will track those files by adding them to the staging area. So the next commit that you make will then save the changes that have been specified by git add. So for example, if you change two files, but you only git added one file, then only that one file will be saved in the next commit. And so this is useful for specifying exactly what goes into each commit. 

The second command for saving changes-- and this is, arguably, the most important-- is git commit, which actually creates the commit by saving the tracked changes in the form of a commit. The -m flag specifies a commit message, and so you can say git commit, -m, and then whatever commit message that specifies what changes you made. So for example, this could be something like fix x bug, or implement y feature, or something like that. 

And finally, the last saving-changes command that you need to know is git push. So technically, git push is an optional command, because if you're working entirely locally, if you don't have a remote associated with your Git repository, then you can absolutely just do git add and commit, and everything will be saved in your device. But if, for whatever reason, let's say, you want to collaborate with someone and you want somebody else to have access to your git repository, then you would add a remote, maybe on GitHub or something, and push your commits, your changes, onto the remote repository. And you would do that, using git push. 
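A small sketch of how git add controls what goes into a commit, assuming a throwaway repository and a made-up author identity. It changes two files but stages only one, so only that one is saved. git push is left out because this sketch has no remote:

```shell
set -eu
repo="$(mktemp -d)"     # hypothetical scratch repository
cd "$repo"
git init --quiet
git config user.name "Example Student"        # made-up identity for the sketch
git config user.email "student@example.com"

# Two new files in the working copy.
printf 'print("hello")\n' > hello.py
printf 'int main(void) { return 0; }\n' > mario.c

# Stage only hello.py, then save the staged changes as a commit.
git add hello.py
git commit --quiet -m "Add hello.py"

git ls-files          # prints: hello.py
git status --short    # prints: ?? mario.c  (still untracked)
```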

OK. And if you have any questions, please send them in the chat, and Bernie will stop and let me know if there are any questions. 

Undoing changes. The most important command for undoing changes is git revert, which lets you undo a commit by essentially creating a new commit that does exactly the opposite of what the incorrect commit did. 

So for example, if you decided to make some design changes to your web app that you were building, and so you add a bunch of CSS, maybe you create a new CSS file. And you add a bunch of CSS that specifies the fonts, and the styles, and the colors, and stuff like that. But then you decide to talk about this design with the other people in your group, and then you realize that, OK, maybe this isn't the best way forward for your website. And so you want to undo the changes. 

So one way to undo it would be to manually go back and see what changes you made, and then manually delete those lines and each of the files that you made the changes in. But the advantage of using Git is that, because each of these sets of changes is stored in a commit, you can directly undo all of those changes by just doing exactly the opposite of what that previous commit did. 

Now, how do you specify this commit in the command? So like I said earlier, each commit has associated with it a commit hash, which is basically a big hexadecimal number. And so you can specify this commit by using the commit hash. In fact, you don't even need to specify the entire commit hash. Usually, the first, say, five or six digits will suffice, because that's enough to uniquely identify this commit. 

You can also specify the commit by using the HEAD~n syntax. HEAD represents the latest commit, so HEAD~n would be the commit n commits before it. So let's say you wanted to get rid of the commit that you did five commits ago. Then you would just say HEAD~5. And that's how you undo changes, using Git. 
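A rough sketch of git revert with the HEAD~n syntax, assuming a throwaway repository with three small commits (the file names and messages are invented). It undoes the second commit, which is HEAD~1 at that point, by adding a fourth commit that does the opposite:

```shell
set -eu
repo="$(mktemp -d)"     # hypothetical scratch repository
cd "$repo"
git init --quiet
git config user.name "Example Student"        # made-up identity for the sketch
git config user.email "student@example.com"

printf 'body { color: black; }\n' > styles.css
git add styles.css
git commit --quiet -m "Add base styles"

printf 'h1 { font-family: serif; }\n' >> styles.css
git commit --quiet -am "Try a serif heading"

printf 'See styles.css\n' > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"

# HEAD is "Add notes", so HEAD~1 is "Try a serif heading".
# --no-edit keeps the default revert message instead of opening an editor.
git revert --no-edit HEAD~1

cat styles.css                  # prints: body { color: black; }
git rev-list --count HEAD       # prints: 4
```

The history keeps the original commit and gains a new one that cancels it, which is why this is safe to do on commits that have already been shared.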

And there are some other general, useful commands-- git status, which displays the state of the repository and the staging area, git log, which displays a log of the previous commits in the currently checked-out branch. And we'll talk about branches in a bit, but just think of this as a log of the previous commits that you made. 

And finally, git diff displays the changes that you've made. So for example, if you just do git diff, then it'll show you-- using the appropriate colors, green or red-- what changes that you've made since the last commit. You can also specify the commit hashes of two commits to ask for the changes between two commits. So maybe you want to go back and look at what changes that you made in a specific commit. You just specify git diff, the first commit hash, the second commit hash, and it'll show you whatever changes you made. 
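A short sketch of these three inspection commands, assuming a throwaway repository with two commits and one uncommitted edit (all file contents are invented). It uses HEAD~1 and HEAD in place of two commit hashes:

```shell
set -eu
repo="$(mktemp -d)"     # hypothetical scratch repository
cd "$repo"
git init --quiet
git config user.name "Example Student"        # made-up identity for the sketch
git config user.email "student@example.com"

printf 'print("Hello")\n' > hello.py
git add hello.py
git commit --quiet -m "Say hello"

printf 'print("Hello, world")\n' > hello.py
git commit --quiet -am "Greet the world"

# An edit that has not been committed yet.
printf 'print("Goodbye")\n' >> hello.py

git status --short                    # prints: " M hello.py" (modified, not staged)
git log --oneline                     # two commits, newest first
git --no-pager diff                   # uncommitted changes since the last commit
git --no-pager diff HEAD~1 HEAD       # changes between the two commits
```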

BERNIE LONGBOY: I'll go ahead and read a couple of questions, Tarun. So first is, do I need to download anything on Mac to have Git? Actually, you don't, correct? Is that correct? 

TARUN PRASAD: You don't need to download, but you can double-check that you have it installed, using git --version. So just run this in your command line, and it should probably say something like Git Version 2.25 or something. 

BERNIE LONGBOY: Yeah. 

TARUN PRASAD: And-- 

BERNIE LONGBOY: And I think you answered this one, but does GitHub come with a visual aid to understand how many commits we're using? I'm finding the timeline branches a little confusing. 

TARUN PRASAD: Yeah. I mean, both Git and GitHub do have a list of commits. You can do this locally without even using GitHub, using git log. And so that'll give you a list of the commits that you made, along with a timestamp of when you made the commit, who made the commit, and other information, what changes were actually made in that commit. About branches, we will talk about that in about 5 or 10 minutes, after this demo. So hopefully, that will clear things up a little more. 

BERNIE LONGBOY: OK. Those were-- 

TARUN PRASAD: OK, awesome. 

BERNIE LONGBOY: --good questions they had. 

TARUN PRASAD: Yeah. We will go ahead to a demo right now, so that I can just show you the basics of what we covered. And then we'll come back and talk a little bit more about some of these commands. 

OK. So what I'm going to do here is, I am going to create a new folder, which I will call Git Seminar. This is just CS50 Codespaces, so feel free to pull it up and follow along, if you like. You may have to create this folder in the workspace directory, so you might have to go up one directory to avoid messing up CS50's own Git commands or the Git repositories that they have. But feel free to follow along, if you'd like. 

So I have what I call Git Seminar, so presumably, you could be using this folder for keeping track of your final project, or whatever work that you do for your final project. So let me cd into git-seminar. And let's see. This is currently an empty folder. So what I'm going to do now is, I'm going to create a file called hello.py and open that up. And let's say I do print Hello World. 

So let's say this represents some set of initial changes that I made to my codebase. And now I realize, OK, this might be a good point to stop and save my work, using a commit. So right now, this codebase is just a directory. It doesn't have a Git repository associated with it. So how do I get started? How do I begin by initializing this repository as a Git repository? And feel free to send in what you think in the chat. git init. Yes. 

OK. So it says, initialized empty Git repository, in the folder that I'm in right now. So before I do anything else, let me just run git status, which is one of the final commands that we saw. And this tells me the status of the repository. It says that there are no commits yet. And there is one untracked file called hello.py. There's also nothing that's been added to the commit, but untracked files are, in fact, present. 

So now that I have this information, how do I begin by creating a commit? What do I need to do to actually set up before I can create a commit? And in fact, the git status output itself gives you some advice. You can use git add. Yes. So I'm going to do git add. Now, there are a couple of ways I can do this. I could do git add hello.py to specify or indicate that I want to track this one file called hello.py. 

But very often, you might have made changes across multiple files, and maybe you just want to track everything. And so another thing that you could do is just git add dot, which would add everything in the current folder. Dot just refers to the current directory, the one that you're in right now. And so git add dot just adds everything in the current directory, all the untracked changes, into the staging area. 

Let me run git status again really quick. There's still no commits, but now it is tracking this one file called hello.py. And this is the change that will be committed in the next commit. And so like somebody sent in the chat, the next command is just git commit -m, followed by a commit message. And so let me use "say hi" as the message. OK, and doing this commit then returns some output. These few digits that you see over here are the first few digits of the commit hash. And you can see the commit message over here, the author, and other information about what the changes you made were. 

Let me run git log, just to show you what the output of this looks like. Git log specifies a list of the previous commits that you made. Right now, there exists only one commit, and this is the entire commit hash of that commit that we just made. This is the commit message, say hi. And it also specifies the author, their email, the date, and so on. 

OK. So that hopefully gives you an idea of how to add and commit, but now let me try the very last step of saving changes, which is git push. But if I try to do that, it gives me an error. It says, fatal. No configured push destination. Either specify the URL from the command line, or configure a remote repository, using git remote add. So the reason for this is that, so far, everything that we've done lives entirely on our own device, on our local codebase. So far, we haven't associated this with anything on GitHub at all. 

And so what I'm going to do right now is exactly that. I'm going to show you how to create a GitHub repository and then associate the two things so that whenever you make commits, you can then push them to the remote repository. The main advantage of doing this is that, one, if your own device crashes, like if you're developing locally, but your device crashes-- maybe your hard drive fails-- you will still have a backup of everything stored on GitHub. 

And perhaps more importantly, storing it on GitHub will allow other people, maybe your team members, to look at it. Or maybe if you're working on an open-source project, maybe somebody else looking, Googling something, might end up on your repository and may want to use your code, or download or contribute to it. 

So how do we create a repository? So just go on to GitHub.com, and again, feel free to follow along if you want. You should see a New Repository button on the left sidebar, so just click on that. I can specify a repository name. This can be anything. In my case, I'm just going to call it git-seminar. You can choose whether you want to release it publicly or you want to keep it private. I will choose private in this case. And I'm going to leave all of this unchecked, and I'm going to click Create Repository. 

OK, so now it's created this repository. You can look at the URL. It has my username, followed by git-seminar. And it also provides some setup commands, which tell you how to link the two repositories together. And again, there are a couple of ways of doing this, depending on whether you're creating a new repository or linking an existing one. In our case, we're just pushing an existing repository from the command line because we already have a Git repository set up locally. You would follow these commands if you wanted to start one from scratch, but in our case, we just need to do these three things. 

While you're doing this, make sure you choose the SSH option if you set up with SSH, which most of you on Codespaces probably have. But the first command that you need to put in is the command that adds this GitHub URL. And let me paste that in here. So it adds this remote, which I'm calling origin, and it's linking it to this URL on GitHub, github.com slash my username slash git-seminar.git. So now the two repositories have been linked, and now I can specify the name of the main branch. And again, we'll talk about branches in a bit, so don't worry about this right now. 

And finally, I can do a git push. Now, if I just do git push on its own, it again will complain because GitHub doesn't already know there exists this branch called main. And so the very first time that you push, you'll have to use the set-upstream flag. And so I will copy that in, paste that in, and hit Enter. And that's about it. Once it's done pushing, it'll tell you that a new branch has been created on GitHub. And this branch has been set up to track this remote branch called main from origin. And again, origin is the name of the remote, which is the remote on GitHub. 
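The three linking steps can be sketched roughly as follows. To keep the sketch runnable without a GitHub account, it assumes a local bare repository (remote.git, a made-up name) as a stand-in for the GitHub repository. With GitHub, the path given to git remote add would be the SSH URL shown on the repository page:

```shell
set -eu
work="$(mktemp -d)"     # hypothetical scratch directory
cd "$work"

# Stand-in for the empty repository created on GitHub (assumption).
git init --quiet --bare remote.git

# The local repository from the demo, with one commit.
mkdir git-seminar
cd git-seminar
git init --quiet
git config user.name "Example Student"        # made-up identity for the sketch
git config user.email "student@example.com"
printf 'print("Hello World")\n' > hello.py
git add .
git commit --quiet -m "say hi"

# 1. Add the remote and call it origin.
git remote add origin "$work/remote.git"
# 2. Name the current branch main.
git branch -M main
# 3. Push, and set origin/main as the upstream the first time (-u).
git push --quiet -u origin main

git rev-parse --abbrev-ref "main@{upstream}"   # prints: origin/main
```

After the first push with -u, a plain git push on this branch knows where to send commits.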

So let me go back to the GitHub repository and refresh this page. And now you can see that it looks more familiar, like something that most GitHub repositories usually look like. It gives you a list of whatever folders and files that exist in your repository, along with the commit history. And so clicking on "1 commit," for example, will show you a list of the commits that you made so far. And you can then click on any specific commit to have a visual representation of the changes that you made. And in this case, the only change that I made was to add this one line that says, print Hello World. And now I can go back to Code. I can also see I can open up any file and look at the existing version of that file on GitHub. 

OK. Again, if you had any questions about that process, feel free to send it in the chat, and we can go over it in more detail. 

BERNIE LONGBOY: So Tarun, one of the questions early on was about-- although I believe one of our audience-- difference between GitHub and git clone. I think you might have mentioned that, if you want to just go over that again? 

TARUN PRASAD: Yes. So I guess I'll talk about the difference between Git and GitHub first. Git is a command-line tool that lives entirely locally on your device, or in this case, it's on Codespaces. So Git is just the name of the command-line tool itself. And GitHub is essentially the name of a website or a company which provides some services, which lets you store these Git repositories on their server. 

And so when you do git clone, git clone is a very specific Git command which lets you download an existing GitHub repository, because GitHub repositories don't live on your device unless you created them. So maybe if you wanted to look at the GitHub repository of a classmate, maybe a classmate has been working on a personal project, and maybe you want to check it out. Maybe you want to look at the code. 

So what you can do then is go to the GitHub.com link, the URL, and then use git clone along with this URL. So if you click on the green Code button on GitHub, it'll specify a URL for the repository. And again, it's preferable to use SSH. And you can basically download this, using the command, git clone, so git clone, followed by the URL. 
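A sketch of what git clone produces, assuming a local repository (classmate-project, an invented name) standing in for a classmate's repository on GitHub. With GitHub, the first argument to git clone would be the URL from the Code button instead of a local path:

```shell
set -eu
work="$(mktemp -d)"     # hypothetical scratch directory
cd "$work"

# Stand-in for a classmate's existing repository (assumption).
git init --quiet classmate-project
cd classmate-project
git config user.name "Example Classmate"      # made-up identity for the sketch
git config user.email "classmate@example.com"
printf '# Classmate project\n' > README.md
git add README.md
git commit --quiet -m "Add README"

# Clone it: this creates a new folder holding the files and the full history.
cd "$work"
git clone --quiet classmate-project my-copy

ls my-copy                          # prints: README.md
git -C my-copy log --oneline        # the classmate's commit is there too
```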

BERNIE LONGBOY: OK. Next question is from Daniel. Just to be clear, the command line refers to the Terminal on MacOS and PowerShell on Windows, correct? 

TARUN PRASAD: Yes. On Windows, you can use either PowerShell or Command Prompt. But on Mac, it is the Terminal, yes. 

BERNIE LONGBOY: OK. And Amy asks, so will our files be saved on our computer and GitHub, or only on GitHub? 

TARUN PRASAD: Yes, that's a really good question. And again, that brings up the distinction between using-- 

BERNIE LONGBOY: Local. 

TARUN PRASAD: --git locally and pushing things to GitHub. So as long as you just do git add and git commit and git add and git commit and make changes, it's only going to be saved on your computer. But as soon as you do git push, after you add the remote and the URL and everything, then all of your changes are going to be pushed to GitHub. And then it'll be stored on both the local device, your computer, and on GitHub. 

BERNIE LONGBOY: And then this question came up from Austin, and I believe Anjalee also asked it. Why SSH and not HTTPS? 

TARUN PRASAD: Yeah, also a good question. So far, I think GitHub has been allowing both HTTPS and SSH as possible ways of downloading code and authenticating. So essentially, they're both means of authenticating yourself to allow you to, for example, push to a repository or to download a repository and so on. But recently, GitHub decided for security reasons to stop accepting account passwords for Git operations over HTTPS, so HTTPS now needs a personal access token instead, while SSH uses a key pair that you set up once. So that's the reason we're also pushing towards SSH these days. 

OK. If there are no other questions, we can continue. 

BERNIE LONGBOY: Let's see. What is SSH and HTTPS? 

TARUN PRASAD: Yeah. So SSH and HTTPS-- I might have mentioned this before-- are essentially ways of authenticating yourself. And they specify different protocols for authentication as well as downloading, and pushing forward, and so on. 

BERNIE LONGBOY: And can we just take this last one, and then we'll go into the next part? And then I will come back to the questions again. We could see your first line of code in the first commit. Did you create the file with no code in it and pushed it, or with the included line of code? And that was-- 

TARUN PRASAD: So I-- 

BERNIE LONGBOY: --from Bruno. 

TARUN PRASAD: Yeah. I created the file, and then included the line of code, and then pushed the two things together. And so that was the order I did it in. I created the file, included the line of code, then did the whole git add, git commit, git push. And then that change, the change where I essentially created the file and added the line of code, was represented on GitHub. 

BERNIE LONGBOY: OK, Tarun. 

TARUN PRASAD: OK, awesome. So let's go on to the next portion of today's seminar. And this is the main portion which actually focuses on collaboration. So so far, we've seen that we can save changes, and track changes, and undo commits, revert changes, and so on. And that's all well and good when you're the only person working on your repository. But in most of your projects, you're probably working with a partner or with a team member. And in such cases, it's very useful to know of ways which will help you collaborate together, pushing different code to different files, or even modifying the same lines of code in different ways. And Git allows you to do all of this in a very seamless way. 

The first command that's very useful for this is git pull. So git pull downloads changes and commits that have been pushed by others to the remote, and merges them into your own local repository. And this is very useful when collaborating with others because you won't be the only one pushing code. 

So for example, let's say you're the one who did the git init, you made some changes, you added the remote, you created the GitHub repository, and you pushed all of your changes. But in the meanwhile, let's say that your group member decided to go onto your GitHub URL, your GitHub repository. They decided to do a git clone, so they downloaded all of the files that you pushed, and then they made their own changes. And they also did the whole git add, git commit, git push, and now their changes have also been pushed into the GitHub repository. In such a case, GitHub knows about these changes because they pushed it. But the local version of the code that lives in your device or that lives on Codespaces doesn't know of those changes because all of these changes are only on GitHub. 

But how do you sync your version of the code, your local version of the code, with the changes that have been pushed by others? Well, you use git pull. So what git pull does is it downloads all of these changes and then merges them, adds those commits on top of whatever commits you already made, and then merges them together. So technically, git pull is actually a combination of two other commands called a fetch and a merge. We won't be talking about that in too much detail today, but do feel free to look that up if you want to learn more about this. 
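The scenario above can be sketched end to end, assuming a local bare repository (remote.git) in place of GitHub and two working folders, mine and teammate, whose names and contents are invented for illustration:

```shell
set -eu
work="$(mktemp -d)"     # hypothetical scratch directory
cd "$work"

# Stand-in for the GitHub repository, with main as its default branch.
git init --quiet --bare remote.git
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/main

# You: create the project and push it.
git init --quiet mine
cd mine
git config user.name "You"                    # made-up identity
git config user.email "you@example.com"
printf 'print("Hello")\n' > hello.py
git add hello.py
git commit --quiet -m "Add hello.py"
git branch -M main
git remote add origin "$work/remote.git"
git push --quiet -u origin main

# Teammate: clone, make a change, and push it.
cd "$work"
git clone --quiet "$work/remote.git" teammate
cd teammate
git config user.name "Teammate"               # made-up identity
git config user.email "teammate@example.com"
printf 'print("Hi from your teammate")\n' >> hello.py
git commit --quiet -am "Add a second greeting"
git push --quiet

# You: download and merge the teammate's commit.
cd "$work/mine"
git pull --quiet
# Roughly the same thing in two steps:
#   git fetch origin
#   git merge origin/main

git log --oneline      # now shows both commits
```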

OK, coming to possibly one of the more important topics and very useful topics of today's seminar, branches. And there were some questions earlier about branches as well, so hopefully this will clear things up a little bit. So what exactly are branches? Like in the XKCD comic at the very beginning, it's really pretty simple. Just think of branches as pointers to commits. 

So what does that mean? I think this illustration will help clear things up a little bit. Each of these hexadecimal numbers that you see in these white boxes, each of those represents a commit. And each commit points to the previous one. So maybe the 98ca9 commit was made first, and then the 34ac2 commit, then f30ab, and then 87ab2. But you also see these other red boxes, which point at commits. And these red boxes are essentially what branches are. 

In most cases, you'll have some default branch, usually called something like main. But you can also create other branches, which will be very useful when testing experimental features or when fixing a bug, because you don't want to mess up the main branch. So for example, if you have a working version on your main branch, which other people who are also working on this project are using, then you don't want to push potentially buggy commits onto the main branch. 

What you can do instead is create a separate branch. We will see the commands to do that in a minute. But you can create a separate branch, commit to that branch, and then push those commits to that branch on GitHub. And so the main default branch will be unaffected, and anybody else can also clone the repository and pull code and all of that onto their own main branch without it affecting anything. But then once you're completely done with your feature, once you're sure that everything works in your feature branch, then you can decide to merge it back into the main branch. And in fact, this is a very common workflow, as we will see in a second as well. Any time you work on a big project, especially when you're collaborating with many other people, this is a very common workflow that you'll use. 

Now, what are the actual commands to work with branches? The first one that's very useful is, how do you create a branch. So for example, let's say I want to create a branch called feature. Then I can do that by running the command, git branch, followed by feature or whatever branch name you want to specify. So git branch feature creates that branch and makes the branch point to the commit that you're at right now. So for example, let's say that the latest commit that I made was this f30ab commit. And that's what main is pointing at because that's where the main branch was. But now, if I create a new branch by using git branch feature, then feature will also point to f30ab over here. 

But any further commits that I make will be associated with the main branch and not the feature branch, because git branch doesn't actually change or switch which branch you're in right now. If you do want to switch the branch as well, then you use what's called git checkout, so git checkout with the -b flag, followed by the branch name. -b just specifies that you're creating a new branch. So git checkout, -b, followed by feature, and that'll do the same thing, but also, any further commits that you make will be associated with the feature branch and not the main branch. 
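A small sketch of the difference between the two commands, assuming a throwaway repository with a single commit. The branch name bugfix is made up so that both commands can be shown side by side:

```shell
set -eu
repo="$(mktemp -d)"     # hypothetical scratch repository
cd "$repo"
git init --quiet
git config user.name "Example Student"        # made-up identity for the sketch
git config user.email "student@example.com"
printf 'v1\n' > app.txt
git add app.txt
git commit --quiet -m "Working version"
git branch -M main

# git branch creates the pointer but does not switch to it.
git branch feature
git rev-parse --abbrev-ref HEAD      # prints: main

# git checkout -b creates the pointer and switches to it.
git checkout --quiet -b bugfix
git rev-parse --abbrev-ref HEAD      # prints: bugfix

# All three branches point at the same commit for now.
git rev-parse main feature bugfix    # prints the same hash three times
```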

Once you've created a branch, each time you create a new commit, the pointer will also advance, along with the commits themselves, the pointer of whatever branch that you're in right now. So for example, if I'm in the feature branch and it's pointing to the f30ab commit, then creating a new commit will essentially result in what it looks like right now. It'll create this 87ab2 commit, and then it'll also move the feature branch to point at the latest commit. 

Now, notice that the main branch pointer doesn't actually move, and that's because you're not in the main branch. And so if you go back and look at the main branch, it'll still look exactly like what the repository looked like when you last made the f30ab commit. And so the changes in this last 87ab2 commit will not actually be reflected in the main branch, and that's exactly what we want because we don't want to mess with the working version of the code in the main branch. 
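
The pointer movement described above can be checked in a scratch repository. This is only a sketch: the temporary directory, placeholder identity, and the contents of hello.py are assumptions, and the commit hashes it prints will differ from the f30ab and 87ab2 shown on the slide.

```shell
set -e
repo="$(mktemp -d)"                       # throwaway directory
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo 'print("hello")' > hello.py
git add hello.py
git commit -q -m "Add hello.py"

git checkout -q -b feature
before="$(git rev-parse main)"            # remember where main points

echo '# say hello' >> hello.py
git commit -q -am "Add a comment"         # the commit lands on feature

echo "main:    $(git rev-parse --short main)"      # same commit as before
echo "feature: $(git rev-parse --short feature)"   # one commit ahead of main
git show main:hello.py                    # main's copy of the file has no comment
```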

You can also switch between branches, using just git checkout without the -b flag. And you can just do git checkout, followed by the branch name, and that'll change the branch you're in right now. You can also push a new branch to GitHub, and we saw this earlier with the main branch. But the way you do it the very first time is, you use this -u or this set-upstream flag, which tells GitHub that a new branch is coming in. And so you can use git push, -u, origin space, the branch name. So hopefully that makes sense, but again, we'll see a demo of this in a bit. 
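
A rough sketch of switching branches and of the first push of a new branch. A local bare repository stands in for GitHub here, which is an assumption made so the example runs offline; the paths and placeholder identity are also illustrative. Besides creating the branch on the remote, -u records origin/feature as the branch's upstream, so a later plain git push or git pull knows where to go.

```shell
set -e
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"     # local stand-in for the GitHub repository
git init -q "$work/project"
cd "$work/project"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo 'print("hello")' > hello.py
git add hello.py
git commit -q -m "Add hello.py"
git remote add origin "$work/remote.git"
git push -q -u origin main

git checkout -q -b feature
echo '# say hello' >> hello.py
git commit -q -am "Add a comment"
git push -q -u origin feature             # first push of the new branch

git checkout -q main                      # no -b: main already exists
git rev-parse --abbrev-ref 'feature@{upstream}'   # prints origin/feature
```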

This is a term that you would have heard of very often, any time you look at a GitHub repository, or maybe if you've done an internship in the past, or something like that, pull requests. So pull requests are not a Git feature, which means they're not something that you would work with on the command line. But rather, they're a feature that's specific to GitHub, but very useful in collaborative repositories. And so other remote providers like GitLab or Bitbucket might also have analogous features, but possibly with different names. I think GitLab calls it a merge request. 

So a pull request is essentially a request to merge in the changes that you've pushed to a branch back into the main branch, or any other branch, but usually the main branch. So for example, once you've completed the changes that you want to make in your feature branch, you can push your feature branch, then go into GitHub, and you'll see a Pull Request tab. And over there, you can click on the Create Pull Request button, and then choose which branch you are merging from and which branch you're merging into. And then what that will do is it'll create a request to merge those changes in. 

And so a very common workflow when working in a group is to always commit to a feature branch and never directly to main, and then to push the branch to the remote, open up a request, have somebody else in the team review the pull request, and either approve it or request changes. And then once the changes have been approved, you can then merge it into main, and you can usually do this directly on GitHub itself. And again, we'll see how exactly to do that in a minute. 

BERNIE LONGBOY: Tarun, is this a good break to take-- 

TARUN PRASAD: Yeah. 

BERNIE LONGBOY: --a couple of questions from the audience? 

TARUN PRASAD: Yeah, we can take-- 

BERNIE LONGBOY: OK. 

TARUN PRASAD: --those questions. 

BERNIE LONGBOY: All right. So, our fantastic audience, I did not forget you here. So I am going now to-- let's see. I think, Leo, we answered yours. Bruno. This question comes from Bruno. I'm trying to use Git in VS Code, but it appears you are in a repository managed by CS50. Git is disabled. So that's not really a question, but the more I think, is that problem still going on? 

TARUN PRASAD: Yes. If you face that, one thing that I would suggest is just going up one folder, so maybe using cd .. And then you'll be in your workspaces folder. And then you can essentially create a new folder altogether, so mkdir something else, like mkdir project or something. 

BERNIE LONGBOY: This is-- 

TARUN PRASAD: And once-- 

BERNIE LONGBOY: --something else-- 

TARUN PRASAD: --you do that-- 

BERNIE LONGBOY: --that our students always-- so it's common, common. 

TARUN PRASAD: Yeah. 

BERNIE LONGBOY: OK. Let's see. What else do we have here? Let's see. From Leo, so is commit 87ab2 just a copy of f30ab to create a new branch with feature, or are they actually based on different versions? 

TARUN PRASAD: Yeah, good question. So let me go back to that slide. So when you create a new branch, there are no new commits that are created. So the first time that you create the feature branch, it will still point only at the f30ab, and this 87ab2 commit doesn't exist yet. What happens is that if you're on the feature branch, and you then do a new commit, like you actually make some changes and you actually do git commit, that's when this 87ab2 commit might be created. And then the feature branch moves forward to point at that commit instead of the original commit. 

BERNIE LONGBOY: OK, and this one is just a good one to probably clarify, Tarun. I did it in a direct message, but it's probably good for the entire audience to know. git clone is a part of GitHub. I believe the question was, do we require an account on git clone, like we need a GitHub account, or we are just using it online? So it's really the same account, but I'll let Tarun go ahead and maybe further elaborate on that one. 

TARUN PRASAD: Yeah. If I understood that question correctly, you don't generally need a GitHub account to do a git clone. You should be able to usually just do git clone if it's a public, open-source repository. But you may need a GitHub account if you want to clone a private repository that only you have access to. Maybe if it's a repository that your project partner created and you decided to make it private, then you would need to log in, authenticate using GitHub, and then once you do that, you can then do git clone, whatever URL, and that should download everything. Let me know if I didn't answer the question correctly, and I will try again. 

BERNIE LONGBOY: This question is from Sarah. Actually, she asks, could you please repeat what branches do? 

TARUN PRASAD: Yes, of course. Branches are one of the more confusing features in Git. So branches, again, are essentially pointers to commits, and the point of using branches is to separate different things that you're working on currently. So for example, let's say that you have some existing working version of your entire codebase. So let's say you're working on a web app, and you have some preliminary version ready to go and working on the main branch. And you're afraid that any time you push some commits to that, you might mess it up, things might stop working, and so you don't want to push anything to the main default branch. 

So what you can do to get past this is, essentially, create a new branch and do all of your work in this experimental branch. So think of this as sort of copy-pasting your entire codebase in a completely different directory and just messing with whatever changes that you want to make, whatever bugs you want to fix in the copied, duplicate version. And then, once everything is working, then you go back to your original directory and copy over the changes. That's essentially what you're doing with the branches workflow. 

BERNIE LONGBOY: And just to keep things moving forward, I just want to reiterate that we will have a recording available, so the questions that were answered previously. And let's see. Let's see. Oh, thanks, Anais. She just put in a great resource there. OK. Tarun, do you want to continue? We're in our last-- 

TARUN PRASAD: Yeah. 

BERNIE LONGBOY: --few minutes here. 

TARUN PRASAD: Thank you. We have about 10 minutes left. So I'll talk through this very quickly, because this is something that many people often have trouble with when they're first using Git. 

So usually, when you try to pull code or try to merge in one branch with another, it will work completely fine, especially if the changes are in completely different portions of the codebase. They're completely different files, for example. But occasionally, you'll encounter that you've made some changes to a specific file, but maybe somebody else has also changed the same lines in that file in a different way. So Git in such cases won't know how to automatically pull the code or to automatically merge them. And this results in what's called a merge conflict. 

So this is a code sample from, I believe, CS61. CS61 also has a very useful Git tutorial available online, so you should definitely check that out, if you can. And this is a code sample from there. And you can see that there are two very similar lines over here. This pointers of i equals malloc of i or 1, and pointers i equals malloc of i plus 1. 

So in particular, the version of the code between these left-angle brackets and the equals is one version. And the incoming version is what's between the equals and the right-angle bracket, so essentially between these two lines and between these two lines. And so when you encounter a merge conflict, it will essentially insert these angle brackets and equals lines into whatever files have those merge conflicts. And then it's up to you to manually resolve them. 

And how you do that is actually very simple. You just get rid of these three lines, these three angle-bracket, equals, angle-bracket lines. And then you decide which of the two versions of the code to keep, or in some cases, you might want to merge them together in some more elaborate way. And this is usually a good point to talk to the other person who made the other changes and decide, OK, how exactly should we merge these two changes together, like why did you make those changes, why did I make those changes. Maybe the actual one we want is some sort of a combination of the two. 

And if you're using an editor like VS Code or Codespaces on the browser, then it should also have very convenient graphical UI elements that make this even easier. So it'll show you the changes, using the appropriate colors, green or red, and then you can choose one or the other, depending on which one you want. And once the conflicts have been resolved, just commit your resolved changes, using git commit as usual. 
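
The following sketch reproduces a small conflict and then resolves it by hand. It does not use the CS61 sample from the slide; the one-line hello.py, the branch name feature, and the commit messages are hypothetical, and the choice to keep the incoming version is arbitrary.

```shell
set -e
repo="$(mktemp -d)"                       # throwaway directory
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
printf 'greeting = "hello"\n' > hello.py
git add hello.py
git commit -q -m "Add greeting"

git checkout -q -b feature
printf 'greeting = "hello, world"\n' > hello.py
git commit -q -am "Greet the world"

git checkout -q main
printf 'greeting = "hi"\n' > hello.py
git commit -q -am "Shorten greeting"

# Both branches changed the same line, so the merge stops with a conflict.
git merge feature || echo "merge stopped with a conflict"
cat hello.py                              # shows the <<<<<<<, =======, >>>>>>> marker lines

# Resolve by hand: drop the three marker lines and keep one version
# (here, the incoming one from feature).
printf 'greeting = "hello, world"\n' > hello.py
git add hello.py
git commit -q -m "Merge feature into main"
```

Running git status between the failed merge and the final commit lists hello.py under unmerged paths, which is a quick way to see which files still need attention.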

OK. Let's very quickly go through a quick demo of this workflow as well, just so we know what it looks like. So let's say, hypothetically, I'm going to clear the commands. So let's say that I'm not the person who made the original commits, and maybe I'm somebody else in the same group. So maybe I just did a git clone. Maybe I just downloaded the entire codebase. Maybe I do a git log to see the status right now. I see that it's exactly one commit. 

So let's say that I want to push another commit. So let's, for example, make some changes here. Before I do that, I don't want to push these changes to the main branch. And so what I want to do instead is make the changes to some different branch. So let's say I do git checkout -b, followed by whatever branch name. In my case, I'm going to call the branch name add-comments. So I switched to a new branch called add-comments, and maybe the changes that I want to make involve just adding a comment. So maybe I say, say hello. 

And now let me run git status, just to show you what the status is right now. So now I'm on branch add-comments, not on branch main. And again, I have this modified file, hello.py. I can do a git add. I can do a git commit with some commit message, maybe add a comment. OK. And then, finally, I can do a git push. Now, when I do a git push, again, it's going to give me an error. It says, no, you'll have to do the set-upstream flag, so I will use that flag and push it to GitHub. So let me quickly go into GitHub, and I see here itself, add-comments had recent pushes less than a minute ago. 
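
The collaborator's side of the demo can be approximated offline as follows. A local bare repository plays the role of GitHub, and the seed repository, paths, and placeholder identity are assumptions; only hello.py, the add-comments branch, and the commit message come from the demo itself.

```shell
set -e
work="$(mktemp -d)"

# Stand-in for the shared repository, seeded with a single commit.
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo 'print("hello")' > hello.py
git add hello.py
git commit -q -m "Add hello.py"
git clone -q --bare "$work/seed" "$work/remote.git"

# The collaborator clones, branches, commits, and pushes.
git clone -q "$work/remote.git" "$work/clone"
cd "$work/clone"
git config user.name "Example Teammate"   # placeholder identity
git config user.email "teammate@example.com"
git log --oneline                         # exactly one commit so far
git checkout -q -b add-comments
printf '# say hello\nprint("hello")\n' > hello.py
git status                                # on branch add-comments, hello.py modified
git add hello.py
git commit -q -m "add a comment"
git push || echo "plain git push was refused: the branch has no upstream yet"
git push -q --set-upstream origin add-comments
```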

And on GitHub, it also prompts me to compare and create a pull request. And that is what I want to do in this case, so let me go ahead and click on this. So I can open a pull request, which is basically a request to merge in the changes in this branch called add-comments back into the main branch. And it says that this is able to merge. These branches can be automatically merged, which means there are no merge conflicts. 

And on GitHub, I can specify a title. I can specify any comments, if I have any. And then I can create a pull request. Once I do that-- and this might seem familiar if you've seen your feedback pull requests in problem sets, because those use pull requests as well-- I can see in this pull request a list of the commits that will be merged into the main branch when the merge happens. And I can also see the changes that have been made. In this case, there's only one change, the line highlighted in green, which just adds this comment, saying, say hello. 

Now, hypothetically, I could assign a reviewer for this pull request. So maybe I want my team members to go through this code and make sure that everything looks good before it gets merged back into the main branch. So if somebody else is also in your group, you can specify their GitHub username over here. You can ask them to review it. 

And once everything has been reviewed, once everything looks good, once you've tested everything and it's good to go, you can then click on the Merge Pull Request button. And you can optionally add a comment, and you can confirm merge. So pull request successfully merged and closed. The add-comments branch can be safely deleted. So now you no longer need this pointer because the main branch contains these commits that you added to this add-comments branch. So I can safely delete this branch, and now if I go back into my main branch and click on hello.py, I will see the changes over here. 
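
For comparison, here is a rough command-line counterpart of merging and then deleting the branch. It is a sketch rather than a description of what GitHub runs internally: using --no-ff to force a merge commit is an assumption, and the repository, identity, and merge message are placeholders.

```shell
set -e
repo="$(mktemp -d)"                       # throwaway directory
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo 'print("hello")' > hello.py
git add hello.py
git commit -q -m "Add hello.py"

git checkout -q -b add-comments
printf '# say hello\nprint("hello")\n' > hello.py
git commit -q -am "add a comment"

git checkout -q main
git merge -q --no-ff -m "Merge branch 'add-comments'" add-comments
git branch -d add-comments                # -d refuses to delete unmerged work
cat hello.py                              # main now contains the comment
```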

Now, of course, you don't have to go through this entire process of creating branches and pull requests and so on. You could technically just directly push everything to the main branch. But the reason we encourage doing this is just to ensure that everybody on the team knows exactly what's going on. And you also have this whole process of code review, where different people look at the code and make sure that everything is good to go before you merge it into the main branch, which contains the working version. 

BERNIE LONGBOY: OK, and Tarun, we're in our last four minutes here, so if we have other questions or comments. Here's one from Jim. If a merge is done to the main branch and a bug is discovered later, what is the best way to handle the bug fix? 

TARUN PRASAD: Yeah, that's a really good question. So there are a few ways of handling bug fixes. If it's something that you want to undo, like maybe it's one very specific buggy commit that you want to undo, you can then use git revert, like we saw earlier. You can do git revert, followed by the commit hash of the buggy commit. And then you can then push those changes back into the main branch. 
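
A minimal sketch of reverting a single buggy commit. The files, the deliberately broken line, and the commit messages are hypothetical; in a shared repository the final step would be a git push of the new revert commit, which is omitted here because the sketch has no remote.

```shell
set -e
repo="$(mktemp -d)"                       # throwaway directory
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"       # placeholder identity
git config user.email "user@example.com"
echo 'print("hello")' > hello.py
git add hello.py
git commit -q -m "Add hello.py"

echo 'print(1 / 0)' >> hello.py           # hypothetical bug
git commit -q -am "Add buggy line"
bad="$(git rev-parse HEAD)"               # hash of the buggy commit

echo 'notes' > notes.txt                  # unrelated later work
git add notes.txt
git commit -q -m "Add notes"

# Adds a new commit that undoes $bad; the history itself is kept intact.
git revert --no-edit "$bad"
cat hello.py
```

Because revert adds a commit instead of rewriting history, it is generally safe to use on a branch that other people have already pulled.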

If it's more detailed than that, maybe it's more specific. Maybe a bunch of commits are all buggy, or maybe you want to completely get rid of the history of the existence of that commit itself. There are ways of doing that as well, which we don't quite have time to go through now, but I'll just send over a couple of commands that you can look at, that you can definitely look up, and you can learn more about exactly how to work with these. Two of them that'll be very useful include git reset and git rebase -i, which is something called an interactive rebase. And yeah, those should help you learn more about this. 

BERNIE LONGBOY: What are your thoughts on GitHub Desktop App? 

TARUN PRASAD: That's also a good question. Lots of people find it very useful, especially when you first start using or learning Git. I personally wouldn't recommend it, just because I think it's very useful to learn the commands themselves, because that's what lets you understand, or it gives you the full power of what Git can do. But having said that, you can definitely download GitHub Desktop and use that for making a lot of this easier, because then you don't have to remember the commands. You can just click on the Add button and the Commit button, and that definitely makes things easier. 

BERNIE LONGBOY: OK, last two. One's a comment, and a question. Is it considered best practice to-- oops. Is it considered best practice to make our comments in code and pull/merge requests in present tense, rather than past tense, e.g. add a comment versus added a comment? 

TARUN PRASAD: Yeah, also a really good question, and different people have very different conventions for this. I personally use present tense, like add a comment, because one way of thinking about commits is, rather than what you did in the past, you can think of a commit as what are the changes that will be made when you apply this commit. So when I apply this commit, it adds this comment, and so you can write the message in present tense. But of course, different people have very different opinions on this, and there's no one right answer. 

BERNIE LONGBOY: Thanks, Tarun. 

TARUN PRASAD: Yeah. Thank you all for coming.