# Remote Version Control

This notebook is based on the [Software Carpentry](https://software-carpentry.org/) lesson on [Git](http://swcarpentry.github.io/git-novice), with a few additions inspired by chapter 16 of the book "Effective Computation in Physics" by Anthony Scopatz and Kathryn D. Huff.

## Repository Hosting Services

There are several websites for hosting git source code repositories online.
For open-source projects, where anyone has read access to the repository and can browse, clone or 'fork' it, hosting is usually free.

Private repositories, larger teams, and continuous integration (running automated tests) are often considered premium features. GitHub and Bitbucket, at least, offered free education/academic packages at the time of writing (May 2017).

* GitHub (<https://github.com/>)
    * To register for a free account: <https://github.com/signup/free>
    * Student accounts include free private repositories:  
      <https://education.github.com/discount_requests/new>
    * Student Developer Pack: <https://education.github.com/pack>
* Bitbucket (<https://bitbucket.org>)
    * To register for a free account: <https://bitbucket.org/account/signup/>
    * Unlimited academic plan when signing up with an academic email address.
* GitLab.com (<https://about.gitlab.com/gitlab-com/>)
* SourceForge (<https://sourceforge.net>)

These services generally provide the following features (details may vary):

* Code/repository browser with syntax highlighting
* Issue Tracker/Ticketing system
* Wiki and/or (static) webpage support
* User downloads
* Landing page support
* Some sort of access control/permissions

In this course we will be using GitHub, but everything should work analogously on the other sites.

## Creating a Remote Repository on GitHub

The first step is, of course, to sign up for an account on GitHub (or one of the other repository hosting services).

**After logging in** on the website, we need to create a new repository:

![github-create-repo-01](http://swcarpentry.github.io/git-novice/fig/github-create-repo-01.png)


**We name** our new repository "planets" and click on "Create Repository":

![github-create-repo-02](http://swcarpentry.github.io/git-novice/fig/github-create-repo-02.png)

**As soon** as the repository is created, GitHub displays a page with a URL and some information on how to configure your local repository:

![github-create-repo-03](http://swcarpentry.github.io/git-novice/fig/github-create-repo-03.png)

**This effectively** does the following on GitHub’s servers:

```
$ mkdir planets
$ cd planets
$ git init --bare
```

Our local repository still contains our earlier work on mars.txt, but the remote repository on GitHub doesn’t contain any files yet:

![git-freshly-made-github-repo](http://swcarpentry.github.io/git-novice/fig/git-freshly-made-github-repo.svg)
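The same kind of empty remote can be created on your own machine, which is handy for experimenting without touching GitHub. The following is a minimal sketch, assuming a temporary directory is an acceptable place to play; the `sandbox` variable and the `planets.git` directory name are made up for this illustration.

```shell
# Hypothetical local stand-in for GitHub: a bare repository in a temporary directory.
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/planets.git"

# A bare repository has no working tree, only the version-control data.
git --git-dir="$sandbox/planets.git" rev-parse --is-bare-repository   # prints "true"

# No branches exist yet, because nothing has been pushed.
git --git-dir="$sandbox/planets.git" branch --list
```

A bare repository is meant to be pushed to and fetched from, not edited in directly, which is why hosting services use this form.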

## Declaring a Remote (git remote)

The next step is to connect the two repositories. We do this by making the GitHub repository a remote for the local repository. The home page of the repository on GitHub includes the string we need to identify it:

![Where to Find Repository URL on GitHub](http://swcarpentry.github.io/git-novice/fig/github-find-repo-string.png)

Click on the ‘HTTPS’ link to change the protocol from SSH to HTTPS.

#### HTTPS vs. SSH
> We use HTTPS here because it does not require additional configuration. 
> Later you may want to set up SSH access, which is a bit more secure, 
> by following one of the great tutorials from 
> [GitHub](https://help.github.com/articles/generating-ssh-keys), 
> [Atlassian/BitBucket](https://confluence.atlassian.com/display/BITBUCKET/Set+up+SSH+for+Git) 
> and [GitLab](https://about.gitlab.com/2014/03/04/add-ssh-key-screencast/)
> (this one has a screencast).

![Changing the Repository URL on GitHub](http://swcarpentry.github.io/git-novice/fig/github-change-repo-string.png)

**Copy that URL** from the browser, go into the local planets repository, and run this command:

```shell
$ git remote add origin https://github.com/vlad/planets.git
```

Make sure to use the URL for your repository rather than Vlad’s: the only difference should be your username instead of vlad.

We can check that the command has worked by running `git remote -v` :

```shell
$ git remote -v
```
```
origin   https://github.com/vlad/planets.git (push)
origin   https://github.com/vlad/planets.git (fetch)
```

The name origin is a local nickname for your remote repository. We could use something else if we wanted to, but origin is by far the most common choice.
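To see that the nickname really is just a local label, here is a small sketch that registers a remote, lists it, and renames it back and forth. It assumes a throwaway bare repository on the local disk in place of GitHub; the `sandbox` path and the nickname `github` are hypothetical.

```shell
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"      # hypothetical stand-in for GitHub
git init --quiet "$sandbox/planets"                # stand-in for the local repository
cd "$sandbox/planets"

# Register the remote under the nickname "origin" and list it.
git remote add origin "$sandbox/remote.git"
git remote -v

# The nickname is only a local label, so it can be changed at any time.
git remote rename origin github
git remote -v

# Switch back to the conventional name.
git remote rename github origin
```

Renaming a remote changes nothing on the server; only the configuration of the local repository is updated.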

## Sending Commits to Remote Repositories (git push)

Once the nickname origin is set up, this command will push the changes from our local repository to the repository on GitHub:

```shell
$ git push origin master
```
```
Counting objects: 9, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (9/9), 821 bytes, done.
Total 9 (delta 2), reused 0 (delta 0)
To https://github.com/vlad/planets
 * [new branch]      master -> master
Branch master set up to track remote branch master from origin.
```

#### Password Managers
> If your operating system has a password manager configured, `git push` will try to use 
> it when it needs your username and password. For example, this is the default behavior
> for Git Bash on Windows. If you want to type your username and password at the terminal
> instead of using a password manager, type:
>
> ```shell
> $ unset SSH_ASKPASS
> ```
> in the terminal, before you run `git push`. Despite the name, git uses `SSH_ASKPASS` for
> all credential entry, so you may want to unset `SSH_ASKPASS` whether you are using git
> via SSH or https.
>
> You may also want to add `unset SSH_ASKPASS` at the end of your `~/.bashrc` to make git
> default to using the terminal for usernames and passwords.

Our local and remote repositories are now in this state:

![GitHub Repository After First Push](http://swcarpentry.github.io/git-novice/fig/github-repo-after-first-push.svg)

#### The ‘-u’ Flag
> You may see a `-u` option used with `git push` in some documentation. This option has the
> same effect as the `--set-upstream-to` option of the `git branch` command: it associates
> the current branch with a remote branch so that `git pull` can be used without any
> arguments. To do this, simply run `git push -u origin master` once the remote has been
> set up.
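A minimal sketch of what `-u` changes, assuming a local bare repository in place of GitHub. The `sandbox` path, the file contents and the name and e-mail address are placeholders chosen for this example.

```shell
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"      # hypothetical stand-in for GitHub
git init --quiet "$sandbox/planets"
cd "$sandbox/planets"
git symbolic-ref HEAD refs/heads/master            # use "master" whatever the local default is
git config user.name "Vlad"                        # placeholder identity for the demo commit
git config user.email "vlad@example.org"
echo "Cold and dry" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git remote add origin "$sandbox/remote.git"

# Push and record origin/master as the upstream of the local master branch.
git push --quiet -u origin master

# The upstream is now stored in the branch configuration ...
git rev-parse --abbrev-ref "master@{upstream}"     # prints "origin/master"

# ... so pull and push work without naming the remote and the branch.
git pull --quiet
git push --quiet
```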

We can pull changes from the remote repository to the local one as well:

```
$ git pull origin master
From https://github.com/vlad/planets
 * branch            master     -> FETCH_HEAD
Already up-to-date.
```

Pulling has no effect in this case because the two repositories are already synchronized. If someone else had pushed some changes to the repository on GitHub, though, this command would download them to our local repository.

## Exercises:

These exercises are taken from <http://swcarpentry.github.io/git-novice/07-github/>.
You will find the answers at the bottom of that page.

### GitHub GUI
Browse to your `planets` repository on GitHub. Under the Code tab, find and click on the text that says “XX commits” (where “XX” is some number). Hover over, and click on, the three buttons to the right of each commit. What information can you gather/explore from these buttons? How would you get that same information in the shell?

### GitHub Timestamp
Create a remote repository on GitHub. Push the contents of your local repository to the remote. Make changes to your local repository and push these changes. Go to the repo you just created on GitHub and check the timestamps of the files. How does GitHub record times, and why?

### Push vs. Commit
In this lesson, we introduced the “git push” command. How is “git push” different from “git commit”?


### Fixing Remote Settings
In practice, it happens quite often that you make a typo in the remote URL. This exercise is about how to fix this kind of issue. Start by adding a remote with an invalid URL:

`git remote add broken https://github.com/this/url/is/invalid`

Do you get an error when adding the remote? Can you think of a command that would make it obvious that your remote URL was not valid? Can you figure out how to fix the URL (tip: use `git remote -h`)? Don’t forget to clean up and remove this remote once you are done with this exercise.




# Collaborating

## Adding Collaborators to your Project

For the next step, get into pairs. One person will be the “Owner” and the other will be the “Collaborator”. The goal is for the Collaborator to add changes to the Owner’s repository. We will switch roles at the end, so both people will play Owner and Collaborator.

### Practicing By Yourself
> If you’re working through this lesson on your own, you can carry on by opening a second
> terminal window. This window will represent your partner, working on another computer.
> You won’t need to give anyone access on GitHub, because both ‘partners’ are you.

The Owner needs to give the Collaborator access. On GitHub, click the settings button on the right, then select Collaborators, and enter your partner’s username.


![Adding Collaborators on GitHub](http://swcarpentry.github.io/git-novice/fig/github-add-collaborators.png)

To accept access to the Owner’s repo, the Collaborator needs to go to https://github.com/notifications. Once there she can accept access to the Owner’s repo.

## Downloading a Remote Repository (git clone)

Next, the Collaborator needs to download a copy of the Owner’s repository to her machine. This is called “cloning a repo”. To clone the Owner’s repo into her Desktop folder, the Collaborator enters:

```
$ git clone https://github.com/vlad/planets.git ~/Desktop/vlad-planets
```

Replace ‘vlad’ with the Owner’s username.

![After Creating Clone of Repository](http://swcarpentry.github.io/git-novice/fig/github-collaboration.svg)

The Collaborator can now make a change in her clone of the Owner’s repository, exactly the same way as we’ve been doing before:

```
$ cd ~/Desktop/vlad-planets
$ nano pluto.txt
$ cat pluto.txt
It is so a planet!
$ git add pluto.txt
$ git commit -m "Add notes about Pluto"
 1 file changed, 1 insertion(+)
 create mode 100644 pluto.txt
```

Then push the change to the *Owner’s repository* on GitHub:

```
$ git push origin master
Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 306 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/vlad/planets.git
   9272da5..29aba7c  master -> master
```

Note that we didn’t have to create a remote called `origin`: Git uses this name by default when we clone a repository. (This is why `origin` was a sensible choice earlier when we were setting up remotes by hand.)
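This default is easy to check. The sketch below clones a local bare repository, used here as a hypothetical stand-in for the Owner's repository on GitHub, and lists the remotes of the clone. Because the stand-in is empty, Git may warn that an empty repository was cloned, which is harmless here.

```shell
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"      # hypothetical stand-in for the Owner's repository

# Cloning creates the target directory and registers the source as "origin".
git clone --quiet "$sandbox/remote.git" "$sandbox/vlad-planets"

# No "git remote add" was needed: the remote is already configured.
git -C "$sandbox/vlad-planets" remote -v
```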

Take a look at the Owner’s repository on GitHub now (you may need to refresh your browser). You should be able to see the new commit made by the Collaborator.

To download the Collaborator’s changes from GitHub, the Owner now enters:

```
$ git pull origin master
remote: Counting objects: 4, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 3 (delta 0)
Unpacking objects: 100% (3/3), done.
From https://github.com/vlad/planets
 * branch            master     -> FETCH_HEAD
Updating 9272da5..29aba7c
Fast-forward
 pluto.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 pluto.txt
```

Now the three repositories (Owner’s local, Collaborator’s local, and Owner’s on GitHub) are back in sync.

#### A Basic Collaborative Workflow
> In practice, it is good to be sure that you have an updated version of the repository 
> you are collaborating on, so you should `git pull` before making your changes. The basic
> collaborative workflow would be:
> 
> * update your local repo with **`git pull origin master`**,
> * make your changes and stage them with **`git add`**,
> * commit your changes with **`git commit -m`**, and
> * upload the changes to GitHub with **`git push origin master`**
> 
> It is better to make many commits with smaller changes rather than one commit 
> with massive changes: small commits are easier to read and review.
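The four steps of this workflow can be sketched as one short script. It assumes a local bare repository in place of GitHub; the `sandbox` path, the file contents and the identity are placeholders for the demonstration.

```shell
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"      # hypothetical stand-in for GitHub
git init --quiet "$sandbox/planets"
cd "$sandbox/planets"
git symbolic-ref HEAD refs/heads/master            # use "master" whatever the local default is
git config user.name "Vlad"                        # placeholder identity for the demo commits
git config user.email "vlad@example.org"
git remote add origin "$sandbox/remote.git"

# Seed the remote with a first commit so that there is something to pull.
echo "Cold and dry" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git push --quiet origin master

# One pass through the workflow:
git pull --quiet origin master                     # 1. update the local repository
echo "It is so a planet!" > pluto.txt              # 2. make a change ...
git add pluto.txt                                  #    ... and stage it
git commit --quiet -m "Add notes about Pluto"      # 3. commit it
git push --quiet origin master                     # 4. upload it
```

Pulling first keeps the later push a simple fast-forward on the remote in the common case where nobody else pushed in between.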


## Fetching the Contents of a Remote Repository (git fetch)

Above we used the command `git pull` to download commits from a remote repository and integrate the changes into our working directory. Sometimes we only want to download changes from a remote (e.g. a new branch) without integrating them into our branch right away.

This can be achieved with the command `git fetch`:

```
$ git fetch origin
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 5 (delta 4), reused 5 (delta 4), pack-reused 0
Unpacking objects: 100% (5/5), done.
From github.com:vlad/planets
   8efa530..35954ef  master   -> origin/master
```

In this example there were commits on the `master` branch that we didn't have on our local repository. We therefore need to merge them into our `master` branch before we can push our changes back to the remote.
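Because fetching leaves our own branch untouched, we can inspect what arrived before deciding to merge. Below is a minimal sketch with a local bare repository in place of GitHub and two working copies; the directory names `ours` and `theirs`, the file contents and the identities are all invented for the example.

```shell
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"      # hypothetical stand-in for GitHub
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/master

# Our repository, with one commit pushed to the remote.
git init --quiet "$sandbox/ours"
cd "$sandbox/ours"
git symbolic-ref HEAD refs/heads/master            # use "master" whatever the local default is
git config user.name "Vlad"                        # placeholder identity
git config user.email "vlad@example.org"
echo "Cold and dry" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git remote add origin "$sandbox/remote.git"
git push --quiet origin master

# A collaborator clones the repository, adds saturn.txt, and pushes.
git clone --quiet "$sandbox/remote.git" "$sandbox/theirs"
cd "$sandbox/theirs"
git config user.name "Wolfman"                     # placeholder identity
git config user.email "wolfman@example.org"
echo "Rings, lots of rings" > saturn.txt
git add saturn.txt
git commit --quiet -m "Add notes about Saturn"
git push --quiet origin master

# Back in our repository: download the new commit without touching our branch.
cd "$sandbox/ours"
git fetch --quiet origin
git log --oneline master..origin/master            # commits we do not have yet
git diff --stat master origin/master               # files those commits change
ls                                                 # saturn.txt is not in our working directory yet
```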

## Merging the Contents of a Remote (git merge)

To incorporate the changes from the remote's `master` branch into ours, we need to merge them, like we did with the `randomThoughts` branch in chapter 15.

More precisely, `git fetch` has updated the remote-tracking branch `origin/master`, which records the state of `master` on the remote as of the last fetch.

To merge the changes into our current branch we use `git merge` :

```
$ git merge origin/master
Updating 8efa530..35954ef
Fast-forward
 saturn.txt | 2 ++
 1 file changed, 2 insertions(+)
 create mode 100644 saturn.txt
```

It looks like one of our collaborators has been collecting data on Saturn.

## Pull = Fetch and Merge (git pull)

As a `git fetch` followed by `git merge` is such a useful combination, the git developers have combined these into the command `git pull`.

Under the hood, the command `git pull <REMOTE> <BRANCH>` first runs `git fetch <REMOTE> <BRANCH>` and then merges the downloaded commits, which Git records as `FETCH_HEAD`, into the current branch with `git merge FETCH_HEAD`.
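The equivalence can be checked with a small experiment: two copies of the same repository are updated, one with `git pull` and one with `git fetch` followed by `git merge`, and both should end up on the same commit. This is a sketch under the assumption that a local bare repository stands in for GitHub; the directory names `author`, `reader-a` and `reader-b` and the identity are hypothetical.

```shell
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"      # hypothetical stand-in for GitHub
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/master

# An author pushes a first commit.
git init --quiet "$sandbox/author"
cd "$sandbox/author"
git symbolic-ref HEAD refs/heads/master            # use "master" whatever the local default is
git config user.name "Vlad"                        # placeholder identity
git config user.email "vlad@example.org"
echo "Cold and dry" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git remote add origin "$sandbox/remote.git"
git push --quiet origin master

# Two readers clone the repository at this point.
git clone --quiet "$sandbox/remote.git" "$sandbox/reader-a"
git clone --quiet "$sandbox/remote.git" "$sandbox/reader-b"

# The author pushes a second commit.
echo "Two moons" >> mars.txt
git commit --quiet -am "Mention the moons"
git push --quiet origin master

# Reader A updates in one step ...
git -C "$sandbox/reader-a" pull --quiet origin master

# ... reader B does the same thing in two steps.
git -C "$sandbox/reader-b" fetch --quiet origin master
git -C "$sandbox/reader-b" merge --quiet FETCH_HEAD

# Both readers now point at the same commit as the author.
git -C "$sandbox/reader-a" rev-parse master
git -C "$sandbox/reader-b" rev-parse master
```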

## Conflicts

As soon as people can work in parallel, it’s likely someone’s going to step on someone else’s toes. This will even happen with a single person: if we are working on a piece of software on both our laptop and a server in the lab, we could make different changes to each copy. Version control helps us manage these conflicts by giving us tools to resolve overlapping changes.

To see how we can resolve conflicts, we must first create one. The file `mars.txt` currently looks like this in both partners’ copies of our `planets` repository:

```
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
```

Let’s add a line to one partner’s copy only:

```
$ nano mars.txt
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
This line added to Wolfman's copy
```

and then push the change to GitHub:

```
$ git add mars.txt
$ git commit -m "Add a line in our home copy"
[master dabb4c8] Add a line in our home copy
 1 file changed, 1 insertion(+)
$ git push origin master
Counting objects: 5, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 352 bytes, done.
Total 3 (delta 1), reused 0 (delta 0)
To https://github.com/vlad/planets
   29aba7c..dabb4c8  master -> master
```

Now let’s have the other partner make a different change to their copy _without_ updating from GitHub:

```
$ nano mars.txt
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
We added a different line in the other copy
```

We can commit the change locally:

```
$ git add mars.txt
$ git commit -m "Add a line in my copy"
[master 07ebc69] Add a line in my copy
 1 file changed, 1 insertion(+)
```

but Git won’t let us push it to GitHub:

```
$ git push origin master
To https://github.com/vlad/planets.git
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to 'https://github.com/vlad/planets.git'
hint: Updates were rejected because the tip of your current branch is behind
hint: its remote counterpart. Merge the remote changes (e.g. 'git pull')
hint: before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
```

![The Conflicting Changes](http://swcarpentry.github.io/git-novice/fig/conflict.svg)

Git rejects the push because the remote repository contains a commit that has not yet been incorporated into our local branch, which stops us from trampling on our previous work. What we have to do is pull the changes from GitHub, merge them into the copy we’re currently working in, and then push that. Let’s start by pulling:

```
$ git pull origin master
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 1), reused 3 (delta 1)
Unpacking objects: 100% (3/3), done.
From https://github.com/vlad/planets
 * branch            master     -> FETCH_HEAD
Auto-merging mars.txt
CONFLICT (content): Merge conflict in mars.txt
Automatic merge failed; fix conflicts and then commit the result.
```
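The whole situation can be reproduced on one machine, which is useful for practicing. The sketch below assumes a local bare repository in place of GitHub and two clones named `wolfman` and `mummy` (names and identities invented for the example). It passes `--no-rebase` to `git pull` because recent Git versions refuse to pull diverged branches until they are told whether to merge or rebase.

```shell
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"      # hypothetical stand-in for GitHub
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/master

# First partner: create mars.txt and push it.
git init --quiet "$sandbox/wolfman"
cd "$sandbox/wolfman"
git symbolic-ref HEAD refs/heads/master            # use "master" whatever the local default is
git config user.name "Wolfman"                     # placeholder identity
git config user.email "wolfman@example.org"
echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git remote add origin "$sandbox/remote.git"
git push --quiet origin master

# Second partner: clone the repository in that state.
git clone --quiet "$sandbox/remote.git" "$sandbox/mummy"
git -C "$sandbox/mummy" config user.name "Mummy"   # placeholder identity
git -C "$sandbox/mummy" config user.email "mummy@example.org"

# First partner appends a line and pushes.
echo "This line added to Wolfman's copy" >> mars.txt
git commit --quiet -am "Add a line in our home copy"
git push --quiet origin master

# Second partner appends a different line without pulling first.
cd "$sandbox/mummy"
echo "We added a different line in the other copy" >> mars.txt
git commit --quiet -am "Add a line in my copy"

# The push is rejected because the remote has a commit we lack ...
git push --quiet origin master || echo "push rejected"

# ... and the pull stops with a conflict (non-zero exit status).
git pull --no-rebase origin master || echo "merge stopped with a conflict"

# List the files that still need to be resolved.
git diff --name-only --diff-filter=U               # prints "mars.txt"
```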

## Resolving Conflicts

`git pull` tells us there’s a conflict, and marks that conflict in the affected file:

```shell
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
<<<<<<< HEAD
We added a different line in the other copy
=======
This line added to Wolfman's copy
>>>>>>> dabb4c8c450e8475aee9b14b4383acc99f42af1d
```

Our change -- the one in `HEAD` -- is preceded by `<<<<<<<`. Git has then inserted `=======` as a separator between the conflicting changes and marked the end of the content downloaded from GitHub with `>>>>>>>`. (The string of letters and digits after that marker identifies the commit we’ve just downloaded.)

It is now up to us to edit this file to remove these markers and reconcile the changes. We can do anything we want: keep the change made in the local repository, keep the change made in the remote repository, write something new to replace both, or get rid of the change entirely. Let’s replace both so that the file looks like this:

```shell
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
We removed the conflict on this line
```

To finish merging, we add `mars.txt` to the changes being made by the merge and then commit:

```shell
$ git add mars.txt
$ git status
On branch master
All conflicts fixed but you are still merging.
  (use "git commit" to conclude merge)

Changes to be committed:

	modified:   mars.txt

$ git commit -m "Merge changes from GitHub"
[master 2abf2b1] Merge changes from GitHub
```

Now we can push our changes to GitHub:

```shell
$ git push origin master
Counting objects: 10, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 697 bytes, done.
Total 6 (delta 2), reused 0 (delta 0)
To https://github.com/vlad/planets.git
   dabb4c8..2abf2b1  master -> master
```
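When one side should simply win, the editing step can be delegated to Git: during a merge, `git checkout --ours <file>` restores our version of a conflicted file and `git checkout --theirs <file>` takes the downloaded one. The following sketch keeps the remote change. As before, it assumes a local bare repository in place of GitHub, and the clone names, identities and `--no-rebase` choice are assumptions made for the example rather than part of the lesson.

```shell
sandbox=$(mktemp -d)
git init --quiet --bare "$sandbox/remote.git"      # hypothetical stand-in for GitHub
git --git-dir="$sandbox/remote.git" symbolic-ref HEAD refs/heads/master

# First partner pushes mars.txt.
git init --quiet "$sandbox/wolfman"
cd "$sandbox/wolfman"
git symbolic-ref HEAD refs/heads/master            # use "master" whatever the local default is
git config user.name "Wolfman"                     # placeholder identity
git config user.email "wolfman@example.org"
echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars"
git remote add origin "$sandbox/remote.git"
git push --quiet origin master

# Second partner clones, then both partners append different lines.
git clone --quiet "$sandbox/remote.git" "$sandbox/mummy"
echo "This line added to Wolfman's copy" >> mars.txt
git commit --quiet -am "Add a line in our home copy"
git push --quiet origin master

cd "$sandbox/mummy"
git config user.name "Mummy"                       # placeholder identity
git config user.email "mummy@example.org"
echo "We added a different line in the other copy" >> mars.txt
git commit --quiet -am "Add a line in my copy"

# Pulling stops with a conflict in mars.txt.
git pull --no-rebase origin master || echo "merge stopped with a conflict"

# Keep the version downloaded from the remote and discard our line.
git checkout --theirs mars.txt
git add mars.txt
git commit --quiet -m "Merge changes from GitHub"

# The merge commit contains both histories, so the push is accepted now.
git push --quiet origin master
cat mars.txt
```

Taking one side wholesale discards the other change, so this shortcut only fits cases where that is really intended; otherwise edit the file by hand as described above.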

Git keeps track of what we’ve merged with what, so we don’t have to fix things by hand again when the collaborator who made the first change pulls again:

```shell
$ git pull origin master
remote: Counting objects: 10, done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 2), reused 6 (delta 2)
Unpacking objects: 100% (6/6), done.
From https://github.com/vlad/planets
 * branch            master     -> FETCH_HEAD
Updating dabb4c8..2abf2b1
Fast-forward
 mars.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
```

We get the merged file:

```shell
$ cat mars.txt
Cold and dry, but everything is my favorite color
The two moons may be a problem for Wolfman
But the Mummy will appreciate the lack of humidity
We removed the conflict on this line
```

We don’t need to merge again because Git knows someone has already done that.

Git’s ability to resolve conflicts is very useful, but conflict resolution costs time and effort, and can introduce errors if conflicts are not resolved correctly. If you find yourself resolving a lot of conflicts in a project, consider one of these approaches to reducing them:

* Try breaking large files apart into smaller files so that it is less likely that two authors will be working in the same file at the same time
* Clarify who is responsible for what areas with your collaborators
* Discuss what order tasks should be carried out in with your collaborators so that tasks that will change the same file won’t be worked on at the same time

[![xkcd comic: git](https://imgs.xkcd.com/comics/git.png)](https://xkcd.com/1597/)
If that doesn't fix it, git.txt contains the phone number of a friend of mine who understands git. Just wait through a few minutes of 'It's really pretty simple, just think of branches as...' and eventually you'll learn the commands that will fix everything.

Source: "xkcd" by Randall Munroe, 2015-10-30

