Development using git

Git for beginners

For a summary of the basic git commands you can use this git-cheat-sheet.pdf.

Another starting point for working with git is the documentation at Atlassian, the company behind Bitbucket. If you prefer to learn interactively, you might want to try out this interactive git site.

Getting a remote repository to your computer

The following will create the directory twoears on your local computer including all the code from the Two!Ears main repository:

$ git clone https://github.com/TWOEARS/TwoEars.git twoears

Note

If you are a member of Two!Ears you should always clone from the internal repositories and mirror your changes to github. This means cloning the main repository with the following command:

$ git clone https://dev.qu.tu-berlin.de/git/twoears-main twoears

Adding/changing files

Let us assume you added a file additional_work.txt and changed the file great_work.txt. You can always see what you changed locally by one of the following commands:

$ git status
$ git diff

Before doing the actual commit you first have to add the files you want to commit, for example

$ git add additional_work.txt great_work.txt

Now you can create a commit together with a meaningful commit message

$ git commit -m "Added description of additional work not described in great_work.txt"
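
If you want to try these steps without touching a real repository, the following sketch replays them in a throwaway directory. The temporary directory, the placeholder identity and the file contents are assumptions made up for the example.

```shell
set -e
# Throwaway repository so the experiment cannot touch real work.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.org"

# First commit: the file that gets changed later.
echo "first draft" > great_work.txt
git add great_work.txt
git commit -q -m "Add great_work.txt"

# Reproduce the situation from the text: one new file, one changed file.
echo "more details" > additional_work.txt
echo "second draft" >> great_work.txt

git status --short            # lists the modified and the untracked file
git add additional_work.txt great_work.txt
git diff --cached --stat      # summary of what the next commit will contain
git commit -q -m "Added description of additional work not described in great_work.txt"
git status --short            # prints nothing, the working tree is clean
```

Note that a plain git diff only shows changes that are not yet added, which is why the sketch uses git diff --cached after git add.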

Staying up to date with the remote repository

If you changed and committed something, the changes are still only on your local machine. You have to push them to the remote before everyone else can see them. Before pushing, and also before starting to work on something the next day, it is always a good idea to pull the latest changes from the remote. This is summarised by these two commands

$ git pull
$ git push

In order to get an overview of the latest changes you can run

$ git log
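
The interplay of pull and push is easier to see with two working copies. The sketch below is an assumption-laden playground: a local bare repository stands in for the remote server, and the directories alice and bob are hypothetical names for two developers.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Placeholder identity used for all commits in this sketch.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.org"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.org"

# A bare repository plays the role of the remote server.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# Alice creates the first commit and publishes it.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
echo "v1" > great_work.txt
git add great_work.txt
git commit -q -m "Add great_work.txt"
git remote add origin "$sandbox/remote.git"
git push -q -u origin master
cd ..

# Bob clones, commits, pulls first and then pushes.
git clone -q remote.git bob
cd bob
echo "v2" >> great_work.txt
git commit -q -am "Extend great_work.txt"
git pull -q
git push -q
cd ..

# Alice only sees Bob's commit after pulling.
cd alice
git pull -q
git log --oneline
```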

Getting further help

Most git commands are really powerful and offer many options. To see what a command is capable of, you can use the built-in help for that command, for example

$ git reset --help

Another good way to find the right command is to use google, for example by searching for

git undo last commit
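
One answer you will commonly find for that search is git reset --soft HEAD~1. The sketch below tries it in a throwaway repository (temporary directory, identity and file contents are made up for the example). It is only meant for commits you have not pushed yet.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.org"

echo "v1" > great_work.txt
git add great_work.txt
git commit -q -m "Add great_work.txt"
echo "v2" >> great_work.txt
git commit -q -am "Commit made too early"

# Move the branch back by one commit. With --soft the changes of the
# undone commit stay staged, so nothing is lost.
git reset --soft HEAD~1
git log --oneline             # only the first commit is left
git status --short            # great_work.txt is still staged
```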

Developing and branching

Let us assume you want to develop a new feature which uses a circular room instead of a shoebox-shaped one, and you know that a lot of files will be involved and that it will take some time. This is the perfect case for creating a new feature branch for the development. So let’s start by creating a new branch and switching to it in order to work on the new feature.

$ git branch circular_room
$ git checkout circular_room

If you want to include others in the development it is a good idea to also push this branch to the remote repository in order to allow others to pull it. This is one of the commands I always forget, but you can just ask google with git push branch and you will find the following command

$ git push -u origin circular_room

And if you are another person and want to contribute to that branch, you can get it from the remote with

$ git checkout -b circular_room origin/circular_room

After finishing your development and testing of the new feature it’s time to integrate the branch back into master. Before doing this it is a good idea to first include the latest changes of master into your branch, which you also should do from time to time via

$ git merge master

Note

Note that this command assumes that you are currently in your branch.

Now go back to your master branch and insert the new feature there

$ git checkout master
$ git merge circular_room

After adding the feature to the master branch you can delete your feature branch, both locally and the remote one

$ git push origin --delete circular_room      # delete remote branch
$ git branch -d circular_room                 # delete local branch (refuses if it is not merged)

The basic concept of the feature branch is described with more examples here: https://www.atlassian.com/git/workflows#!workflow-feature-branch
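
The whole life cycle of the feature branch can be replayed in a sandbox. This is a minimal sketch, assuming a local bare repository as stand-in for the remote and invented file names. It uses git branch -d, which refuses to delete a branch that is not merged yet.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Placeholder identity used for all commits in this sketch.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.org"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.org"

git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
echo "shoebox room" > room.txt
git add room.txt
git commit -q -m "Add shoebox room"
git remote add origin "$sandbox/remote.git"
git push -q -u origin master

# Create the feature branch, work on it and publish it.
git branch circular_room
git checkout -q circular_room
echo "circular room" > circular_room.txt
git add circular_room.txt
git commit -q -m "Add circular room"
git push -q -u origin circular_room

# Meanwhile master moves on.
git checkout -q master
echo "bug fix" >> room.txt
git commit -q -am "Fix shoebox room"
git push -q

# Include master in the branch first, then the branch in master.
git checkout -q circular_room
git merge -q -m "Merge master into circular_room" master
git checkout -q master
git merge -q circular_room                  # fast-forward
git push -q

# Clean up remote and local branch.
git push -q origin --delete circular_room
git branch -d circular_room
git branch -a
```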

Remote branches

In order to get an overview of all your local as well as the remote branches, type:

$ git branch -a

If someone has deleted a remote branch in the meantime, you may have to update your list of remote branches first

$ git remote update origin --prune
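
The effect of pruning can be observed with two clones of the same remote. The sketch assumes a local bare repository as the remote and the hypothetical working copies alice and bob.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Placeholder identity used for all commits in this sketch.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.org"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.org"

git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
echo "v1" > great_work.txt
git add great_work.txt
git commit -q -m "Add great_work.txt"
git remote add origin "$sandbox/remote.git"
git push -q -u origin master
git push -q origin master:circular_room     # second branch on the remote
cd ..

# Bob clones while the branch still exists ...
git clone -q remote.git bob
# ... and Alice deletes it afterwards.
(cd alice && git push -q origin --delete circular_room)

cd bob
git branch -a                               # still lists remotes/origin/circular_room
git remote update origin --prune
git branch -a                               # the stale entry is gone
```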

Git advanced commands

Storing credentials

For our https://dev.qu.tu-berlin.de address you have to provide your user name and password every time you push or pull or clone something from the server. In order to avoid retyping your password every time you can let git store it locally for some time with the following commands (works for git since version 1.7.10)

LINUX:

$ git config --global credential.helper cache
$ git config --global credential.helper "cache --timeout=3600"

The last command sets the storage time to 1 hour. You can change this if you like.

WINDOWS (GitBash):

$ git config --global credential.helper wincred

Your credentials are going to be saved during the next access.

Publishing our code at github

We have a Two!Ears page at github: https://github.com/TWOEARS. There we publish our final software and data. Of course, you may not want to publish all the code you have at the moment, but only parts of it. This can be done by creating a branch that includes everything you would like to publish and then setting this branch up to track the github remote master branch. For the data repository, this can be done with the following commands:

$ git remote add github git@github.com:TWOEARS/data.git
$ git fetch github
$ git checkout -b public --track github/master

Note

Note that you need to register at github and be added to the TWOEARS group if you would like to publish the code yourself. Otherwise you can create pull requests.

If you decide at some point that you want to add the directory folder1 from the master branch to the public branch, you can do so by checking out the public branch and running the following command

$ git checkout master -- folder1
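
The following sketch shows what this command does, using a throwaway repository. It is a simplification: the public branch here is a plain local branch instead of one tracking github/master, and folder2 and all file contents are invented for the example.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.org"

echo "readme" > README.txt
git add README.txt
git commit -q -m "Add README"
git branch public                           # public starts with the README only

mkdir folder1 folder2
echo "data to publish" > folder1/a.txt
echo "internal data" > folder2/b.txt
git add folder1 folder2
git commit -q -m "Add folder1 and folder2"

git checkout -q public
git checkout master -- folder1              # take folder1 as it is on master
git status --short                          # folder1/a.txt is staged as new file
git commit -q -m "Publish folder1"
ls                                          # folder2 is not part of public
```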

If you just want to mirror all changes you make on the master branch at https://dev.qu.tu-berlin.de, you can add the following entry to the .git/config file in the repository.

[alias]
    push-all = !git push origin master && git push github master

Now you can use the following command to push your changes of the master branch directly to https://dev.qu.tu-berlin.de and https://github.com:

$ git push-all
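
Instead of editing .git/config by hand, the same alias can be written with git config. The sketch below checks that it reaches both remotes. The two local bare repositories internal.git and public.git are stand-ins invented for the example, not the real servers.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Placeholder identity used for all commits in this sketch.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.org"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.org"

git init -q --bare internal.git             # stand-in for dev.qu.tu-berlin.de
git init -q --bare public.git               # stand-in for github
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
echo "v1" > great_work.txt
git add great_work.txt
git commit -q -m "Add great_work.txt"
git remote add origin "$sandbox/internal.git"
git remote add github "$sandbox/public.git"

# Writes the [alias] entry into .git/config of this repository.
git config alias.push-all '!git push origin master && git push github master'
git push-all
```

Because of the &&, the push to github is only attempted if the push to origin succeeded.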

Working together with a svn repository

We have a few svn repositories in the Two!Ears project. You can also use git to work with those repositories. The only thing you need is the git-svn extension.

Now you can clone the svn repository

$ git svn clone https://dev.qu.tu-berlin.de/svn/twoears-repo-path

Do your changes and add them with standard git commits on your local PC. At the end you push all your changes to the remote svn repository via

$ git svn dcommit

In order to update your repository with changes from others you have to run

$ git svn rebase

Removing commits with large files

Use the BFG Repo-Cleaner to remove large files from your commit history. Just download the pre-compiled JAR file and copy it as bfg to a folder where your system looks for executables. Now let’s say you want to clean up the repository big-repo and remove all files larger than 2 MB, first create a local copy of it:

$ git clone --mirror /path/to/big-repo big-repo-copy
$ bfg --strip-blobs-bigger-than 2M big-repo-copy
$ cd big-repo-copy
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
$ git push

For further information on this topic have a look at the discussion on removing large files.
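
Before cleaning up it can help to know which files in the history are actually larger than 2 MB. The following is a sketch using plain git and awk, not part of the BFG tool. The throwaway repository and the file big.bin are made up for the example, and file names containing spaces would be cut off by the awk step.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"         # placeholder identity
git config user.email "user@example.org"

# One 3 MB file that is committed and deleted again, plus a small file.
dd if=/dev/zero of=big.bin bs=1048576 count=3 2>/dev/null
echo "small" > small.txt
git add big.bin small.txt
git commit -q -m "Add files"
git rm -q big.bin
git commit -q -m "Remove big file"

# List all blobs in the whole history that are bigger than 2 MB (2097152 bytes).
large=$(git rev-list --objects --all |
  git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
  awk -v limit=2097152 '$1 == "blob" && $2 > limit { print $2, $3 }' |
  sort -rn)
echo "$large"                               # size in bytes and file name
```

The deleted file still shows up, which is exactly why the history has to be rewritten to get rid of it.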

Split repository

If you want to split an existing repository into two, you can do it the following way. Assume we have the repository <big-repo> and want to extract a sub folder <name-of-folder>:

# prepare old repo
$ cd <big-repo>
$ git subtree split -P <name-of-folder> -b <name-of-new-branch>
$ cd ..
# create new repo
$ mkdir <new-repo>
$ cd <new-repo>
$ git init
$ git pull </path/to/big-repo> <name-of-new-branch>
# clean up
$ rm -rf .git/refs/original/ && \
  git reflog expire --all && \
  git gc --aggressive --prune=now
$ git reflog expire --all --expire-unreachable=0
$ git repack -A -d
$ git prune

Alternatively you could delete existing folders in your existing repository in order to create a new one out of it:

# clone the repo
$ git clone <big-repo> tmp.git
$ cd tmp.git
$ ls
 test1 test2
# remove original url
$ git remote rm origin
# Remove the folder test1 and all related commits from history
$ git filter-branch --force --index-filter \
  'git rm -r --cached --ignore-unmatch test1' --prune-empty \
  --tag-name-filter cat -- --all
# clean up space
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive

After that you have your first new repository for which you can set up a new remote and push it. For the other part of the splitting repeat the above steps for the test2 folder.

For further details have a look at the discussion on how to split repositories.
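
If you want to check the filter-branch variant before running it on a real repository, you can replay it on a small example. The sketch assumes an invented repository big-repo with the two folders test1 and test2. The variable FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice of newer git versions.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"
# Placeholder identity used for all commits in this sketch.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.org"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.org"
export FILTER_BRANCH_SQUELCH_WARNING=1

# Small example repository with one commit per folder.
git init -q big-repo
cd big-repo
git symbolic-ref HEAD refs/heads/master
mkdir test1 test2
echo "one" > test1/a.txt
git add test1
git commit -q -m "Add test1"
echo "two" > test2/b.txt
git add test2
git commit -q -m "Add test2"
cd ..

# Same steps as above, applied to a clone.
git clone -q big-repo tmp.git
cd tmp.git
git remote rm origin
git filter-branch --force --index-filter \
  'git rm -r --cached --ignore-unmatch test1' --prune-empty \
  --tag-name-filter cat -- --all
git log --oneline                           # only the test2 commit is left
git ls-files                                # only files below test2
```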

Git under Windows

  • Apart from the git installation on Windows, I recommend TortoiseGit for a nice user interface.
    • it has to be configured to use the git for windows mentioned below as back-end (this is usually the default): you can check this in (right-click in Explorer)->TortoiseGit/Settings/General/Git for Windows. Click the version -> Check now button to ensure it says at least git version 1.9.2.msysgit.0.
  • If you want to use the Git bash, you need to install the newest git for windows.
    • with a right-click on a folder, you can then start the git bash from that point

Git with large binary files

This topic is important for our databases, both the public and the internal one. At the moment we are simply using git, which no longer works correctly due to the large repository size (for example, a fresh clone of the internal data repository is not possible at the moment).

We are working on a solution at the moment, and will most likely use Git-Media.