1.1. How to use Git
HORTON uses Git for version control.
A version control system (VCS) allows many people to copy and modify the same source code while keeping track of all changes made. The VCS software also helps you merge different developments into one common source tree.
To refresh your memory of commonly used Git commands, please refer to Git Reference.
This section goes through the basics of VCSs to get you started with developing new features in HORTON. Although the commands below are specific to Git, the following entails good practices that can be generalized and applied to other modern VCS such as Bazaar or Mercurial.
1.1.1. Installing Git
The installation of Git is covered in the sections Development tools (Linux) or Development tools (Mac).
1.1.2. Git configuration
We recommend that you set the following in your ~/.gitconfig file:
[user]
name = {Replace by your official name: First M. Last}
email = {Replace by a decent e-mail address, cfr. corresponding author on a paper.}
[color]
diff = auto
status = auto
interactive = auto
branch = auto
[push]
default = simple
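The same settings can also be written with git config instead of editing the file by hand. The following is a minimal sketch, assuming git is installed; the name and e-mail address are placeholders, and the scratch file (cfg) is only used so that running the example does not touch your real ~/.gitconfig.

```shell
#!/bin/bash
# Scratch file, so that this example does not modify ~/.gitconfig.
cfg=$(mktemp)

# Placeholder identity: replace with your own name and e-mail address.
git config --file "$cfg" user.name "First M. Last"
git config --file "$cfg" user.email "first.last@example.com"

# Colored output for the same four commands as in the file above.
for key in diff status interactive branch; do
    git config --file "$cfg" "color.$key" auto
done

git config --file "$cfg" push.default simple

# Show the resulting configuration file.
cat "$cfg"
```

To apply the settings for real, replace --file "$cfg" by --global, which makes git write to ~/.gitconfig.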
Also, install our pre-commit script as follows:
cp -a tools/pre-commit .git/hooks/
This hook imposes some baseline quality checks on each commit:
#!/bin/bash
red="\033[1;31m"
color_end="\033[0m"

# Check unwanted trailing whitespace or space/tab indents;
if [[ `git diff --cached --check` ]]; then
    echo -e "${red}Commit failed: trailing whitespace, trailing empty lines, DOS/Windows line endings${color_end}"
    git diff --cached --check
    exit 1
fi

# Check for untracked files (not in .gitignore)
if [[ `git status -u data horton doc scripts tools -s | grep "^??"` ]]; then
    echo -e "${red}Commit failed: untracked files (not in .gitignore).${color_end}"
    git status -u data horton doc scripts tools -s | grep "^??"
    exit 1
fi

# Check for new print statements
if [[ `git diff --cached | grep '^+' | sed 's/^.//' | sed 's:#.*$::g' | grep 'print '` ]]; then
    echo -e "${red}Commit failed: print statements${color_end}"
    git diff --cached | grep '^+' | sed 's/^.//' | sed 's:#.*$::g' | grep print
    exit 1
fi
The last part of the pre-commit script checks for newly added Python print statements. These should not be used in the HORTON library. If you think you have legitimate reasons to ignore this check, use the --no-verify option when committing. Note that this option skips the entire pre-commit hook, not only the print check.
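To see what the print check actually matches, the pipeline from the hook can be fed a small hand-written diff. This is only an illustration: the function name new_print_lines and the file horton/example.py are made up for this sketch.

```shell
#!/bin/bash
# Same pipeline as in the hook, but reading a diff from standard input:
# keep added lines, drop the leading "+", strip comments, look for "print ".
new_print_lines() {
    grep '^+' | sed 's/^.//' | sed 's:#.*$::g' | grep 'print '
}

# Hand-written diff for a hypothetical file.
sample_diff=$(mktemp)
cat > "$sample_diff" <<'EOF'
--- a/horton/example.py
+++ b/horton/example.py
@@ -1,2 +1,3 @@
 def double(x):
-    print "old debug output"
+    print x
+    return 2*x  # print is only mentioned in this comment
EOF

# Only the added line with a real print statement is reported.
flagged=$(new_print_lines < "$sample_diff")
echo "$flagged"
```

Removed lines and comments are ignored, so only the added print x line is reported. The check is purely textual: it looks for the word print followed by a space, so it would also flag that text inside a string and would not notice print(x) written without a space.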
Furthermore, it is useful to include the current branch in your shell prompt. To do so, put one of the following in your ~/.bashrc (Linux) or ~/.bash_profile (Mac OS X) file:
For terminals with a dark background:
GIT_PS='$(__git_ps1 ":%s")'
export PS1="\[\033[1;32m\]\u@\h\[\033[00m\] \[\033[1;34m\]\w\[\033[00m\]\[\033[1;33m\]${GIT_PS}\[\033[1;34m\]>\[\033[00m\] "
For terminals with a light background:
GIT_PS='$(__git_ps1 ":%s")'
export PS1="\[\033[2;32m\]\u@\h\[\033[00m\]:\[\033[2;34m\]\w\[\033[3;31m\]${GIT_PS}\[\033[00m\]$ "
You can customize it to your taste. You may also want to add the line export PROMPT_DIRTRIM=3 to keep the shell prompt short. If you are a happy vim user, you can set export EDITOR=vim to get syntax highlighting when writing commit messages.
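The prompt settings above rely on __git_ps1, which prints the current branch in the requested format (":%s"). To give an idea of what happens behind the scenes, here is a strongly simplified stand-in. The function branch_suffix is hypothetical and only looks at the .git/HEAD file in the given directory; the real __git_ps1 handles many more situations.

```shell
#!/bin/bash
# branch_suffix DIR: print ":<branch>" if DIR is the top of a Git work tree
# that is on a branch, and print nothing otherwise.
# Simplified illustration only, not a replacement for __git_ps1.
branch_suffix() {
    local head_file="$1/.git/HEAD" head
    [ -f "$head_file" ] || return 0
    read -r head < "$head_file"
    case "$head" in
        "ref: refs/heads/"*) printf ':%s' "${head#ref: refs/heads/}" ;;
    esac
}

# Fake work tree, so the example runs without calling git at all.
demo=$(mktemp -d)
mkdir "$demo/.git"
echo "ref: refs/heads/bar" > "$demo/.git/HEAD"

echo "prompt suffix: $(branch_suffix "$demo")"
```

On a branch, Git stores a line of the form ref: refs/heads/<branch> in .git/HEAD, which is why the suffix can be derived from that file in this simple case.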
1.1.3. Some terminology
- Patch
- A set of changes in the source code. These are typically recorded in a patch file. Such a file specifies a set of lines that are removed and a set of lines that are added.
- SHA-1 hash
- A checksum of fixed length (160 bits, i.e. 20 bytes, for SHA-1) computed from a much larger amount of data, e.g. a very long character string. There are usually two main goals when designing hashing algorithms: (i) it should not be feasible to derive the original data from a hash and (ii) a small change in the original data should completely change the hash. The MD5 checksum is well known and often used for CD images, but it is weaker: different inputs with the same MD5 hash can be constructed deliberately.
- Commit
- A patch with some extra information: author, timestamp, a SHA-1 hash of the code to which it applies, and some other things.
- Branch
- A series of commits that describe the history of the source code. In realistic projects, the source code history is not linear, but contains many deviations from the master branch where people try to implement a new feature. It is, however, useful to have only one official linear history. We will show below how this can be done with Git.
- Branch HEAD
- The last commit in a branch.
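The notions of a patch and a hash can be made concrete with standard command-line tools. The sketch below assumes that diff and sha1sum (GNU coreutils; on Mac OS X, shasum is the usual alternative) are available. The file names are arbitrary.

```shell
#!/bin/bash
work=$(mktemp -d)
cd "$work"

# Two versions of the same small file.
printf 'alpha\nbeta\ngamma\n' > old.txt
printf 'alpha\nBETA\ngamma\n' > new.txt

# A patch: removed lines start with "-", added lines start with "+".
# diff exits with status 1 when the files differ, which is expected here.
diff -u old.txt new.txt > change.patch || true
cat change.patch

# SHA-1 digests: always 40 hexadecimal digits (160 bits), whatever the
# size of the input. A small change in the input gives a different digest.
hash_old=$(sha1sum < old.txt | cut -d' ' -f1)
hash_new=$(sha1sum < new.txt | cut -d' ' -f1)
echo "old.txt: $hash_old"
echo "new.txt: $hash_new"
```

These are plain SHA-1 digests of the file contents. Git adds a small header to the data before hashing, so the identifiers it reports for the same files are different.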
1.1.4. Cloning the HORTON git repository
In order to clone the public HORTON repository, run the following commands:
mkdir ~/code
cd ~/code
git clone https://github.com/theochem/horton.git
cd horton
The version history can be updated with the latest committed patches on GitHub by:
git pull
There is also a web interface to HORTON’s git repository: https://github.com/theochem/horton
1.1.5. Additional steps required to build the development version of HORTON
Several parts of HORTON make use of reference atomic computations. These files are too large to be included in the git revision system. Therefore, they must be downloaded separately when compiling a development version of HORTON:
(cd data/refatoms; make all)
1.1.6. Work flow for adding a new feature
The development of a new feature typically consists of the following steps:
- You modify the code in a topic branch. You test and document your modifications and fix problems where needed.
- You make a pull request on GitHub. (Some tests will be executed automatically.) Someone will review your pull request, which usually leads to suggestions to improve your modifications.
- As soon as your pull request is up to snuff, it will be merged into the master branch.
Note
Try to keep the amount of work in one branch as low as possible and get it reviewed/merged as early as possible. This takes some planning, as you have to figure out how to break your big plans up into smaller steps. In general this is a good exercise that will help you write more modular code. Although this seems to be cumbersome, it does save time for everyone involved.
When you intend to make relatively large modifications, it is recommended to discuss these first, e.g. on the HORTON mailing list, just to avoid disappointments in the long run.
1.1.6.1. Develop the feature in a topic branch
Fork the public HORTON repository on GitHub (if not done yet), clone it on your local machine and enter the source tree:
$ ~/code> git clone https://github.com/your_account/horton.git
$ ~/code> cd horton
$ ~/.../horton:master>
where your_account needs to be replaced by your GitHub account name.
Switch to the master branch, if needed:
$ ~/.../horton:foo> git checkout master
$ ~/.../horton:master>
Make sure there are no uncommitted changes in the source code on the foo branch before switching to the master branch.
Get the latest version of the source code:
$ ~/.../horton:master> git pull origin
Make a topic branch, say bar, and switch to it:
$ ~/.../horton:master> git checkout -b bar
$ ~/.../horton:bar>
Make sure that you are on the right branch before starting to implement the new feature bar. (Try to pick a more meaningful branch name based on the feature you are implementing.)
Now you are in the right place to start making changes to the source code, and committing patches. When adding a new feature, also add tests, documentation, docstrings, comments and examples to clarify and debug the new feature. (The more tests, documentation and examples, the better.)
Review your changes with git diff. Make sure there are no trailing white spaces or trailing blank lines. These can be removed with the ./cleancode.sh script. If you created new files, run the ./updateheaders.py script to make sure the new files have the proper headers.
Get an overview of the added changes and new files with git status.
Add the changed files that will be committed with the git add <file_name> command. There are two ways to do this:
- Add all changes in certain files:
$ ~/.../horton:bar> git add horton/file1.py horton/file2.py ...
- Add interactively by going through the changes in all/some files:
$ ~/.../horton:bar> git add -p [horton/file1.py horton/file2.py ...]
Commit the added files to your working branch:
$ ~/.../horton:bar> git commit
This command will start an editor in which you can write a commit message. By convention, such a message starts with a short single-line description of at most 69 characters. Optionally, a longer description follows that is separated from the short description by an empty line. More suggestions for writing meaningful commit messages can be found here. If you only intend to write a short description, it can be included on the command line:
$ ~/.../horton:bar> git commit -m 'Short description'
In practice, you’ll make a couple of commits before a new feature is finished. After committing the changes and testing them thoroughly, you are ready for the next step.
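The commit message convention described above (a short first line of at most 69 characters, followed by an empty line when a longer description is given) is easy to check mechanically. The helper check_commit_message below is a hypothetical sketch and not part of HORTON; the example message is made up.

```shell
#!/bin/bash
# check_commit_message FILE
# Returns 0 when the first line of FILE has at most 69 characters and the
# second line, if present, is empty. Prints a short reason otherwise.
check_commit_message() {
    local subject second
    subject=$(sed -n '1p' "$1")
    second=$(sed -n '2p' "$1")
    if [ -z "$subject" ]; then
        echo "the first line is empty"
        return 1
    fi
    if [ "${#subject}" -gt 69 ]; then
        echo "the first line has ${#subject} characters (limit: 69)"
        return 1
    fi
    if [ -n "$second" ]; then
        echo "the second line must be empty"
        return 1
    fi
    return 0
}

# Example message: short description, empty line, longer description.
msg=$(mktemp)
printf 'Short description\n\nLonger description of the change.\n' > "$msg"
check_commit_message "$msg" && echo "commit message looks fine"
```

Such a function could, for instance, be called from a commit-msg hook, which Git runs with the path of the message file as its first argument.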
1.1.6.2. Make your branch available for review with a pull request (PR)
In order to let others look at your code, you have to make your branch available by pushing it to your forked GitHub repository.
Push your branch to the remote server:
git push origin bar:bar
Now go to the GitHub website and make a Pull Request with the master branch of the theochem/horton repository as the destination. As soon as you do this, a series of basic QA tests will be executed to check for common problems. If these basic QA tests pass, someone will review your branch manually based on the Branch review checklist. You fix all the issues brought up during the review by making additional commits or, if you really messed up, by rewriting your branch. As soon as you push your changes back to the branch in your forked repository, they will show up in the PR, which triggers the QA tests again. When there are no further comments, your branch is ready to be merged.
1.1.6.3. Merging your pull request with the master branch
You don’t have to do anything for this, unless other branches got merged into the master branch after you started your topic branch. In that case, you need to rebase your topic branch on the current master branch and rerun all tests. This can be done with the following steps:
Synchronize the master branch in your fork with the official HORTON repository.
Switch to your topic branch:
$ ~/.../horton:master> git checkout bar
$ ~/.../horton:bar>
Create a new branch in which the result of git rebase will be stored:
$ ~/.../horton:bar> git checkout -b bar-1
$ ~/.../horton:bar-1>
Rebase your commits on top of the latest master branch:
$ ~/.../horton:bar-1> git rebase master
This command will try to apply the patches from your topic branch on top of the master branch. It may happen that changes in the master branch are not compatible with yours, such that your patches cannot be simply applied. When that is the case, the git rebase script will be interrupted and you are instructed on what to do. Do not panic when this happens. If you feel uncertain about how to resolve conflicts, it is time to call your git-savvy friends for help.
After the rebase procedure is complete, run all the tests again. If needed, fix problems and commit the changes.
Upload the commits to your fork:
$ ~/.../horton:bar-1> git push origin -f bar-1:bar
This will rewrite the history of your topic branch, which will also show up in the PR. All automatic QA tests will be executed again.
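If you have never rebased before, it can help to try the steps above in a throwaway repository first. The sketch below assumes git is installed. The file names, commit messages and the identity are made up for the example, and everything happens in a temporary directory, far away from your HORTON source tree.

```shell
#!/bin/bash
# Throwaway repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example Developer"   # made-up identity, local to this repository
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# A topic branch with one commit.
git checkout -q -b bar
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# Meanwhile, the master branch moves on.
git checkout -q master
echo "other work" > other.txt
git add other.txt
git commit -q -m "Unrelated change on master"

# Rebase in a copy of the topic branch, so that bar itself stays untouched.
git checkout -q bar
git checkout -q -b bar-1
git rebase -q master

# bar-1 now contains the master history with the feature commit on top.
git log --oneline bar-1
```

Because the rebase is carried out in bar-1, the original bar branch still points to the old commits. If something goes wrong, you can delete bar-1 and start over.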
1.1.7. Common issues
Remember to set the pre-commit hook. If this causes error messages when committing, use the cleancode.sh script. This removes all sorts of trailing white-space and converts every tab to four spaces. These conventions make git diff more meaningful and make it easier to merge and rebase commits.
When you are customizing your bash prompt, you may get an error like __git_ps1: command not found..., if you sourced git-completion.bash. Then, before setting the GIT_PS, you need to add the following line to your ~/.bashrc (Linux) or ~/.bash_profile (Mac OS X):
source /usr/share/git-core/contrib/completion/git-prompt.sh
If you cannot find this file, you can get it from the link below:
https://github.com/git/git/blob/master/contrib/completion/git-prompt.sh
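Regarding the whitespace conventions mentioned at the start of this section: the following sketch shows the kind of cleanup involved, applied to a single file. It is not the actual cleancode.sh script; the function clean_whitespace is a hypothetical stand-in that only replaces tabs by four spaces and strips trailing whitespace.

```shell
#!/bin/bash
# clean_whitespace FILE: replace every tab in FILE by four spaces and
# remove whitespace at the end of each line.
# Illustration only; use cleancode.sh in the HORTON source tree.
clean_whitespace() {
    local tmp tab
    tmp=$(mktemp)
    tab=$(printf '\t')
    sed -e "s/$tab/    /g" -e 's/[[:space:]]*$//' "$1" > "$tmp"
    cat "$tmp" > "$1"    # write back in place, keeping the file permissions
    rm -f "$tmp"
}

# Sample file with trailing spaces and tabs.
sample=$(mktemp)
printf 'def f():   \n\treturn 1\t\n' > "$sample"
clean_whitespace "$sample"
cat "$sample"
```

This sketch does not remove trailing empty lines at the end of a file, which the pre-commit hook also rejects.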