Not interested in explanations? Fast travel to the tl;dr 🚀
Open source communities on collaborative Git platforms (such as GitHub and GitLab) usually ask contributors to submit changes to the upstream repository from a fork.
Unlike an actual fork, which branches off from upstream and has a life of its own, a contribution fork is merely a container for contribution branches: its lifecycle is tightly coupled with the upstream. Contributors are expected to submit changes based on a recent version of the upstream repository. Should the fork become outdated, they must sync with the upstream and rebase their work.
This is where things get murky. There are as many synchronization approaches as there are contributors. GitHub even has an entry in the documentation dedicated to syncing a fork.
For my part, I believe that a “no sync” approach is best 😄
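There are two types of branches on a contribution fork:
- Mirror branches, which only mirror upstream branches (e.g. master).
- Contribution branches, which hold the changes we want to submit.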
In practice, mirror branches are useless: they only serve as a starting point for contribution branches. Keeping them in sync with the upstream only generates branch management overhead.
Could we contribute without needing to sync the fork at all? This is the basis behind the “no sync” approach.
upstream remote
This step is common with regular sync approaches.
By default, when we clone a repository, an origin remote pointing to the source repository is registered in the local repository.
Here, I am cloning a personal fork of cilium/cilium:
$ git clone git@github.com:nbusseneau/cilium.git
Cloning into 'cilium'...
[...]
$ cd cilium
$ git remote -v
origin git@github.com:nbusseneau/cilium.git (fetch)
origin git@github.com:nbusseneau/cilium.git (push)
A default local branch is automatically checked out and tracks the default branch on origin:
$ git branch -vv
* master 1070b19ab [origin/master] Add missing Demo App reference
As it is a fork, we can switch to any mirror branch available:
$ git switch v1.10
Branch 'v1.10' set up to track remote branch 'v1.10' from 'origin'.
Switched to a new branch 'v1.10'
$ git branch -vv
master 1070b19ab [origin/master] Add missing Demo App reference
* v1.10 75b4ed957 [origin/v1.10] build(deps): bump docker/setup-buildx-action
Note: in this post I will use git switch over git checkout, but you can use either.
Since a Git repository may interact with any number of remote Git repositories, let’s add the upstream repository as the upstream remote:
$ git remote add upstream git@github.com:cilium/cilium.git
$ git remote -v
origin git@github.com:nbusseneau/cilium.git (fetch)
origin git@github.com:nbusseneau/cilium.git (push)
upstream git@github.com:cilium/cilium.git (fetch)
upstream git@github.com:cilium/cilium.git (push)
We can now fetch branches from the upstream:
$ git fetch upstream
[...]
From github.com:cilium/cilium
* [new branch] master -> upstream/master
* [new branch] v1.9 -> upstream/v1.9
* [new branch] v1.10 -> upstream/v1.10
[...]
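The remote setup above can be wrapped in a small helper that is safe to run more than once. This is a minimal sketch, assuming a POSIX shell and a working directory inside the clone; ensure_upstream is a hypothetical name of my own, not a Git command.

```shell
# Hypothetical helper (not part of Git): register or update the
# "upstream" remote, then fetch its branches.
# Usage: ensure_upstream <upstream-url>
ensure_upstream() {
  url="$1"
  if git remote get-url upstream >/dev/null 2>&1; then
    # The remote already exists: just make sure it points to the right URL.
    git remote set-url upstream "${url}"
  else
    git remote add upstream "${url}"
  fi
  git fetch upstream
}
```

For the example repository, this would be called as ensure_upstream git@github.com:cilium/cilium.git.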
Most guides recommend keeping the mirror branches of a fork in sync with upstream using a pull-push pattern:
- Pull into the local branch (e.g. master) from the upstream branch (e.g. upstream/master).
- Push the local branch to the fork (e.g. origin/master):
$ git switch master
$ git pull upstream master
$ git push
This last bit is precisely what we are not going to do: we are never going to sync mirror branches on the fork.
We are going to present three “no sync” approaches:
- Upstream-tracking branches.
- Fetch-only.
- Upfetch.
This first approach is the simplest.
We start with a repository as outlined above, with origin and upstream remotes set up.
Let’s have a look at the .git/config file:
[branch "master"]
remote = origin
merge = refs/heads/master
[branch "v1.10"]
remote = origin
merge = refs/heads/v1.10
The local master and v1.10 branches are both tracking mirror branches on the fork (origin remote).
We are going to bypass the fork and work directly with upstream.
To have an existing branch track the upstream remote, we can either:
- Edit .git/config and replace remote = origin with remote = upstream.
- Run git config branch.<BRANCH_NAME>.remote upstream
- Run git branch <BRANCH_NAME> -u upstream/<BRANCH_NAME>
$ git branch master -u upstream/master
Branch 'master' set up to track remote branch 'master' from 'upstream'.
$ git config branch.v1.10.remote upstream
$ git branch -vv
master 1070b19ab [upstream/master: behind 104] Add missing Demo App reference
* v1.10 75b4ed957 [upstream/v1.10: behind 96] build(deps): bump docker/setup-buildx-action
[branch "master"]
remote = upstream
merge = refs/heads/master
[branch "v1.10"]
remote = upstream
merge = refs/heads/v1.10
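When a fork carries many mirror branches, retargeting them one by one gets tedious. The following is a minimal sketch, assuming git fetch upstream has already been run and that any local branch sharing its name with an upstream branch is a mirror branch. track_upstream_all is a hypothetical helper name, not a Git command.

```shell
# Hypothetical helper (not part of Git): make every local branch that has
# a same-named branch on the "upstream" remote track that upstream branch.
track_upstream_all() {
  git for-each-ref --format='%(refname:short)' refs/heads |
  while read -r branch; do
    # Only touch branches that actually exist on upstream.
    if git show-ref --verify --quiet "refs/remotes/upstream/${branch}"; then
      git branch --set-upstream-to="upstream/${branch}" "${branch}"
    fi
  done
}
```

Contribution branches such as pr/foo are left alone, as long as the upstream has no branch with the same name.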
To check out a new branch from the upstream repository and have it track upstream directly, we can use --track:
$ git switch -c v1.9 --track upstream/v1.9
Updating files: 100% (7131/7131), done.
Branch 'v1.9' set up to track remote branch 'v1.9' from 'upstream'.
Switched to a new branch 'v1.9'
$ git branch -vv
master 1070b19ab [upstream/master: behind 104] Add missing Demo App reference
v1.10 75b4ed957 [upstream/v1.10: behind 96] build(deps): bump docker/setup-buildx-action
* v1.9 f993696f9 [upstream/v1.9] Prepare for release v1.9.7
[branch "v1.9"]
remote = upstream
merge = refs/heads/v1.9
Note: --track is also available for git checkout -b, if you prefer it over git switch -c.
Since we now have local upstream branches directly tracking the upstream repository, we can manage branches exactly like we would on a regular single-remote repository.
Usual pull to retrieve latest changes from the upstream:
$ git switch master
Switched to branch 'master'
Your branch is behind 'upstream/master' by 104 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
$ git pull
Updating 1070b19ab..dfc528bbe
Updating files: 100% (567/567), done.
Fast-forward
[...]
Usual pull-create pattern:
$ git switch master
Switched to branch 'master'
Your branch is behind 'upstream/master' by 104 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
$ git pull
Updating 1070b19ab..dfc528bbe
Updating files: 100% (567/567), done.
Fast-forward
[...]
$ git switch -c pr/foo
Switched to a new branch 'pr/foo'
$ git branch -vv
master 541214272 [upstream/master] install: Disable kube-proxy-replacement by default
* pr/foo 541214272 install: Disable kube-proxy-replacement by default
Usual pull-rebase pattern:
$ git switch master
Switched to branch 'master'
Your branch is behind 'upstream/master' by 104 commits, and can be fast-forwarded.
(use "git pull" to update your local branch)
$ git pull
Updating 1070b19ab..dfc528bbe
Updating files: 100% (567/567), done.
Fast-forward
[...]
$ git switch pr/foo
Switched to branch 'pr/foo'
$ git rebase master
Successfully rebased and updated refs/heads/pr/foo.
Compared to sync approaches, we have considerably reduced branch management overhead: mirror branches on the fork are never updated, so the push step disappears.
If we can pull directly from the upstream, we can also push directly to it. If we have write privileges on the upstream, we could accidentally commit to our local upstream branches and push them to the upstream directly.
Fortunately, we can easily prevent that from happening. Two solutions: a pushRemote override, or a pre-push hook.
git config has an optional pushRemote variable for branches, which overrides the previously set remote variable for push operations.
We can register a non-existent pushRemote to block push operations on specific branches:
$ git config branch.master.pushRemote DISABLE_PUSH
$ git push
fatal: 'DISABLE_PUSH' does not appear to be a git repository
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
[branch "master"]
remote = upstream
merge = refs/heads/master
pushRemote = DISABLE_PUSH
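The same setting can be applied in one go to every local branch that tracks upstream. This is a minimal sketch under that assumption; disable_push_for_upstream_branches is a hypothetical helper name, and DISABLE_PUSH is simply the placeholder remote name used above.

```shell
# Hypothetical helper (not part of Git): set a non-existent pushRemote on
# every local branch whose configured remote is "upstream".
disable_push_for_upstream_branches() {
  git for-each-ref --format='%(refname:short)' refs/heads |
  while read -r branch; do
    # branch.<name>.remote is unset for branches that track nothing.
    if [ "$(git config --get "branch.${branch}.remote")" = "upstream" ]; then
      git config "branch.${branch}.pushRemote" DISABLE_PUSH
    fi
  done
}
```

Branches tracking origin, or tracking nothing at all, keep their normal push behavior.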
The pre-push hook exists precisely for this use case.
We can create a .git/hooks/pre-push hook in the repository with a list of branches for which to block push operations:
#!/bin/bash
# Local branches that must never be pushed.
protected_branches=('master' 'v1.10' 'v1.9')
# Git writes one line per pushed ref on stdin:
# <local ref> <local oid> <remote ref> <remote oid>
while read -r local_ref local_oid remote_ref remote_oid; do
  # Strip the refs/heads/ prefix only, so that a branch such as
  # pr/master is not mistaken for master
  # e.g. refs/heads/master -> master
  current_branch="${local_ref#refs/heads/}"
  for protected_branch in "${protected_branches[@]}"; do
    if [ "${protected_branch}" = "${current_branch}" ]; then
      echo "push denied: ${protected_branch} is protected"
      exit 1
    fi
  done
done
$ git push
push denied: master is protected
error: failed to push some refs to 'github.com:nbusseneau/cilium.git'
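Git only runs hooks that are executable, so the hook file needs chmod +x before it has any effect. To check a pre-push hook without attempting a real push, we can feed it a synthetic line in the format Git writes to the hook's standard input. This is a minimal sketch: hook_blocks_branch is a hypothetical helper name, and the object IDs and URL are dummy values, on the assumption that the hook only inspects the local ref, as the one above does.

```shell
# Hypothetical helper (not part of Git): succeed if the given pre-push hook
# rejects a push of the given local branch.
# Usage: hook_blocks_branch <path-to-hook> <branch>
hook_blocks_branch() {
  hook="$1"
  branch="$2"
  # Dummy object ID: the hook is assumed to look at ref names only.
  zero_oid=0000000000000000000000000000000000000000
  # pre-push receives "<local ref> <local oid> <remote ref> <remote oid>"
  # lines on stdin, and the remote name and URL as arguments.
  if printf 'refs/heads/%s %s refs/heads/%s %s\n' \
      "${branch}" "${zero_oid}" "${branch}" "${zero_oid}" |
      "${hook}" upstream unused-url >/dev/null; then
    return 1  # hook accepted the push
  fi
  return 0    # hook denied the push
}
```

After chmod +x .git/hooks/pre-push, running hook_blocks_branch .git/hooks/pre-push master && echo blocked gives a quick sanity check for each protected branch.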
Note: I personally block push operations via pushRemote.
The pre-push hook in the example above seems to work fine, but I wrote it in a few minutes and did not test it extensively.
Be careful! 😉
This second approach is a bit more complex.
From a clean repository, we start by setting up origin and upstream remotes as above:
$ git clone git@github.com:nbusseneau/cilium.git
Cloning into 'cilium'...
[...]
$ cd cilium
$ git remote add upstream git@github.com:cilium/cilium.git
$ git remote -v
origin git@github.com:nbusseneau/cilium.git (fetch)
origin git@github.com:nbusseneau/cilium.git (push)
upstream git@github.com:cilium/cilium.git (fetch)
upstream git@github.com:cilium/cilium.git (push)
We will directly use remote upstream branches via git fetch upstream <REMOTE_BRANCH>, and thus never have to manage any local upstream branches.
Fetch-create pattern:
- Use git fetch instead of git pull before branching off.
- Use upstream/<BRANCH_NAME> as starting point:
$ git fetch upstream master
[...]
From github.com:cilium/cilium
* branch master -> FETCH_HEAD
dfc528bbe..541214272 master -> upstream/master
$ git switch -c pr/foo upstream/master --no-track
Switched to a new branch 'pr/foo'
$ git branch -vv
master 1070b19ab [origin/master] Add missing Demo App reference
* pr/foo 541214272 install: Disable kube-proxy-replacement by default
Notice the use of --no-track when creating the branch: if not provided, --track upstream/master is assumed, which is not what we want.
Fetch-rebase pattern:
- Use git fetch instead of git pull before rebasing.
- Rebase onto upstream/<BRANCH_NAME> rather than a local branch:
$ git fetch upstream master
[...]
From github.com:cilium/cilium
* branch master -> FETCH_HEAD
dfc528bbe..541214272 master -> upstream/master
$ git rebase upstream/master
Successfully rebased and updated refs/heads/pr/foo.
This approach is extremely lean and minimal.
It also does not require any safeguard against accidental pushes to upstream.
However, branch management is non-standard.
We will probably want to set up Git aliases, notably so as not to forget the --no-track flag.
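The two fetch patterns can be captured in shell functions, which could in turn be wrapped in Git aliases. This is a minimal sketch, assuming the upstream remote is already set up. fetch_create and fetch_rebase are hypothetical names of my own, not Git commands.

```shell
# Hypothetical helpers (not part of Git) for the fetch-only approach.

# Fetch-create: branch off the latest upstream/<branch> without tracking it.
# Usage: fetch_create <upstream-branch> <new-branch>
fetch_create() {
  git fetch upstream "$1" &&
    git switch --no-track -c "$2" "upstream/$1"
}

# Fetch-rebase: rebase the current branch onto the latest upstream/<branch>.
# Usage: fetch_rebase <upstream-branch>
fetch_rebase() {
  git fetch upstream "$1" &&
    git rebase "upstream/$1"
}
```

For example, fetch_create master pr/foo starts a contribution branch, and fetch_rebase master later brings it up to date.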
This third approach is a compromise emerging from the other two.
It works like the upstream-tracking branches approach, but we incorporate a variant of the fetch patterns from the fetch-only approach via git fetch upstream <REMOTE_BRANCH>:<LOCAL_BRANCH>:
$ git fetch upstream master:master
From github.com:cilium/cilium
1070b19ab..541214272 master -> master
This neat git fetch trick allows us to fetch a remote branch and update a local branch in one go, without having to check it out first:
$ git branch -vv
master 541214272 [upstream/master] install: Disable kube-proxy-replacement by default
* pr/foo 1070b19ab Add missing Demo App reference
I dubbed it the “upfetch”.
The upfetch is very efficient compared to the previous patterns:
- git pull requires checking out the local branch first.
- git fetch upstream <REMOTE_BRANCH> does not update local branches.
Upfetch-create pattern:
$ git fetch upstream master:master
From github.com:cilium/cilium
1070b19ab..541214272 master -> master
$ git switch -c pr/foo
Switched to a new branch 'pr/foo'
$ git branch -vv
master 541214272 [upstream/master] install: Disable kube-proxy-replacement by default
* pr/foo 541214272 install: Disable kube-proxy-replacement by default
Upfetch-rebase pattern:
$ git fetch upstream master:master
From github.com:cilium/cilium
1070b19ab..541214272 master -> master
$ git rebase master
Successfully rebased and updated refs/heads/pr/foo.
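One caveat: git fetch refuses to update the branch that is currently checked out, and the <REMOTE_BRANCH>:<LOCAL_BRANCH> form only fast-forwards. A small wrapper can cover the checked-out case by falling back to a fast-forward-only pull. This is a minimal sketch; upfetch here is a hypothetical shell function of my own, not a Git command.

```shell
# Hypothetical helper (not part of Git): update local <branch> from upstream,
# whether or not it is the branch currently checked out.
# Usage: upfetch <branch>
upfetch() {
  branch="$1"
  # Prints the current branch name, or nothing on a detached HEAD.
  current="$(git symbolic-ref --short -q HEAD)"
  if [ "${current}" = "${branch}" ]; then
    # fetch cannot update the checked-out branch: fast-forward it instead.
    git pull --ff-only upstream "${branch}"
  else
    git fetch upstream "${branch}:${branch}"
  fi
}
```

Both code paths refuse to do anything other than a fast-forward, so local commits accidentally made on a mirror branch surface as an error rather than a merge.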
Best of both worlds:
As with upstream-tracking branches, we need to prevent accidental pushes to the upstream.
In my opinion, syncing forks is fundamentally useless. A contribution fork is merely a container for contribution branches: non-contribution branches only mirror upstream, and syncing them is unnecessary since we can use the upstream directly.
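In this post, we propose three “no sync” approaches:
- Upstream-tracking branches: local branches track the upstream directly.
- Fetch-only: we work from remote upstream branches and keep no local upstream branches.
- Upfetch: upstream-tracking branches updated with a git fetch trick.
In all cases, we never sync the fork, which reduces branch management overhead.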