Alan's Git Workflow

This page describes a Git workflow that I (Alan) prefer over the typical workflow. For an explanation of the default Phabricator/Git workflow, see the Using phabricator Forge page.

Motivation

By default, arc diff and arc land perform a sequence of steps that assume a particular Git workflow. This default behavior has a few problems:

Because the default behavior is a bit "magical" and depends heavily on git state, sometimes direct git operations will cause arc to get confused.
The default workflow doesn't make it very easy to maintain a chain of code reviews. In order to have $n$ commits out for review, you need an $n$-level deep branch hierarchy, which isn't fun to maintain.

My workflow uses more explicit steps so that you can use normal git operations to maintain a stack of commits.

Differences from the default Phabricator workflow

Instead of a code review corresponding to a range of commits, a code review always corresponds to exactly one commit. This means that updating a commit requires using git commit --amend or equivalent.
There are no requirements for the branch structure, so it should be fairly easy to adapt this workflow to any alternative workflow that you prefer.

Submitting a diff

First, write your code and commit it. Then run this command to submit your HEAD commit as a code review:

arc diff HEAD~1

The argument to arc diff is the "base" of your code review. The default behavior is to use your upstream as the base, so it takes all commits between your upstream and your current HEAD and creates a code review out of them. HEAD~1 refers to the commit before the current commit, so arc diff HEAD~1 is a way to say "take the current commit and submit it as a code review".

You can also write HEAD~ or HEAD^, both of which are equivalent to HEAD~1.
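
To make the "base" idea concrete, here is a minimal sketch using plain git in a throwaway repository (the temporary directory, identity, and commit messages are made up for illustration). It checks that the three spellings name the same commit and that the range from that base to HEAD holds exactly one commit, which is the range described above for arc diff HEAD~1:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "base commit"
git commit -q --allow-empty -m "commit under review"

# All three spellings name the parent of HEAD.
a=$(git rev-parse HEAD~1)
b=$(git rev-parse HEAD~)
c=$(git rev-parse HEAD^)
echo "$a $b $c"

# The range from that base to HEAD contains only the current commit.
count=$(git rev-list --count HEAD~1..HEAD)
subject=$(git log --format=%s HEAD~1..HEAD)
echo "$count: $subject"
```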

Normally you'd do this on a named branch, but nothing stops you from doing this on a detached HEAD or in the middle of an interactive rebase.

Updating a diff

To update the code for a diff (e.g. responding to code review feedback), just modify it and re-submit it in the same way. For example:

[make changes]
git add [files]
git commit --amend
arc diff HEAD~1

Again, there's nothing stopping you from doing this in the middle of an interactive rebase or in any other branch situation.
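
As a rough illustration of why re-running arc diff HEAD~1 still covers exactly one commit after an update, this sketch (plain git only, in a throwaway repository with a hypothetical feature.txt) amends a commit and compares the state before and after. The amended commit gets a new hash but keeps the same parent:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "base commit"

echo "first draft" > feature.txt
git add feature.txt
git commit -q -m "add feature"
before=$(git rev-parse HEAD)
parent_before=$(git rev-parse HEAD~1)

# Respond to feedback: change the file and fold it into the same commit.
# --no-edit keeps the existing message so no editor opens.
echo "revised after review" > feature.txt
git add feature.txt
git commit -q --amend --no-edit
after=$(git rev-parse HEAD)
parent_after=$(git rev-parse HEAD~1)

# Still one commit between the base and HEAD.
count=$(git rev-list --count HEAD~1..HEAD)
echo "$before -> $after ($count commit in range)"
```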

To update any commit metadata (e.g. title, commit message, test plan, reviewers, or subscribers), you need to update the review in Phabricator. The arc amend step below will eventually pull the changes into the commit message.

Landing a diff

Instead of arc land, you take a more explicit sequence of steps that's equivalent.

arc land takes several steps that we need to replace:

Pull in changes from the Phabricator review to the commit message. arc amend is the way to do only this step.
Squash all commits in the review into a single commit. This isn't necessary for us because we only ever have one commit per review.
Merge from your upstream branch. This is replaced with an explicit rebase.
Push your work to the upstream branch.

When you're ready to land a commit, you should update to the latest code and deal with any potential merge conflicts. Here's how to do it in my rebase-based workflow:

git fetch
git rebase origin/master
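
To see what these two commands do without touching a real repository, here is a sketch that builds a sandbox: a bare repository standing in for GitHub, a working repository, and a second clone playing a teammate who lands a commit first. All directory names, branch names other than master, and commit messages are assumptions made for the example:

```shell
set -eu
top=$(mktemp -d)
cd "$top"

# A bare repository stands in for the shared GitHub repo.
git init -q --bare origin.git

# My working repository, with one base commit published as master.
git init -q mine
cd mine
git config user.name "Example"
git config user.email "example@example.invalid"
git symbolic-ref HEAD refs/heads/my-work
git commit -q --allow-empty -m "base commit"
git remote add origin ../origin.git
git push -q origin HEAD:refs/heads/master
git fetch -q

# My commit that is out for review.
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "my change"

# Meanwhile a teammate lands a commit on master from a separate clone.
git clone -q -b master ../origin.git ../teammate
git -C ../teammate config user.name "Teammate"
git -C ../teammate config user.email "teammate@example.invalid"
git -C ../teammate commit -q --allow-empty -m "teammate change"
git -C ../teammate push -q origin HEAD:master

# The two steps from the text: refresh origin/master, then replay my commit on top.
git fetch -q
git rebase -q origin/master
git log --format=%s
```

After the rebase, origin/master is a direct ancestor of HEAD, so the later push is a plain fast-forward.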

You then need to tell arc to update your commit message from the Phabricator review:

arc amend

As usual, this operates on your current (HEAD) commit and works regardless of your branch state.

The next step is to push the commit, but the details depend on the repository you're pushing to.

Landing in non-webapp repositories

This line pushes your current position (and everything between your position and origin/master) to the master branch in GitHub:

git push origin HEAD:master
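
The following sandbox sketch (a local bare repository standing in for GitHub, with made-up commit messages) shows that the push carries every commit between origin/master and HEAD, and that it works from a detached HEAD because the source side of the refspec is a commit rather than a branch name:

```shell
set -eu
top=$(mktemp -d)
cd "$top"
git init -q --bare origin.git
git init -q mine
cd mine
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "base commit"
git remote add origin ../origin.git
git push -q origin HEAD:refs/heads/master

# Two local commits stacked on top of what the remote has.
git commit -q --allow-empty -m "first change"
git commit -q --allow-empty -m "second change"

# Detach HEAD to show that no local branch name is involved in the push.
git checkout -q --detach
git push -q origin HEAD:master

# Inspect the remote directly: master now points at HEAD and has all three commits.
remote_master=$(git --git-dir=../origin.git rev-parse refs/heads/master)
remote_count=$(git --git-dir=../origin.git rev-list --count refs/heads/master)
echo "$remote_master ($remote_count commits)"
```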

Landing in webapp

To instruct the deploy system to deploy your work, you need to create some branch in the GitHub repo with your work, then instruct the deploy system to deploy that branch. First, as soon as it's my turn to deploy, I rebase again onto origin/master so that the deploy system has nothing to merge. Then, I push a new branch using a line like this:

git push origin HEAD:refs/heads/alan-deploy-aug6

As before, origin means that I'm pushing to the GitHub repo, and HEAD means that my current branch position is what I'm pushing.

Because the remote branch generally doesn't exist yet, I can't just say git push origin HEAD:alan-deploy-aug6: Git doesn't know whether alan-deploy-aug6 refers to a branch or something else (e.g. a tag). The incantation refs/heads/alan-deploy-aug6 means "the branch alan-deploy-aug6".
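
Here is a sandbox sketch of the difference between the short and the fully qualified destination, using a local bare repository and hypothetical branch names. It runs from a detached HEAD, where I would expect Git to reject the short name because it has nothing to infer the ref type from. When HEAD is attached to a local branch, Git can usually guess refs/heads/ on its own, so treat the first push as an experiment rather than a guarantee:

```shell
set -eu
top=$(mktemp -d)
cd "$top"
git init -q --bare origin.git
git init -q mine
cd mine
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "work to deploy"
git remote add origin ../origin.git
git checkout -q --detach

# Short destination name: record whether this Git version accepts it.
if git push -q origin HEAD:example-deploy-short 2>/dev/null; then
  short_result=accepted
else
  short_result=rejected
fi
echo "short name: $short_result"

# Fully qualified destination: unambiguous, creates the branch on the remote.
git push -q origin HEAD:refs/heads/example-deploy-full
pushed=$(git --git-dir=../origin.git rev-parse refs/heads/example-deploy-full)
echo "full name points at: $pushed"
```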

I create a new branch each time instead of re-using the same branch name mostly as a matter of discipline. I'm not conceptually updating anything in the GitHub repo; I'm just adding new commits that I want to be able to point to. I eventually delete old branches.

Then I say something like "sun, deploy alan-deploy-aug6" and the deploy begins.

Managing multiple commits

This is where the workflow starts to become really useful. This workflow makes it easy to break commits down into smaller pieces without much additional friction, and there have been several times where I have had branches with more than 10 dependent commits in various stages of code review at once. To maintain a sequence of commits that are all out for code review, you can simply create multiple commits on top of each other within the work of a single branch. You can then use all of the features of interactive rebase: you can add new commits anywhere in the stack, you can reorder commits, you can drop commits, you can combine commits, and you can update intermediate commits (e.g. to address review comments).

As an example, here's how to update a commit in response to a review comment:

git rebase -i origin/master
[go to the commit you want to jump to and change "pick" to "edit"]
[it moves to that commit]
[make changes to your code]
git commit -a --amend
arc diff HEAD~1
git rebase --continue
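
The same sequence can be tried non-interactively in a sandbox. This sketch is an illustration, not part of the workflow itself: it fakes origin/master with a local ref, uses GIT_SEQUENCE_EDITOR with sed to flip the first "pick" to "edit" in place of hand-editing the todo list, and leaves out arc diff HEAD~1 because arc is not available there. File names and commit messages are made up:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "base commit"

# Stand-in for the remote-tracking branch, so "origin/master" resolves locally.
git update-ref refs/remotes/origin/master HEAD

echo "one" > a.txt
git add a.txt
git commit -q -m "first review"
echo "two" > b.txt
git add b.txt
git commit -q -m "second review"

# Change the first todo line from "pick" to "edit"; the rebase stops there.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/edit/'" git rebase -i origin/master

# Address the review comment and fold it into the stopped-at commit.
echo "one, revised" > a.txt
git commit -q -a --amend --no-edit
# (arc diff HEAD~1 would go here in the real workflow)

# Replay the rest of the stack on top of the amended commit.
GIT_EDITOR=true git rebase --continue
git log --format=%s origin/master..HEAD
```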

A few notes when using this:

When landing a stack of commits, you need to use interactive rebase to run arc amend on each one individually. However, git push always operates on the range of commits up to your current HEAD.
Similarly, there's no way to do a bulk update of all of your commits in code review aside from doing interactive rebase and running arc diff HEAD~1 at each point.
Because git commits are immutable, you must use interactive rebase (or something equivalent) to edit earlier commits, since you need to update the entire stack of commits. For example, it's incorrect to use git checkout to move to an earlier commit and to try to just run git commit --amend to update it; the amended commit is a new commit that your main branch doesn't contain, so when you move back to your main branch, the work you just did is effectively lost (it is reachable only through the reflog).
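
The last pitfall is easy to reproduce in a sandbox. This sketch (throwaway repository, hypothetical branch name work and file names) checks out an earlier commit, amends it, and returns to the branch. The amended commit exists, but the branch still points at the old stack:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"
git symbolic-ref HEAD refs/heads/work
git commit -q --allow-empty -m "base commit"
echo "one" > a.txt
git add a.txt
git commit -q -m "first review"
echo "two" > b.txt
git add b.txt
git commit -q -m "second review"

# The incorrect approach: jump to the earlier commit and amend it there.
git checkout -q --detach HEAD~1
echo "one, revised" > a.txt
git commit -q -a --amend --no-edit
orphan=$(git rev-parse HEAD)

# Back on the branch, the amended commit is not part of its history.
git checkout -q work
if git merge-base --is-ancestor "$orphan" work; then
  on_branch=yes
else
  on_branch=no
fi
echo "amended commit on branch: $on_branch"
git show work:a.txt
```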

More opinions from Alan

These are significantly more controversial, but I have some opinions on how to use and think about Git that somewhat motivate this workflow. You may find some of the arguments compelling as well.

I prefer rebase and avoid merge in most situations. I find linear history significantly easier to read and understand, and when I deploy work, I prefer commits to tell a relatively polished story of how the code steps through a sequence of intermediate good states, rather than trying to record the intermediate mistakes that I made along the way.
In most (but not all) situations, I do all of my work on a single named branch, even if I have commits that are unrelated to each other. Having a single branch makes it easier to keep up to date with origin/master and to keep track of everything I have out for code review, and it makes it so I'm implicitly testing all of my work-in-progress changes whenever I do any testing. If I want to land some commits before others, I can use interactive rebase to move those commits earlier in the ordering and just push those.
I don't use upstreams. I have been annoyed by misconfigured upstreams significantly more than I have benefitted from properly-configured upstreams. I am almost never in a situation where I can't remember the name of the branch to rebase over or the branch to push to, so I view them as a minor convenience.
I generally don't have local and remote branches with the same name. Rather than using long-lived remote branches, I prefer a continuous integration style where any intermediate step could land on the master branch, and anything that isn't ready to show to users is behind a flag in the code.
In particular, I don't have a local master branch. Since I always develop on a named branch (which is required when working in webapp and recommended in other repositories), I don't find it useful to distinguish between a local state of master and the state of master in the GitHub repo. I've also dealt with and seen a lot of confusion from master and origin/master getting out of sync.
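
The text above mentions pushing only the earlier commits of a single-branch stack without spelling out a command. One way to do that, offered as an assumption rather than as the author's exact method, is to name an earlier commit as the source of the push. This sandbox sketch uses a local bare repository and made-up commit messages, and pushes HEAD~1 so the top commit stays local:

```shell
set -eu
top=$(mktemp -d)
cd "$top"
git init -q --bare origin.git
git init -q mine
cd mine
git config user.name "Example"
git config user.email "example@example.invalid"
git commit -q --allow-empty -m "base commit"
git remote add origin ../origin.git
git push -q origin HEAD:refs/heads/master

# A stack of two commits; only the first is ready to land.
git commit -q --allow-empty -m "ready to land"
git commit -q --allow-empty -m "still in review"

# Push everything up to the parent of HEAD, leaving the top commit local.
git push -q origin HEAD~1:master

landed=$(git --git-dir=../origin.git rev-parse refs/heads/master)
pending=$(git rev-list --count "$landed"..HEAD)
echo "landed: $(git log -1 --format=%s "$landed"), pending: $pending"
```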

Have any questions or comments? See anything wrong? Let me know in the discussion area below!
