Bzg/woof: online monitor for mailing list based patch workflow

lima · on April 17, 2021

In somewhat related news, Gerrit has a new frontend and it's awesome! Google has a couple of UX people working on it now and it shows.

As a big fan of patch-based code review, I still can't stand GitHub PRs. It's very annoying to stack GitHub PRs, so the default is to create large PRs instead. This is sad, because smaller units of review result in better code! Gerrit solves this (and much more).

Example change on Gerrit (click on "Comments"):

https://gerrit-review.googlesource.com/c/gerrit/+/303182

Reasons to use Gerrit in spite of the learning curve:

- Ability to review commit messages (they're part of the codebase!).

- Show diffs between different versions of a change, even across rebases, with inline comments.

- Inline comments aren't lost on rebases and will be ported.

- Changes can be easily stacked.

- Select code snippets for inline comments, not just lines.

- It's really fast and there's keyboard shortcuts for everything.

- Rebase and cherry pick to release branches from the UI.

Some projects like Go, Cue or SQLAlchemy use custom tooling or Copybara[1] to mirror PRs to Gerrit, which seems like the best of both worlds - simple PR workflow for drive-by contributions, patch workflow for core contributors who work with it all day.

[1]: https://github.com/google/copybara

kettro · on April 18, 2021

I’ve used Gerrit at 3 different places, now including at Google. It is, by far, the best git experience.

The patch workflow is smooth and just a great experience. It is vastly improved by some git aliases (which you can set up in your gitconfig), but is, in general, very smooth. It allows for local branches to a single upstream, topic tagging, and a reasonable learning curve. Much better than GH.

It would be very well served to have a web-based offering (like GitHub/Lab).

da39a3ee · on April 18, 2021

> The patch workflow is smooth a

Can you explain what a "patch workflow" is and in what sense Github is not a "patch workflow"?

There's some confusion since some people associate "patch workflow" with an email-based workflow.

lima · on April 18, 2021

A patch/diff workflow operates at the level of individual diffs, rather than a branch identified by base and head refs like with pull requests.

Email is one (very clunky) way of implementing such a workflow, there's other tools like Gerrit, Reviewable or Phabricator that are more friendly.

da39a3ee · on April 18, 2021

Thanks. Can you explain why that might be a good thing?

If I update the diff, does it become a new reviewable item, independent from the first version?

I've never worked at a large tech company but my understanding is that at e.g. google there is some sort of monorepo, so it would presumably zoom forwards while a commit is under review. Even so, why is it not a good thing to review the commit in an explicit location in the historical commit graph?

lima · on April 18, 2021

Patches would still have a base, it's really more about about the data model. It's possible to synthesize a GitHub PR from a Gerrit CL and vice-versa.

In fact, Gerrit is entirely based on Git and you can simply clone any given CL.

da39a3ee · on April 18, 2021

It’s actually very convenient now to stack GitHub PRs; the target branch detector makes it easy to arrange a cascade, and the target branches update automatically as the earlier ones merge.

Can you explain what a patch based workflow means? I’ve seen patches in email. But to be honest I really only know Github.

What I would say in my apparent ignorance is that a patch doesn’t exist in isolation; it has, or should have, a parent — a certain code base state explicitly declared as its context. Otherwise the entire activity of review is not well-defined, since no-one has specified what the state of the code base is, and therefore the behavior of the software is unknowable.

bonzini · on April 18, 2021

Patches are sent as emails with a format like

    Commit message
    --
    Diffstat
    
    Diff

And reviews happen simply by replying to the email. If the maintainer is satisfied the email can be piped in a program that applies the patch.

Most programs and people that use email for patches have pretty sophisticated tooling built around it. See for example a series of four patches (i.e. a pull requests consisting of four commits) in Patchew: https://patchew.org/QEMU/[email protected]...

Notably, Patchew can provide a diff-of-diffs view that isn't there in either GitHub or Gitlab: https://patchew.org/QEMU/[email protected]...

(Patchew was started by a colleague of mine at Red Hat).

da39a3ee · on April 18, 2021

Using email to review patches is completely absurd. The following is a contentious statement but there's no avoiding it: it's done nowadays by people who started doing it in the 80s and 90s and are being stubborn. Here are some reasons email should not be used for review, if any are really needed:

- A "patch" must have a defined parent. Otherwise it is not defined what state of the overall codebase is being reviewed.

- Proposed changes evolve through complex lifecycles of comments/discussions and updates. Some comments are global and some should be attached to specific lines of code as it exists in a specific revision. This complexity is (obviously!) better modeled by a dedicated UI than by shoe-horning it into plain text email.

- It makes sense to attach the dynamic status of test runs to these updates. This can be done in a web UI but cannot be done in an email.

- Commenting on individual lines of code is desirable and (obviously!) better done by attaching a discussion thread to a defined (revision + code coordinates) location in a diff view than by replying inline in an email. The reasons (if they really need stating) include (1) that breaking a diff manually in an email and replying inline makes it hard to assemble the diff in your head, (2) it is not possible to have clearly readable discussion threads involving multiple people with each thread attached to a certain line in the diff, (3) it's certainly not possible to keep that discussion coherent when the underlying patch is evolving.

- Not everyone uses, or wishes to use, plain text emails.

- Not everyone uses or wishes to use a text-based email client that runs on their local machine. Nor do they have, or wish to have, their emails present on their local machine (they want their emails stored online and accessed via a web interface).

(And yes, I am someone who has used mutt, and gnus, and dovecot, and offlineimap.)

- The history/evolution of a codebase can be represented by a graph structure. It makes sense for review discussion to be associated with this graph structure. This is what Bitbucket/Github/Gitlab provide. Email does not provide this. There's a risk with email that it won't even be publicly searchable, but supposing it exists, a web-hosted archive of emails is not an effective way to comprehend the history of a codebase.

As a cultural observation, the people who are still insisting on email are often the sorts of people who insist that email should be plain text and actually look down upon technical people who like web-hosted email clients and HTML emails, and they are sometimes the sorts of people who deride Github as being nonserious -- full of "kids" who carelessly propose ill-thought out change sets and expect them to be incorporated into public projects. Neither of those positions are reasonable or speak highly of the holder.

lima · on April 18, 2021

Note that those points are particular disadvantages of the email workflow - all of these things are solved in modern tools like Gerrit.

da39a3ee · on April 18, 2021

If you don't know what the parent is in Gerrit then how can you run CI on the patch?

And if you do know what the parent is then how is it different from a "PR"?

lima · on April 18, 2021

A PR is roughly the same as a stack of Gerrit CLs.

The difference is that Gerrit tracks the evolution of, and reviews, each individual commit.

With a PR, you lose history when rebasing and all commits have to be approved as a whole.

da39a3ee · on April 18, 2021

> The difference is that Gerrit tracks the evolution of, and reviews, each individual commit.

What does it mean for an individual commit to evolve? I think of commits as being the atoms of the evolutionary process; i.e. evolution is a sequence of commits.

> With a PR, you lose history when rebasing

I don't quite understand that point. (I know what rebasing is.)

> and all commits have to be approved as a whole.

Hm, so why would one want to approve an individual "commit"? In my world a commit should be a meaningful change but isn't necessarily self-contained and hence not approvable. This sounds like what you're calling commits are in fact independent / self-contained and deployable units of change. So then

> A PR is roughly the same as a stack of Gerrit CLs.

Does a "CL" contain one or more than one of the entities you refer to as "commits"above?

Isn't it the other way around? A Gerrit thing is a stack of mini "PR"s, where each "PR" is a small, but independent / self-contained and deployable, unit of work. In any case, a PR contains many non-self-contained units (commits) so I'm having a hard time equating a PR and a stack of CLs.

lima · on April 18, 2021

> It’s actually very convenient now to stack GitHub PRs

More convenient than before, but nowhere near as convenient as keeping everything on a single branch so that it can be rebased in one go.

throwaway3699 · on April 18, 2021

Oh wow. Gerrit looks almost completely like Critique (the internal code review tool at Google) now. I really enjoy the user interface and whole flow of Gerrit.

davvid · on April 17, 2021

The Git project was recently discussing[1] pain points in their mailing list patch workflow. It's worth a read to get a feel for issues projects encounter.

Another related project that's worth mentioning is GitGitGadget[2][3], a tool that bridges the divide between Github PRs and Git's mailing list patch workflow.

[1] https://public-inbox.org/git/[email protected]/T/#...

[2] https://gitgitgadget.github.io/

[3] https://github.com/gitgitgadget/gitgitgadget

brian_herman · on April 17, 2021

Cool any projects use this? What advantages does it have over git or fossil?

fyskij · on April 18, 2021

org-mode[0], which is maintained by the same author, uses Woof.

[0]:https://updates.orgmode.org/