Re: run pgindent on a regular basis / scripted manner

From: Noah Misch <noah(at)leadboat(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Jelte Fennema <postgres(at)jeltef(dot)nl>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Jesse Zhang <sbjesse(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: run pgindent on a regular basis / scripted manner
Date: 2023-04-22 19:59:06
Message-ID: 20230422195906.GB1708514@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(Given that another commentator is "absolutely against" a hook, this message
is mostly for readers considering this for other projects.)

On Sat, Apr 22, 2023 at 03:23:59PM +0200, Magnus Hagander wrote:
> On Tue, Feb 7, 2023 at 5:43 AM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:
> > > Also, pgindent takes tens of seconds to run, so hooking that into the git
> > > push process would slow this down quite a bit.
> >
> > The pre-receive hook would do a full pgindent when you change typedefs.list.
> > Otherwise, it would reindent only the files being changed. The average push
> > need not take tens of seconds.
>
> It would probably ont be tens of seconds, but it would be slow. It
> would need to do a clean git checkout into an isolated environment and
> spawn in there, and just that takes time.

That would be slow, but I wouldn't do it that way. I'd make "pg_bsd_ident
--pre-receive --show-diff" that, instead of reading from the filesystem, gets
the bytes to check from the equivalent of this Perl-like pseudocode:

while (<>) {
my($old_hash, $new_hash, $ref) = split;
foreach my $filename (split /\n/, `git diff --name-only $old_hash..$new_hash`) {
$file_content = `git show $new_hash $filename`;
}
}

I just ran pgindent on the file name lists of the last 1000 commits, and
runtime was less than 0.5s for each of 998/1000 commits. There's more a real
implementation might handle:

- pg_bsd_indent changes
- typedefs.list changes
- skip if the break-glass "pgindent: no" appears in a commit message
- commits changing so many files that a clean "git checkout" would be faster

> And it would have to also
> know to rebuild pg_bsd_indent on demand, which would require a full
> ./configure run (or meson equivalent). etc.
>
> So while it might not be tens of seconds, it most definitely won't be fast.

A project more concerned about elapsed time than detecting all defects might
even choose to take no synchronous action for pg_bsd_indent and typedefs.list
changes. When a commit changes either of those, the probability that the
committer already ran pgindent rises substantially.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-04-22 20:15:23 Re: run pgindent on a regular basis / scripted manner
Previous Message Tom Lane 2023-04-22 19:58:13 Re: run pgindent on a regular basis / scripted manner