Re: make dist using git archive

From: "Tristan Partin" <tristan(at)neon(dot)tech>
To: "Peter Eisentraut" <peter(at)eisentraut(dot)org>
Cc: "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: make dist using git archive
Date: 2024-01-24 17:57:08
Message-ID: CYN4PXK5J7KP.247K22MR0MM1G@neon.tech
Lists: pgsql-hackers

On Wed Jan 24, 2024 at 10:18 AM CST, Tristan Partin wrote:
> On Tue Jan 23, 2024 at 3:30 AM CST, Peter Eisentraut wrote:
> > On 22.01.24 21:04, Tristan Partin wrote:
> > > I am not really following why we can't use the builtin Meson dist
> > > command. The only difference from my testing is it doesn't use a
> > > --prefix argument.
> >
> > Here are some problems I have identified:
> >
> > 1. meson dist internally runs gzip without the -n option. That makes
> > the tar.gz archive include a timestamp, which in turn makes it not
> > reproducible.

It doesn't look like Python provides the facilities to affect this.

> > 2. Because gzip includes a platform indicator in the archive, the
> > produced tar.gz archive is not reproducible across platforms. (I don't
> > know if gzip has an option to avoid that. git archive uses an internal
> > gzip implementation that handles this.)

Same reason as above.
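
For reference, points 1 and 2 both come down to two fields in the fixed
10-byte gzip header defined by RFC 1952: a 4-byte MTIME and a 1-byte OS
code. The sketch below only reads those fields back. The helper name
parse_gzip_header and the hand-built sample headers are made up for
illustration and are not taken from Meson or git.

```python
import struct

# Subset of the OS codes listed in RFC 1952.
GZIP_OS_NAMES = {0: "FAT", 3: "Unix", 7: "Macintosh", 11: "NTFS", 255: "unknown"}


def parse_gzip_header(data: bytes) -> dict:
    """Return the MTIME and OS fields from the fixed 10-byte gzip header."""
    if len(data) < 10:
        raise ValueError("a gzip header is at least 10 bytes long")
    # Layout: magic (2), method (1), flags (1), mtime (4, little-endian),
    # extra flags (1), OS code (1).
    magic, _method, _flags, mtime, _xfl, os_code = struct.unpack(
        "<2sBBIBB", data[:10]
    )
    if magic != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    return {"mtime": mtime, "os": GZIP_OS_NAMES.get(os_code, f"code {os_code}")}


# Hand-built sample headers (illustrative values only). The compressed
# payload that would follow is identical; only the header bytes differ.
with_timestamp = b"\x1f\x8b\x08\x00" + struct.pack("<I", 1706119028) + b"\x00\x03"
without_timestamp = b"\x1f\x8b\x08\x00" + struct.pack("<I", 0) + b"\x00\x03"

print(parse_gzip_header(with_timestamp))
print(parse_gzip_header(without_timestamp))
```

Two archives of the same tree are byte-identical only if both fields
match, which is why a non-zero MTIME or a platform-specific OS code is
enough to break reproducibility.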

> > 3. Meson does not support tar.bz2 archives.

Submitted https://github.com/mesonbuild/meson/pull/12770.

> > 4. Meson uses git archive internally, but then unpacks and repacks the
> > archive, which loses the ability to use git get-tar-commit-id.

Because Meson allows projects to distribute arbitrary files via
meson.add_dist_script(), and can include subprojects via `meson dist
--include-subprojects`, this doesn't seem like an easily solvable
problem.

> > 5. I have found that the tar archives created by meson and git archive
> > include the files in different orders. I suspect that the Python
> > tarfile module introduces either some randomness or a platform dependency.

Seems likely.
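
As a rough illustration of what a deterministic member order would look
like with the tarfile module, here is a minimal sketch. It assumes the
file contents are already in memory, and the function name
build_sorted_tar is hypothetical. It is not how Meson builds its
archives.

```python
import io
import tarfile


def build_sorted_tar(files: dict[str, bytes]) -> bytes:
    """Write the given files into a tar archive in sorted name order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            # Pin every piece of metadata that could vary between machines.
            info.mtime = 0
            info.mode = 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


# The same files supplied in a different order give the same bytes.
first = build_sorted_tar({"b.txt": b"two", "a.txt": b"one"})
second = build_sorted_tar({"a.txt": b"one", "b.txt": b"two"})
print(first == second)
```

Sorting the names and pinning the metadata removes the two sources of
variation suspected in point 5. It would still not match the order git
archive uses unless git happens to sort the same way.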

> > 6. meson dist is also slower because of the additional work.

Not easily solvable due to 4.

> > 7. meson dist produces .sha256sum files but we have called them .sha256.
> > (This is obviously trivial, but it is something that would need to be
> > dealt with somehow nonetheless.)
> >
> > Most or all of these issues are fixable, either upstream in Meson or by
> > adjusting our own requirements. But for now this route would have some
> > significant disadvantages.
>
> Thanks Peter. I will bring these up with upstream!

I think the solution to point 4 is to skip the unpack/repack step when
there are no dist scripts or subprojects to distribute. I can take a
look at this later. That would also solve points 1, 2, 5, and 6, because
at that point Meson is just calling git archive.
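
A minimal sketch of that idea, assuming the only inputs are whether dist
scripts exist and whether subprojects are included. The function
plan_dist and its parameters are hypothetical and do not correspond to
anything in Meson's source.

```python
from typing import List, Optional


def plan_dist(
    prefix: str,
    output: str,
    has_dist_scripts: bool,
    include_subprojects: bool,
    ref: str = "HEAD",
) -> Optional[List[str]]:
    """Return a git archive command to run as-is, or None if a repack is needed."""
    if has_dist_scripts or include_subprojects:
        # Extra files have to be added, so the archive must be unpacked
        # and repacked (the existing slow path).
        return None
    # Nothing to add: let git write the final archive directly.
    return [
        "git", "archive",
        "--format=tar.gz",
        f"--prefix={prefix}/",
        "-o", output,
        ref,
    ]


print(plan_dist("postgresql-17devel", "postgresql-17devel.tar.gz", False, False))
print(plan_dist("postgresql-17devel", "postgresql-17devel.tar.gz", True, False))
```

On the direct path the archive is exactly what git wrote, so the commit
ID that git archive stores in the pax header for a commit or tag should
remain readable with git get-tar-commit-id after decompression.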

--
Tristan Partin
Neon (https://neon.tech)
