Re: make dist using git archive

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Eli Schwartz <eschwartz93(at)gmail(dot)com>, Tristan Partin <tristan(at)neon(dot)tech>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: make dist using git archive
Date: 2024-01-31 08:03:55
Message-ID: ea89b229-c22a-4188-a619-c3bf1824078b@eisentraut.org
Lists: pgsql-hackers

On 26.01.24 22:18, Eli Schwartz wrote:
> Hello, meson developer here.

Hello, and thanks for popping in!

>> 3. Meson does not support tar.bz2 archives.
>
> Simple enough to add, but I'm a bit surprised as usually people seem to
> want either gzip for portability or xz for efficient compression.

We may very well end up updating our requirements here before too long,
so I wouldn't bother with this on the meson side. Last time we
discussed this, there were still platforms under support that didn't
have xz easily available.

>> 4. Meson uses git archive internally, but then unpacks and repacks the
>> archive, which loses the ability to use git get-tar-commit-id.
>
> What do you use this for? IMO a more robust way to track the commit used
> is to use gitattributes export-subst to write a `.git_archival.txt` file
> containing the commit sha1 and other info -- this can be read even after
> the file is extracted, which means it can also be used to bake the ID
> into the built binaries e.g. as part of --version output.

It's a marginal use case, for sure. But git provides tooling for it,
and that tooling is universally available. Any alternative would be an
ad-hoc solution that is specific to our project and would be different
for the next project.
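
For readers unfamiliar with the mechanism: git archive stores the commit
ID as a comment in a pax global extended header of the tar file, and
git get-tar-commit-id reads it back. The following is only a rough
Python sketch of the same lookup, not part of any proposed patch; the
function name read_commit_id is made up for illustration. An archive
that has been unpacked and repacked without that header yields None.

```python
import io
import tarfile


def read_commit_id(fileobj):
    """Return the pax global header comment of a tar archive, or None.

    For archives written by "git archive" from a commit, this comment
    holds the commit ID. read_commit_id is a hypothetical helper name.
    """
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        # tarfile collects global pax headers in TarFile.pax_headers
        # while it reads the first member of the archive.
        return tar.pax_headers.get("comment")


# Usage sketch (the file name is an assumption):
#   with open("postgresql.tar", "rb") as f:
#       print(read_commit_id(f))
```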

>> 5. I have found that the tar archives created by meson and git archive
>> include the files in different orders.  I suspect that the Python
>> tarfile module introduces either some randomness or a platform dependency.
>
> Different orders is meaningless, the question is whether the order is
> internally consistent. Python uses sorted() to guarantee a stable order,
> which may be a different algorithm than the one git-archive uses to
> guarantee a stable order. But the order should be stable and that is
> what matters.

(FWIW, I couldn't reproduce this anymore, so maybe it's not actually an
issue.)
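
To illustrate the point about a stable order: the sketch below builds a
tar archive from an in-memory mapping, adding members in sorted() order
and with fixed metadata, so the same input always gives the same bytes.
This is a simplified illustration under those assumptions, not what
meson or git archive actually do; build_tar is a made-up name.

```python
import io
import tarfile


def build_tar(files):
    """Build an uncompressed tar from {name: bytes} in a stable order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        # Sorting the names makes the member order independent of the
        # order in which the caller happened to collect the files.
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            # Pin metadata that would otherwise vary between runs.
            info.mtime = 0
            info.mode = 0o644
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()
```

Two tools can each be stable in this sense and still disagree with each
other if they sort by different rules.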

> Overall I feel like much of this is about requiring dist tarballs to be
> byte-identical to other dist tarballs, although reproducible builds is
> mainly about artifacts, not sources, and for sources it doesn't
> generally matter unless the sources are ephemeral and generated
> on-demand (in which case it is indeed very important to produce the same
> tarball each time).

The source tarball is, in a way, also an artifact.

I think it's useful that others can easily independently verify that the
produced tarball matches what they have locally. It's not an absolute
requirement, but given that it is possible, it seems useful to take
advantage of it.

In a way, this also avoids the need for signing the tarball, which we
don't do. So maybe that contributes to a different perspective.
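
The independent check described above could be as simple as comparing
digests of the published tarball and one produced locally from the same
commit. A minimal sketch, assuming both files are on disk; the helper
names and the choice of SHA-256 are illustrative, not project policy.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tarballs_match(published_path, local_path):
    """True if the two files are byte-identical (by SHA-256)."""
    return sha256_of_file(published_path) == sha256_of_file(local_path)
```

This only works if the tarball is byte-for-byte reproducible, which is
the property under discussion.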
