Re: Making type Datum be 8 bytes everywhere

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Joe Conway <mail(at)joeconway(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Making type Datum be 8 bytes everywhere
Date: 2025-08-09 18:54:11
Message-ID: e7129a29-c10e-4906-8a53-ff07d8868298@dunslane.net
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 2025-08-09 Sa 7:45 AM, Joe Conway wrote:
> On 8/8/25 21:14, Tom Lane wrote:
>> I have just realized that this proposal has a rather nasty defect.
>> Per the following comment in spgist_private.h:
>>
>>   * If the prefix datum is of a pass-by-value type, it is stored in its
>>   * Datum representation, that is its on-disk representation is of
>> length
>>   * sizeof(Datum).  This is a fairly unfortunate choice, because in
>> no other
>>   * place does Postgres use Datum as an on-disk representation; it
>> creates
>>   * an unnecessary incompatibility between 32-bit and 64-bit builds. 
>> But the
>>   * compatibility loss is mostly theoretical since MAXIMUM_ALIGNOF
>> typically
>>   * differs between such builds, too.  Anyway we're stuck with it now.
>>
>> This means we cannot change sizeof(Datum), nor reconsider the
>> pass-by-value classification of any datatype, without potentially
>> breaking pg_upgrade of some SP-GiST indexes on 32-bit machines.
>>
>> Now, it looks like this doesn't affect any in-core SP-GiST opclasses.
>> The only one using a potentially affected type is kd_point_ops which
>> uses a float8 prefix.  That'll have been stored in regular on-disk
>> format on a 32-bit machine, but if we redefine it as being stored
>> in 64-bit-Datum format, nothing actually changes.  The case that
>> would be problematic is a prefix type that's 4 bytes or less, and
>> I don't see any.
>>
>> A quick search of Debian Code Search doesn't find any extensions
>> that look like they are using small pass-by-value prefixes either.
>> So maybe we can get away with just changing this, but it's worrisome.
>>
>> On the positive side, even if there are any SP-GiST opclasses that
>> are at risk, the population of installations using them on 32-bit
>> installs has got to be pretty tiny.
>
> I bet it is indistinguishable from zero...
>
>> And the worst-case answer is that you'd have to reindex such indexes
>> after pg_upgrade.
>
> ...and this seems like a reasonable answer if anyone pops up.
>
>> BTW, I don't think we can teach pg_upgrade to check for this
>> hazard, because the SP-GiST APIs are such that the data type
>> used for prefixes isn't visible at the SQL level.
>>
>> Do we think that making this change is valuable enough to justify
>> taking such a risk?
>
> yes +1
>
>

Agree to all the above

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kirill Reshke 2025-08-09 19:54:58 Re: VM corruption on standby
Previous Message Peter Eisentraut 2025-08-09 17:18:39 plpython: Remove support for major version conflict detection