Re: Making type Datum be 8 bytes everywhere

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Making type Datum be 8 bytes everywhere
Date: 2025-07-30 16:06:31
Message-ID: 1520348.1753891591@sss.pgh.pa.us
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

[ getting back to looking at this ]

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2025-07-17 20:09:57 -0400, Tom Lane wrote:
>> The attached patch switches to 8-byte Datums everywhere, but
>> doesn't make any effort to remove the now-dead code.

> Thanks for writing that!
> Interestingly it generates a *lot* of warnings here when building for 32 bit
> with gcc.

Yeah, I can reproduce that if I use gcc. Interesting that casting
between pointer and different-size integer is a default warning on gcc
but not clang. The major stumbling block to cleaning that up seems
to be what you noted:

> A fourth class is passing a Datum to VAR* macros. They're a bit too verbose to
> paste here, but it's just a variation of the above. I'm not really sure what
> our intended use of them is, do we intend to pass pointers or datums to the
> macros? I suspect we'd have to move the casts into the varlena macros,
> otherwise we'll have to add DatumGetPointer() uses all over.

Right, we have for a long time not worried about whether VARDATA and
the allied macros are being fed a pointer or a Datum. I recall that
somebody tried to make those macros into static inlines awhile back,
and failed because of the lack of clarity about how to declare their
arguments. I think the way forward here is to tackle that head-on
and split the top-level macros into two static inlines, along the
lines of
VARDATA(Pointer ptr)
and
VARDATA_D(Datum dat)
where the _D versions are simply DatumGetPointer and then call the
non-D versions.

I'm giving the traditional names to the Pointer variants because it
turns out that way more places would have to change if we do it the
other way: in a rough count, about 50 versus about 1700. (This is
counting only the core backend.) Beyond that, though, bikeshedding
on the naming is welcome.

> One of these days I should again try the experiment of making Datum into a
> struct, to automatically catch omissions of datum <-> native type.

It looks like the main remaining thing we'd need to change for that
to be possible is to replace "(Datum) 0" with some macro, say
"INVALID_DATUM". I don't especially want to do that though: there
are upwards of 600 occurrences in our tree, so the consequences for
back-patch hazards seem to me to outweigh the benefit.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2025-07-30 16:13:06 Re: Support getrandom() for pg_strong_random() source
Previous Message Nathan Bossart 2025-07-30 15:52:03 Re: teach pg_upgrade to handle in-place tablespaces