Re: Making type Datum be 8 bytes everywhere

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Making type Datum be 8 bytes everywhere
Date: 2025-07-23 18:03:06
Message-ID: dsxh2ug6bw773kdttf2yxgpedid4y6tmtt3hrho6wc3p24e2pj@cpwnxvqmift6
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2025-07-18 13:24:32 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2025-07-17 20:09:57 -0400, Tom Lane wrote:
> >> I made it just as a proof-of-concept that this can work. It compiled
> >> cleanly and passed check-world for me on a 32-bit FreeBSD image.
>
> > Interestingly it generates a *lot* of warnings here when building for 32 bit
> > with gcc.
>
> Oh, that's annoying. I tested it with
>
> $ cc --version
> FreeBSD clang version 13.0.0 (git(at)github(dot)com:llvm/llvm-project.git llvmorg-13.0.0-0-gd7b669b3a303)
> Target: i386-unknown-freebsd13.1
> Thread model: posix
> InstalledDir: /usr/bin
>
> which is slightly back-rev but not that old. Which gcc did you use?

That was gcc 14.

> > One class of complaints is about DatumGetPointer() and
> > PointerGetDatum() casting between different sizes:
>
> > ../../../../../home/andres/src/postgresql/src/include/postgres.h: In function 'DatumGetPointer':
> > ../../../../../home/andres/src/postgresql/src/include/postgres.h:320:16: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
> > 320 | return (Pointer) X;
> > | ^
> > ../../../../../home/andres/src/postgresql/src/include/postgres.h: In function 'PointerGetDatum':
> > ../../../../../home/andres/src/postgresql/src/include/postgres.h:330:16: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
> > 330 | return (Datum) X;
> > | ^
>
> We might be able to silence those with intermediate casts to uintptr_t,
> perhaps?

Yep, that does the trick.

> >> I've not looked into the performance consequences. We probably
> >> should at least try to measure that, though I'm not sure what
> >> our threshold of pain would be for deciding not to do this.
>
> > From my POV the threshold would have to be rather high for backend code. Less
> > so in libpq, but that's not affected here.
>
> I don't know if it's "rather high" or not, but that seems like
> the gating factor that ought to be checked before putting in
> more work.

The hard bit would be to determine what workload to measure. Something like
pgbench probably won't suffer meaningfully, there's just not enough passing of
values around.

For a bit I thought it'd need to be a workload that does a lot of int4 math or
such, but I doubt the overhead of it matters sufficiently there.

Then I realized that the biggest issue probably would be a query that does a
lot of tuple deforming of 4 byte values, while not actually accessing them?

Can you think of a worse workload than that?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2025-07-23 18:16:09 Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
Previous Message Álvaro Herrera 2025-07-23 17:59:52 Re: trivial grammar refactor