Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, "Burd, Greg" <greg(at)burd(dot)me>, Nikita Malakhov <hukutoc(at)gmail(dot)com>, Hannu Krosing <hannuk(at)google(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)
Date: 2025-08-08 21:02:46
Message-ID: 7gsu2pkoecyvivilvwbogn7zszexxgl76fkzhw5moznko7czbw@zs6tmbre3ote
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2025-08-08 15:32:16 -0400, Tom Lane wrote:
> Michael Paquier <michael(at)paquier(dot)xyz> writes:
> > Attached is a v4, due to conflicts mainly caused by the recent changes
> > in varatt.h done by e035863c9a04.
>
> I found some time to look at the v4 patchset, and have a bunch of
> comments of different sizes.
>
> 0001:
>
> I'm good with widening all these values to 64 bits, but I wonder
> if it's a great idea to use unadorned "uint64" as the data type.
> That's impossible to grep for, doesn't convey anything much about
> what the variables are, etc. I'm tempted to propose instead
> inventing a typedef "BigOid" or some such name (bikeshedding
> welcome). The elog's could be handled with, say,
> #define PRIBO PRIu64
> This suggestion isn't made with the idea that we'd someday switch
> to an even wider type, but just with the idea of making it clearer
> what these values are being used for. When you see "Oid" you
> know it's some sort of object identifier, and I'm sad to give
> that up here.

I think we should consider introducing Oid64, instead of having a toast
specific type. I doubt this is the last place that we'll want to use 64 bit
wide types for cataloged entities.

> 0004:
>
> Shouldn't VARATT_EXTERNAL_GET_POINTER go away entirely?
> It looks to me like every use of that should be replaced by
> toast_external_info_get_data().
>
> I wonder if we shouldn't try to get rid of the phraseology "standard
> TOAST pointer", and instead write something like "short TOAST pointer"
> or "small TOAST pointer". These aren't really going to be more
> "standard" than the wider ones, IMO.
>
> I don't like replacing "va_valueid" with just "value". Dropping
> the "id" is not an improvement, because now a reader might be
> confused about whether this is somehow the actual value of the
> toasted datum.
>
> Functions in toast_external.c are under-documented. Some lack
> any header comment whatever, and others have one that doesn't
> actually say what they do.
>
> I kind of wonder whether the run-time indirection this design causes
> is going to be a performance problem. Perhaps not, given the expenses
> involved in accessing a toasted value, but it has a faint feeling
> of overdesign.

+1

> 0008:
>
> I totally hate the idea of introducing a GUC for this. This should be
> user-transparent. Or if it isn't user-transparent, a GUC is still
> the wrong thing; some kind of relation option would be more sensible.

Agreed. I think we need backward compatibility for pg_upgrade purposes, but
that doesn't require a GUC, it just requires an option when creating tables in
binary upgrade mode.

> 0009:
>
> I do not love this either. I think the right thing is just to widen
> the existing nextOid counter to 64 bits, and increment it in-memory as
> 64 bits, and then return either all 64 bits when a 64-bit Oid is asked
> for, or just the low-order 32 bits when a 32-bit OID is asked for
> (with appropriate hacking for 32-bit wraparound). Having a separate
> counter will make it too easy for the same value to be produced by
> nearby calls to the 32-bit and 64-bit Oid generators, which is likely
> a bad thing, if only because of potential confusion.

I'm not convinced that the global counter, be it a 32 or a 64 bit, approach
has merit in general, and I'm rather sure it's the wrong thing for toast
values. There's no straightforward path to move away from the global counter
for plain oids, but I would suggest simply not using the global oid counter
for toast IDs.

A large portion of the cases I've seen where toast ID assignments were a
problem were when the global OID counter wrapped around due to activity on
*other* tables (and/or temporary table creation). If you instead had a
per-toast-table sequence for assigning chunk IDs, that problem would largely
vanish.

With 64bit toast IDs we shouldn't need to search the index for a
non-conflicting toast IDs, there can't be wraparounds (we'd hit wraparound of
LSNs well before that and that's not practically reachable).

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-08-08 21:12:58 Re: Custom pgstat support performance regression for simple queries
Previous Message Tom Lane 2025-08-08 20:55:01 Re: Datum as struct