Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Nikita Malakhov <hukutoc(at)gmail(dot)com>
Cc: Hannu Krosing <hannuk(at)google(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem)
Date: 2025-07-05 01:11:32
Message-ID: aGh7xCegTRJCdQ9q@paquier.xyz
Lists: pgsql-hackers

On Fri, Jul 04, 2025 at 02:38:34PM +0300, Nikita Malakhov wrote:
> Hannu, we'd already made an attempt to extract the TOAST functionality as
> API and make it extensible and usable by other AMs in [1], the patch
> set was met calmly but we still have some hopes on it.

Yeah, it's one of those I have studied, and I found it
overcomplicated, preventing us from moving on with a simpler proposal,
because I care about two things first:
- More compression methods, with more meta-data, but let's just add
more vartag_external for that once/if they're really required.
- Enlarge optionally to 8-byte values.
So I really want to stress these two points, nothing else for
now, echoing the feedback from 2022 and the fact that all
proposals done after that lacked a simple approach.

IMO, *if* being able to plug in a TOAST engine makes sense at all, we
would live fine enough by limiting ourselves to an external
interface. We could allow backends to load their own vartag_external
with their own set of callbacks like the ones I am proposing here, so
that we can translate from/to a Datum in heap (or a different table AM)
to an external source, with the backend able to understand what this
external source should be. The key is to define a structure good
enough for the backend (toast_external_data in the patch). So to
answer your and Hannu's question: I had the case of different table
AMs in mind with an interface able to plug into it, yes. And yes, I
have designed the patch set with this in scope. Now there's also a
data type component to that, so that's assuming that a table AM would
want to rely on a varlena to store this data externally, somewhere
else that may not be exactly TOAST, still we want an OID and a value
to be able to retrieve this external value, and we want to store this
external OID and this value (+extra like a compression method and
sizes) in a Datum of the main relation file.
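
To make the translation idea above concrete, here is a minimal,
standalone sketch of a per-format callback table. It is not the code
from the patch set: every name prefixed with sketch_ is hypothetical,
the byte layouts are invented for illustration, and the real
toast_external_data may carry different fields. It only shows how one
format-independent structure can sit between the backend and several
on-disk pointer formats, including one with an 8-byte value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tags, standing in for vartag_external values. */
typedef enum sketch_vartag
{
    SKETCH_TAG_OID4 = 0,        /* pointer carrying a 4-byte value ID */
    SKETCH_TAG_INT8 = 1,        /* pointer carrying an 8-byte value ID */
    SKETCH_NUM_TAGS
} sketch_vartag;

/* Format-independent form the backend works with (illustrative fields). */
typedef struct sketch_external_data
{
    uint32_t    rawsize;        /* size of the original datum */
    uint32_t    extsize;        /* size as stored externally */
    uint32_t    compression;    /* compression method ID */
    uint32_t    toastrelid;     /* OID of the relation holding the value */
    uint64_t    value;          /* value ID, wide enough for both formats */
} sketch_external_data;

/* Per-format callbacks translating to and from the stored pointer bytes. */
typedef struct sketch_external_info
{
    size_t      pointer_size;
    void        (*to_external_data) (const unsigned char *ptr,
                                     sketch_external_data *data);
    void        (*from_external_data) (const sketch_external_data *data,
                                       unsigned char *ptr);
} sketch_external_info;

static void put32(unsigned char *p, uint32_t v) { memcpy(p, &v, sizeof(v)); }
static void put64(unsigned char *p, uint64_t v) { memcpy(p, &v, sizeof(v)); }
static uint32_t get32(const unsigned char *p) { uint32_t v; memcpy(&v, p, sizeof(v)); return v; }
static uint64_t get64(const unsigned char *p) { uint64_t v; memcpy(&v, p, sizeof(v)); return v; }

/* The first 16 bytes are shared by both invented layouts. */
static void
header_out(const sketch_external_data *d, unsigned char *p)
{
    put32(p, d->rawsize);
    put32(p + 4, d->extsize);
    put32(p + 8, d->compression);
    put32(p + 12, d->toastrelid);
}

static void
header_in(const unsigned char *p, sketch_external_data *d)
{
    d->rawsize = get32(p);
    d->extsize = get32(p + 4);
    d->compression = get32(p + 8);
    d->toastrelid = get32(p + 12);
}

static void
oid4_from(const sketch_external_data *d, unsigned char *p)
{
    assert(d->value <= UINT32_MAX);     /* must fit the 4-byte format */
    header_out(d, p);
    put32(p + 16, (uint32_t) d->value);
}

static void
oid4_to(const unsigned char *p, sketch_external_data *d)
{
    header_in(p, d);
    d->value = get32(p + 16);
}

static void
int8_from(const sketch_external_data *d, unsigned char *p)
{
    header_out(d, p);
    put64(p + 16, d->value);
}

static void
int8_to(const unsigned char *p, sketch_external_data *d)
{
    header_in(p, d);
    d->value = get64(p + 16);
}

static const sketch_external_info sketch_formats[SKETCH_NUM_TAGS] = {
    [SKETCH_TAG_OID4] = {20, oid4_to, oid4_from},
    [SKETCH_TAG_INT8] = {24, int8_to, int8_from},
};

/* Look up the callbacks for a tag; callers never touch layouts directly. */
static const sketch_external_info *
sketch_get_external_info(sketch_vartag tag)
{
    assert((unsigned) tag < SKETCH_NUM_TAGS);
    return &sketch_formats[tag];
}
```

With this shape, supporting another format would mean adding one tag
and one pair of callbacks, while code that consumes
sketch_external_data stays unchanged.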

FYI, the patch set posted on this thread is not the latest one. I
have a v2, posted on this branch, where I have reordered things:
https://github.com/michaelpq/postgres/tree/toast_64bit_v2

The refactoring to the new toast_external_data with its callbacks is
done first, and the new vartag_external with 8-byte value support is
added on top of that. There were still two things I wanted to do, and
could not get down to them because I've spent the last week or so
working on other people's stuff, so I lacked room:
- Evaluate the cost of the transfer layer to toast_external_data. The
worst case I was planning to work with is non-compressed data stored
in TOAST, then check profiles of the detoasting path by grabbing
slices of the data with pgbench and a read-only query. The
write/insert path is not going to matter, the detoast is. The
reordering is actually for this reason: I want to see the effect of
the new interface first, and this needs to happen before we even
consider the option of adding 8-byte values.
- Add a callback for the value ID assignment. I was hesitating to add
that when I first hacked on the patch but I think that's the correct
design moving forward, with extra logic to be able to check if an
8-byte value is already in use in a relation, as we do for OID
assignment, but applied to the TOAST generator added to the patch.
The backend should decide if a new value is required; we should not
decide the rewrite cases in the callback.
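
As a rough illustration of the second item, the sketch below shows a
value ID assignment loop that retries until it finds a value not
already in use, in the spirit of how OID assignment probes for
collisions. Everything here is an assumption made for the example: the
sketch_ names, the 64-bit counter used as the generator, the choice of
0 as the invalid value, and the array-backed probe that stands in for
a lookup in the TOAST relation. The decision to reuse an existing
value is kept in the caller rather than in the assignment callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Probe telling whether a value ID is already used in the relation. */
typedef bool (*sketch_value_in_use_fn) (uint64_t value, void *arg);

/* Hypothetical 8-byte generator: a plain counter that wraps around. */
typedef struct sketch_value_generator
{
    uint64_t    next;
} sketch_value_generator;

/* Return the next candidate, skipping 0 (treated as invalid here). */
static uint64_t
sketch_next_candidate(sketch_value_generator *gen)
{
    uint64_t    v;

    do
    {
        v = gen->next++;
    } while (v == 0);
    return v;
}

/*
 * Assignment callback: loop until a free value is found.  Like any
 * probe-and-retry scheme, this does not terminate if every value is
 * in use, which is far less likely with 8 bytes than with 4.
 */
static uint64_t
sketch_assign_value(sketch_value_generator *gen,
                    sketch_value_in_use_fn in_use, void *arg)
{
    assert(in_use != NULL);
    for (;;)
    {
        uint64_t    v = sketch_next_candidate(gen);

        if (!in_use(v, arg))
            return v;
    }
}

/*
 * Caller side: the backend decides whether a new value is needed.  A
 * nonzero "existing" models a rewrite that keeps the old value.
 */
static uint64_t
sketch_choose_value(sketch_value_generator *gen, uint64_t existing,
                    sketch_value_in_use_fn in_use, void *arg)
{
    if (existing != 0)
        return existing;
    return sketch_assign_value(gen, in_use, arg);
}

/* Stand-in for an index probe: linear search in an array. */
typedef struct sketch_value_set
{
    const uint64_t *values;
    size_t      nvalues;
} sketch_value_set;

static bool
sketch_set_in_use(uint64_t value, void *arg)
{
    const sketch_value_set *set = arg;

    for (size_t i = 0; i < set->nvalues; i++)
    {
        if (set->values[i] == value)
            return true;
    }
    return false;
}
```

For instance, with values 5 and 6 already in use and the counter at 5,
sketch_assign_value() skips both and hands out 7.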

There is a second branch that I use for development, force-pushing to
it periodically, as well:
https://github.com/michaelpq/postgres/tree/toast_64bit
That's much dirtier, always WIP, just one of my playgrounds.
--
Michael
