Re: RFC: Pluggable TOAST

From: Nikita Malakhov <hukutoc(at)gmail(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: RFC: Pluggable TOAST
Date: 2023-11-07 10:06:41
Message-ID: CAN-LCVPw12NaAG1eiAwjgAtOa90wMBCkzAyt=evjBLNb9f=ZgQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

I've been thinking about Matthias' proposals for some time and have some
questions:

>So, in short, I don't think there is a need for a specific "Pluggable
>toast API" like the one in the patchset at [0] that can be loaded
>on-demand, but I think that updating our current TOAST system to a
>system for which types can provide support functions would likely be
>quite beneficial, for efficient extraction of data from composite
>values.

As I understand it, one of the arguments against Pluggable TOAST is that
differences between plugged-in Toasters could result in incompatibility
even between different versions of the same DB.

The importance of a correct TOAST update mechanism is beyond question, and
I feel I have to prepare a patch for it. There are some open questions,
though; I'll address them later along with the patch.

>Example support functions:

>/* TODO: bikeshedding on names, signatures, further support functions. */
>Datum typsup_roastsliceofbread(Datum ptr, int sizetarget, char cmethod)
>Datum typsup_unroastsliceofbread(Datum ptr)
>void typsup_releaseroastedsliceofbread(Datump ptr) /* in case of
>non-unitary in-memory datums */

Do I understand correctly that you mean extending PG_TYPE and the type
cache by adding a new set of functions for toasting/detoasting a value, in
addition to in/out, etc.?
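
To check that we mean the same thing, here is a minimal standalone sketch
of how I read it. Everything in it is an assumption made for illustration:
TypeToastSupport, TypeEntry and toast_support_for() are hypothetical names,
not existing PostgreSQL structures, and Datum is only a stand-in so the
snippet compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Datum so the sketch is self-contained. */
typedef uintptr_t Datum;

/* Hypothetical per-type TOAST support routines, next to in/out etc. */
typedef struct TypeToastSupport
{
    Datum (*toast)(Datum value, int sizetarget, char cmethod);
    Datum (*detoast)(Datum ptr);
    void  (*release)(Datum ptr);    /* optional, may be NULL */
} TypeToastSupport;

/* Hypothetical type cache entry carrying the optional support set. */
typedef struct TypeEntry
{
    unsigned int            type_oid;
    const TypeToastSupport *toast_support;  /* NULL => generic TOAST */
} TypeEntry;

/* Placeholder generic routines: they just pass the value through. */
static Datum
generic_toast(Datum value, int sizetarget, char cmethod)
{
    (void) sizetarget;
    (void) cmethod;
    return value;
}

static Datum
generic_detoast(Datum ptr)
{
    return ptr;
}

static const TypeToastSupport generic_support = {
    generic_toast, generic_detoast, NULL
};

/* Use the type's own routines if it opted in, else the generic ones. */
const TypeToastSupport *
toast_support_for(const TypeEntry *entry)
{
    if (entry != NULL && entry->toast_support != NULL)
        return entry->toast_support;
    return &generic_support;
}
```

The point of the fallback is that types which do not opt in keep the
current behaviour unchanged.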

I see several issues here:
1) We could benefit from knowledge of the internals of the data being
toasted (e.g. a JSON value with key-value structure) only when the EXTERNAL
storage mode is set; otherwise the value will be compressed before it is
toasted. So we have to keep both TOAST mechanisms, depending on the storage
mode in use. It's the same issue as in Pluggable TOAST. Is that OK?

2) The TOAST pointer is very limited in terms of the data it holds; we'd
have to extend it anyway and keep both formats for backwards compatibility;

3) There is no API, so such an approach would require implementing toast
and detoast in every data type we want to be custom toasted, resulting in
modifications to multiple files. Maybe we should consider introducing such
an API?

4) There is one TOAST relation per regular relation. With an update
mechanism this will be less limiting, but it is still a limiting factor,
because one entry in the base table could have a lot of entries in the
TOAST table. Are we going to do something about this?
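
To illustrate point 2, here is a rough sketch of what I mean. The first
struct mirrors the four fields of the current on-disk TOAST pointer
(varatt_external) as I remember them from the headers, written with plain
stdint types so it compiles standalone. The second struct,
CustomToastPointer, together with CUSTOM_PAYLOAD_MAX and
custom_pointer_size(), is purely hypothetical and only shows the kind of
extension I have in mind.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;   /* stand-in for PostgreSQL's Oid */

/* Mirrors the fields of the existing on-disk pointer (varatt_external). */
typedef struct OnDiskToastPointer
{
    int32_t  va_rawsize;     /* original data size (includes header) */
    uint32_t va_extinfo;     /* saved size and compression method */
    Oid      va_valueid;     /* unique ID of the value in the TOAST table */
    Oid      va_toastrelid;  /* OID of the TOAST table containing it */
} OnDiskToastPointer;

#define CUSTOM_PAYLOAD_MAX 32   /* hypothetical limit */

/* Hypothetical extended pointer: old fields plus type-owned payload. */
typedef struct CustomToastPointer
{
    OnDiskToastPointer base;
    Oid      handler_type;   /* type whose support functions own payload */
    uint8_t  payload_len;    /* number of payload bytes actually used */
    uint8_t  payload[CUSTOM_PAYLOAD_MAX];   /* opaque to the core code */
} CustomToastPointer;

/* Bytes needed to store a custom pointer; 0 if the payload is too long. */
size_t
custom_pointer_size(uint8_t payload_len)
{
    if (payload_len > CUSTOM_PAYLOAD_MAX)
        return 0;
    return offsetof(CustomToastPointer, payload) + payload_len;
}
```

Both layouts would have to be readable for as long as old pointers exist
on disk, which is the backwards compatibility cost mentioned above.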

>We would probably want at least 2 more subtypes of varattrib_1b_e -
>one for on-disk pointers, and one for in-memory pointers - where the
>payload of those pointers is managed by the type's toast mechanism and
>considered opaque to the rest of PostgreSQL (and thus not compatible
>with the binary transfer protocol). Types are currently already
>expected to be able to handle their own binary representation, so
>allowing types to manage parts of the toast representation should IMHO
>not be too dangerous, though we should make sure that BINARY COERCIBLE
>types share this toast support routine, or be returned to their
>canonical binary version before they are cast to the coerced type, as
>using different detoasting mechanisms could result in corrupted data
>and thus crashes.
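
If I read this right, it could look roughly like the sketch below. The
first four tag values are the existing vartag_external values as I
understand them; TAG_TYPE_ONDISK and TAG_TYPE_INMEM, their numeric values,
and both helper functions are hypothetical and exist only to illustrate
the idea.

```c
#include <stdbool.h>

typedef enum ToastTag
{
    TAG_INDIRECT    = 1,    /* existing: VARTAG_INDIRECT */
    TAG_EXPANDED_RO = 2,    /* existing: VARTAG_EXPANDED_RO */
    TAG_EXPANDED_RW = 3,    /* existing: VARTAG_EXPANDED_RW */
    TAG_ONDISK      = 18,   /* existing: VARTAG_ONDISK */
    TAG_TYPE_ONDISK = 19,   /* hypothetical: type-managed on-disk pointer */
    TAG_TYPE_INMEM  = 20    /* hypothetical: type-managed in-memory pointer */
} ToastTag;

/* True if the payload is opaque to the core and owned by the type. */
bool
tag_is_type_managed(ToastTag tag)
{
    return tag == TAG_TYPE_ONDISK || tag == TAG_TYPE_INMEM;
}

/*
 * A type-managed value must be returned to its canonical binary form
 * before a binary-coercible cast, unless both types share the same
 * toast support routine.
 */
bool
needs_canonical_form_before_cast(ToastTag tag, bool same_support_routine)
{
    return tag_is_type_managed(tag) && !same_support_routine;
}
```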

>Lastly, there is the compression part of TOAST. I think it should be
>relatively straightforward to expose the compression-related
>components of TOAST through functions that can then be used by
>type-specific toast support functions.
>Note that this would be opt-in for a type, thus all functions that use
>that type's internals should be aware of the different on-disk format
>for toasted values and should thus be able to handle it gracefully.

Thanks a lot for your answers!

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/
