Re: Pluggable toaster

From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: Nikita Malakhov <hukutoc(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Jacob Champion <jchampion(at)timescale(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Pluggable toaster
Date: 2022-11-03 13:26:38
Message-ID: CAJ7c6TNumJU683DcPt4ob-eybvuWvqXC-Y+1azUv+XhrYaycgw@mail.gmail.com
Lists: pgsql-hackers

Hi Nikita,

Please avoid top-posting [1].

> Toaster is set for the table column. Each TOASTable column could have
> a different Toaster, so column option is the most obvious place to add it.

This is a major limitation. IMO the user should be able to set a
custom TOASTer for the entire table as well, and ideally for the
entire database too. This could be implemented entirely at the syntax
level; the internals of the patch would not be affected.
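
To illustrate why this can stay at the syntax level, here is a minimal
sketch of the lookup order, assuming a per-column setting overrides a
per-table one, which in turn overrides a per-database one. The function
name, its parameters, and the "default" fallback name are hypothetical
and are not taken from the patch.

```python
from typing import Optional

# Assumed placeholder name for the built-in TOASTer; not taken from the patch.
DEFAULT_TOASTER = "default"


def resolve_toaster(column: Optional[str] = None,
                    table: Optional[str] = None,
                    database: Optional[str] = None) -> str:
    """Return the most specific TOASTer that was set, falling back outward."""
    # Check the narrowest scope first: column, then table, then database.
    for choice in (column, table, database):
        if choice is not None:
            return choice
    return DEFAULT_TOASTER
```

In such a scheme the per-column value is still the only thing the
lower layers see; the table and database settings only supply it when
the column does not.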

> >1.2. That's odd. TOAST should work for EXTENDED and MAIN storage
> >strategies as well. On top of that, why should custom TOASTers have
> >any knowledge of the default four-stage algorithm and the storage
> >strategies? If the storage strategy is actually ignored, it shouldn't
> >be used in the syntax.
>
> EXTENDED storage strategy means that TOASTed value is compressed
> before being TOASTed, so no knowledge of its internals could be of any
> use. EXTERNAL strategy means that value is being TOASTed in original
> form. Storage strategy is the thing internal to AM used, and TOAST
> mechanics is not meant to interfere with it. Again, STORAGE EXTERNAL
> explicitly shows that value will be stored out-of-line.

Let me rephrase. Will custom TOASTers work only with the EXTERNAL
storage strategy, or is this just syntax?

> >2. Although it's possible to implement some encryption in a TOASTer I
> >don't think the documentation should advertise this.
>
> It is a good example of what could the Toaster be responsible for

No, encryption is an excellent example of what a TOASTer should NOT
do. If you are interested in encryption, consider joining the "Moving
forward with TDE" thread [2].

> >3.1. I believe we should rename this to something like `struct
> >ToastImpl`. The `Tsr` abbreviation only creates confusion, and this is
> >not a routine.
>
> It was done similar to Table AM Routine (please check Pluggable
> Storage API), along with some other decisions.

OK, then maybe we should keep the "Routine" part for consistency. I
still don't like the "Tsr" abbreviation though, and find it confusing.

> It is not clear because current TOAST mechanics does not have UPDATE
> functionality - it doesn't actually update TOASTed value, it marks this value
> "dead" and inserts a new one. This is the cause of TOAST tables bloating
> that is being complained about by many users. Update method is provided
> for implementation of UPDATE operation.

But should we really distinguish the INSERT and UPDATE cases at this
API level? It seems to me that TableAM just inserts new tuples. It's
the TOASTer's job to figure out whether similar values existed before
and whether they should be reused. Additionally, a particular TOASTer
can reuse old values between _different_ rows, potentially even from
different tables. This is another reason why, in practice, there is
little use in knowing whether the data is INSERTed or UPDATEd.
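
As a rough illustration of this point, the sketch below shows a
hypothetical content-addressed TOASTer with a single toast() entry
point. The class and method names are invented for the example and do
not correspond to the patch's API. It stores each distinct value once
and reuses it regardless of which row or statement supplied it.

```python
import hashlib


class DedupToaster:
    """Hypothetical TOASTer that stores each distinct value only once."""

    def __init__(self) -> None:
        self._values = {}    # digest -> stored bytes
        self._refcount = {}  # digest -> number of live references

    def toast(self, value: bytes) -> str:
        # One entry point for INSERT and UPDATE alike: the caller just
        # hands over a value and gets back a pointer (here, a digest).
        key = hashlib.sha256(value).hexdigest()
        if key not in self._values:
            self._values[key] = value
        self._refcount[key] = self._refcount.get(key, 0) + 1
        return key

    def detoast(self, key: str) -> bytes:
        return self._values[key]

    def delete(self, key: str) -> None:
        # Drop one reference; free the value when nobody points at it.
        self._refcount[key] -= 1
        if self._refcount[key] == 0:
            del self._refcount[key]
            del self._values[key]

    def stored_count(self) -> int:
        return len(self._values)
```

With this shape, an UPDATE is simply a toast() of the new value
followed by a delete() of the old pointer, so the interface needs no
separate update method to avoid storing duplicates.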

> I already answered this question, maybe the answer was not very clear.
> This is just an extension entry point, because for some more advanced
> functionality existing pre-defined set of methods would be not enough, i.e.
> special Toasters for complex datatypes like JSONb, that have complex
> internal structure and may require additional ways to interact with it along
> toast/detoast/update/delete.

Maybe so, but it doesn't change the fact that the user documentation
should clearly describe the interface and its usage.

> These too. About free() method - Toasters are not meant to be deleted,
> we mentioned this several times. They exist once they are created as long
> as the DB itself. Have I answered your question?

Users should be able to DROP the extension. I seriously doubt that the
patch is going to be accepted as long as it has this limitation.

[1]: https://wiki.postgresql.org/wiki/Mailing_Lists#Email_etiquette_mechanics
[2]: https://www.postgresql.org/message-id/flat/CAOxo6XJh95xPOpvTxuP_kiGRs8eHcaNrmy3kkzWrzWxvyVkKkQ%40mail.gmail.com

--
Best regards,
Aleksander Alekseev
