RFI: Extending the TOAST Pointer

From: Nikita Malakhov <hukutoc(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RFI: Extending the TOAST Pointer
Date: 2023-05-10 20:04:57
Message-ID: CAN-LCVMq2X=fhx7KLxfeDyb3P+BXuCkHC0g=9GF+JD4izfVa0Q@mail.gmail.com
Lists: pgsql-hackers

Hi hackers!

The limitations of the existing TOAST pointer have come up in several
discussions ([1], [2] and [3]), and the topic resurfaces in other places
from time to time.

We proposed a fresh approach to the TOAST mechanics in [2], but
unfortunately the patch was not well received and was rejected after
several iterations, although we still have hopes for it and have several
very promising features based on it.

Anyway, the old TOAST pointer is also the cause of problems like [4], and
this part of PostgreSQL is overdue for revision and improvement.

TOAST begins with the pointer to the externalized value, the TOAST Pointer,
which is very limited in what it can store, and any TOAST improvement
requires a revision of this pointer structure. So we decided to ask the
community for thoughts and ideas on how to rework it.

The TOAST Pointer (the varatt_external structure) stores 4 fields:
[varlena header][<4b - original data size><4b - size in TOAST table><4b -
ID of value (chunk_id)><4b - TOAST table OID>]
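
For reference, here is a minimal standalone sketch of that layout. The field
names and their order are taken from the varatt_external definition in the
PostgreSQL sources as I understand it; the Oid typedef and the helper
toast_value_id_space() are stand-ins made up for this illustration, and the
varlena header that precedes the structure is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for PostgreSQL's 4-byte Oid */

/* Payload of a regular on-disk TOAST pointer: four 4-byte fields. */
typedef struct varatt_external
{
    int32_t     va_rawsize;     /* original data size */
    uint32_t    va_extinfo;     /* size in TOAST table (the sources also pack
                                 * compression method bits into this word) */
    Oid         va_valueid;     /* ID of the value, i.e. chunk_id */
    Oid         va_toastrelid;  /* OID of the TOAST table */
} varatt_external;

_Static_assert(sizeof(varatt_external) == 16,
               "four 4-byte fields with no padding");

/*
 * Hypothetical helper: upper bound on the number of distinct value IDs
 * that a 4-byte va_valueid can address within one TOAST table.
 */
static uint64_t
toast_value_id_space(void)
{
    return (uint64_t) 1 << (8 * sizeof(Oid));
}
```

The point of the sketch is that every field is fixed at 4 bytes, so there is
no room for extra metadata, and the value ID space is capped at 2^32 per
TOAST table, which appears to be the limit behind reports like [4].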

In [2] we proposed a new Custom TOAST pointer structure whose main feature
is extensibility:
[varlena header][<2b - total size of the TOAST pointer><4b - size of
original data><4b - OID of algorithm used for TOASTing><variable-length
field used for storing any custom data>]
where the Custom TOAST Pointer is distinguished from the regular one by the
va_flag field, which is part of the varlena header, so the new pointer
format does not interfere with the old (regular) one.
The first field is necessary because the Custom TOAST pointer has variable
length due to the tail used for inline storage; the original data size is
kept because it is used by the Executor. The third field could be a subject
for discussion.
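
To make the proposed layout concrete, here is a rough sketch of packing and
reading such a pointer payload. This is not the code from [2]; all names
(custom_ptr_pack, CUSTOM_PTR_FIXED_SIZE and so on) are hypothetical. It
assumes that the 2-byte total size counts only the bytes after the varlena
header and that fields are stored in native byte order; the varlena header
and the va_flag check are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t Oid;           /* stand-in for PostgreSQL's 4-byte Oid */

/* Fixed part: 2b total size + 4b original size + 4b algorithm OID. */
#define CUSTOM_PTR_FIXED_SIZE (2 + 4 + 4)

/*
 * Write the payload into buf. Returns the total number of bytes written,
 * or 0 if the result does not fit into buf or into the 2-byte size field.
 */
static size_t
custom_ptr_pack(uint8_t *buf, size_t buflen, uint32_t rawsize,
                Oid toaster_oid, const uint8_t *tail, size_t tail_len)
{
    size_t      total = CUSTOM_PTR_FIXED_SIZE + tail_len;
    uint16_t    total16;

    if (total > UINT16_MAX || total > buflen)
        return 0;

    total16 = (uint16_t) total;
    memcpy(buf, &total16, 2);           /* total size of the pointer */
    memcpy(buf + 2, &rawsize, 4);       /* size of original data */
    memcpy(buf + 6, &toaster_oid, 4);   /* OID of the TOASTing algorithm */
    if (tail_len > 0)
        memcpy(buf + CUSTOM_PTR_FIXED_SIZE, tail, tail_len);
    return total;
}

/* Readers use memcpy so that unaligned buffers are safe. */
static uint16_t
custom_ptr_total_size(const uint8_t *buf)
{
    uint16_t    v;

    memcpy(&v, buf, 2);
    return v;
}

static uint32_t
custom_ptr_rawsize(const uint8_t *buf)
{
    uint32_t    v;

    memcpy(&v, buf + 2, 4);
    return v;
}

static Oid
custom_ptr_toaster_oid(const uint8_t *buf)
{
    Oid         v;

    memcpy(&v, buf + 6, 4);
    return v;
}

/* Length of the variable tail, derived from the stored total size. */
static size_t
custom_ptr_tail_len(const uint8_t *buf)
{
    return (size_t) custom_ptr_total_size(buf) - CUSTOM_PTR_FIXED_SIZE;
}
```

A reader that knows nothing about a particular algorithm can still skip the
pointer using the total size and report the original data size, while the
tail stays opaque to everything except the code identified by the OID.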

Thoughts? Objections?

[1] [PATCH] Infinite loop while acquiring new TOAST Oid
<https://www.postgresql.org/message-id/CAJ7c6TPSvR2rKpoVX5TSXo_kMxXF%2B-SxLtrpPaMf907tX%3DnVCw%40mail.gmail.com>
[2] Pluggable Toaster
<https://www.postgresql.org/message-id/flat/224711f9-83b7-a307-b17f-4457ab73aa0a%40sigaev.ru>
[3] [PATCH] Compression dictionaries for JSONB
<https://www.postgresql.org/message-id/flat/CAJ7c6TOtAB0z1UrksvGTStNE-herK-43bj22%3D5xVBg7S4vr5rQ%40mail.gmail.com>
[4] BUG #16722: PG hanging on COPY when table has close to 2^32 toasts in the table.
<https://www.postgresql.org/message-id/flat/16722-93043fb459a41073%40postgresql.org>

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/
