Re: Pluggable toaster

From: Nikita Malakhov <hukutoc(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Greg Stark <stark(at)mit(dot)edu>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Subject: Re: Pluggable toaster
Date: 2022-06-30 20:26:46
Message-ID: CAN-LCVOruhrrrK+wNAF=6kPkhWfgik6+MdYZ1yJARf2qfHc-qA@mail.gmail.com
Lists: pgsql-hackers

Hi hackers!
Here is the patch set rebased onto current master (release 15 beta 2, as of
the commits from 29.06).
As a reminder:
Pluggable TOAST proposes making TOAST pluggable as storage, much like
pluggable Access Methods: we extracted the TOAST mechanics from the Heap AM
and made them an independent, pluggable and extensible part, accessed
through our newly developed TOAST API.
With this patch set you will be able to develop and plug in your own TOAST
mechanics for table columns. A custom Toaster that knows the internals,
workflow and workload of the data being TOASTed can be much more efficient
in both performance and storage.
We keep backward compatibility: the default TOAST mechanics work as they
did before, transparently handling any TOASTable datatype (including
TOASTed values and tables from previous versions; nothing changes there).
The default Toaster is used unless stated otherwise, but it now goes
through our TOAST API.
The TOAST API has no noticeable overhead compared to the original (master);
the measurements are in our research materials.
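
For readers new to the thread, the "pluggable" idea can be pictured as a
per-column callback table, in the spirit of AM handlers. The sketch below is
only an illustration and is not taken from the patch set: ToasterRoutine,
lookup_toaster, plain_toaster and rle_toaster are hypothetical names, and the
toy run-length scheme merely stands in for "a Toaster that knows its data".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback table; NOT the struct defined by the patch set. */
typedef struct ToasterRoutine
{
    const char *name;
    /* Both callbacks return a malloc'd buffer and set *out_len. */
    unsigned char *(*toast)(const unsigned char *in, size_t len, size_t *out_len);
    unsigned char *(*detoast)(const unsigned char *in, size_t len, size_t *out_len);
} ToasterRoutine;

/* "Default" behaviour in this toy model: store the bytes unchanged. */
static unsigned char *
plain_copy(const unsigned char *in, size_t len, size_t *out_len)
{
    unsigned char *out = malloc(len ? len : 1);

    assert(out_len != NULL);
    if (out == NULL)
        return NULL;
    memcpy(out, in, len);
    *out_len = len;
    return out;
}

/* Toy data-aware toaster: run-length encode as (count, byte) pairs. */
static unsigned char *
rle_toast(const unsigned char *in, size_t len, size_t *out_len)
{
    unsigned char *out = malloc(len * 2 + 1);   /* worst case: 2 bytes per byte */
    size_t i = 0, o = 0;

    assert(out_len != NULL);
    if (out == NULL)
        return NULL;
    while (i < len)
    {
        size_t run = 1;

        while (i + run < len && in[i + run] == in[i] && run < 255)
            run++;
        out[o++] = (unsigned char) run;
        out[o++] = in[i];
        i += run;
    }
    *out_len = o;
    return out;
}

/* Inverse of rle_toast: expand each (count, byte) pair. */
static unsigned char *
rle_detoast(const unsigned char *in, size_t len, size_t *out_len)
{
    size_t total = 0, i, o = 0;
    unsigned char *out;

    assert(out_len != NULL);
    for (i = 0; i + 1 < len; i += 2)
        total += in[i];
    out = malloc(total ? total : 1);
    if (out == NULL)
        return NULL;
    for (i = 0; i + 1 < len; i += 2)
    {
        memset(out + o, in[i + 1], in[i]);
        o += in[i];
    }
    *out_len = o;
    return out;
}

static const ToasterRoutine toasters[] = {
    {"plain_toaster", plain_copy, plain_copy},  /* entry 0 acts as the default */
    {"rle_toaster", rle_toast, rle_detoast},
};

/* NULL name selects the default; an unknown name yields NULL. */
const ToasterRoutine *
lookup_toaster(const char *name)
{
    size_t i;

    if (name == NULL)
        return &toasters[0];
    for (i = 0; i < sizeof(toasters) / sizeof(toasters[0]); i++)
        if (strcmp(toasters[i].name, name) == 0)
            return &toasters[i];
    return NULL;
}
```

A caller in this model picks a routine by name once per column and then
only goes through the toast and detoast callbacks, so the default and a
custom Toaster are interchangeable from the caller's point of view.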

We've already presented our work at the HighLoad, PgCon and PgConf
conferences; you can find the materials here:
http://www.sai.msu.su/~megera/postgres/talks/

We have ready-to-plug-in extension Toasters:
- bytea appendable toaster for the bytea datatype (impressive speedup of
the bytea append operation)
- JSONB toaster for JSONB (very cool performance improvements when dealing
with TOASTed JSONB)
and prototype Toasters (in development) for PostGIS (much faster than the
default with geometric data), large binary objects
(like pg_largeobject, but much, much larger, and without the existing large
object limitations), and a default Toaster implementation without using
Indexes.

The patch set consists of 9 incremental patches:
0001_create_table_storage_v4.patch - SQL syntax fix for CREATE TABLE
clause, processing SET STORAGE... correctly;

0002_toaster_interface_v7.patch - the TOAST API and SQL syntax
allowing creation of a custom Toaster (CREATE TOASTER ...)
and assigning a Toaster to a table column (CREATE TABLE t (data bytea
STORAGE EXTERNAL TOASTER bytea_toaster);)

0003_toaster_default_v6.patch - Default TOAST implemented via TOAST API;

0004_toaster_snapshot_v6.patch - refactoring of Default TOAST and support
for versioned Toast rows;

0005_bytea_appendable_toaster_v6.patch - contrib module
bytea_appendable_toaster - special Toaster for bytea datatype with
customized append operation;

0006_toasterapi_docs_v2.patch - documentation package for Pluggable TOAST;

0007_fix_alignment_of_custom_toast_pointers_v2.patch - fixes the alignment
of custom toast pointers required by the bytea toaster (by Nikita Glukhov);

0008_fix_toast_tuple_externalize_v2.patch - fixes the
toast_tuple_externalize function so that it does not call toast if the old
data is the same as the new data;

0009_bytea_contrib_and_varlena_v1.patch - several late fixups for 0005.

This patch set opens the following issues:
1) With TOAST independent of the AM that uses it, it makes sense to move
compression from the AM into the Toaster and make compression one of the
Toaster's options.
In fact, Toasters allow any compression method to be used independently of
the AM;
2) Should we implement a default Toaster without using Indexes (currently
in development)?
3) It allows various SQL-accessible large objects of almost unlimited size
IN DATABASE, unlike the current large_object functionality, and does not
limit their number;
4) Several already developed Toasters show impressive results for the
datatypes they were designed for.

We would greatly appreciate your feedback!

--
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/

On Thu, Jun 23, 2022 at 4:53 PM Nikita Malakhov <hukutoc(at)gmail(dot)com> wrote:

> Hi,
> Aleksander, thank you for your feedback and willingness to help. You can
> send a suggested fixup in this thread; I'll check the issue
> you've mentioned.
>
> Best regards,
> Nikita Malakhov
>
> On Thu, Jun 23, 2022 at 4:38 PM Aleksander Alekseev <
> aleksander(at)timescale(dot)com> wrote:
>
>> Hi Nikita,
>>
>> > We're currently working on rebase along other TOAST improvements, hope
>> > to do it in time for July CF.
>> > Thank you for your patience.
>>
>> Just to clarify, does it include the dependent "CREATE TABLE ( ..
>> STORAGE .. )" patch [1]? I was considering changing the patch
>> according to the feedback it got, but if you are already working on
>> this I'm not going to interfere.
>>
>> [1]: https://postgr.es/m/de83407a-ae3d-a8e1-a788-920eb334f25b%40sigaev.ru
>> --
>> Best regards,
>> Aleksander Alekseev
>
>

Attachment Content-Type Size
0005_bytea_appendable_toaster_v6.patch.gz application/x-gzip 5.6 KB
0003_toaster_default_v6.patch.gz application/x-gzip 29.9 KB
0002_toaster_interface_v7.patch.gz application/x-gzip 46.6 KB
0001_create_table_storage_v4.patch.gz application/x-gzip 4.2 KB
0004_toaster_snapshot_v6.patch.gz application/x-gzip 8.1 KB
0006_toasterapi_docs_v1.patch.gz application/x-gzip 3.9 KB
0008_fix_toast_tuple_externalize_v2.patch.gz application/x-gzip 583 bytes
0006_toasterapi_docs_v2.patch.gz application/x-gzip 3.9 KB
0009_bytea_contrib_and_varlena_v1.patch.gz application/x-gzip 3.9 KB
0007_fix_alignment_of_custom_toast_pointers_v2.patch.gz application/x-gzip 801 bytes
