Re: Pluggable toaster

From: Nikita Malakhov <hukutoc(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Pluggable toaster
Date: 2022-08-04 20:18:42
Message-ID: CAN-LCVMXN-igZdOH4kunF0p2jobsksNhFXovEXjr-zVF90Nq1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers!

I've made a rebase according to Andres and Aleksander comments.
Rebased branch resides here:
https://github.com/postgrespro/postgres/tree/toasterapi_clean

I've decided to leave only the first 2 patches for review and send less
significant
changes after the main patch will be straightened out.
So, here is
v13-0001-toaster-interface.patch - main TOAST API patch, with reference
TOAST
mechanics left as-is.
v13-0002-toaster-default.patch - reference TOAST re-implemented via TOAST
API.

>I'm a bit confused by the patch numbering - why isn't there a patch 0001
and
>0005?
Sorry for the confusion, my fault. The first patch (CREATE TABLE .. SET
STORAGE)
is already committed into v15, and several of the late patches weren't
included.
I've rearranged patch numbers in this iteration.

>I think 0002 needs to be split further - the relevant part isn't that it
>introduces the "dummy toaster" module, it's a large change doing lots of
>things, the addition of the contrib module is irrelevant comparatively.

Done, contrib /dummy_toaster excluded from main patch and placed in branch
as a separate commit.

>As is the patches unfortunately are close to unreviewable. Lots of code
gets
>moved around in one patch, then again in the next patch, then again in the
>next.

So I've decided to put here only the first one while I'm working on the
latter to clean
this up - I agree, code in latter patches needs some refactoring. Working
on it.

>Unfortunately, scanning through these patches, it seems this is a lot of
>complexity, with a (for me) comensurate benefit. There's a lot of more
general
>improvements to toasting and the json type that we can do, that are a lot
less
>complex than this.

We have very significant improvements for storing large JSON and a couple of
other TOAST improvements which make a lot of sense, but they are based on
this API. But in the first patch reference TOAST is left as-is, and does
not use
TOAST API.

>> 2) New VARATT_CUSTOM data structure with fixed header and variable
>> tail to store custom toasted data, with according macros set;

>That's adding overhead to every toast interaction, independent of any new
>infrastructure being used.

We've performed some tests on this and haven't detected significant
overhead,

>So we're increasing pg_attribute - often already the largest catalog table
in
>a database.

A little bit, with an OID column storing Toaster OID. We do not see any
other way
to keep track of Toaster used by the table's column, because it could be
changed
any time by ALTER TABLE ... SET TOASTER.

>Am I just missing something, or is atttoaster not actually used in this
patch?
>So most of the contrib module added is unreachable code?

It is necessary for Toasters implemented via TOAST API, the first patch
does not
use it directly because reference TOAST is left unchanged. The second one
which
implements reference TOAST via TOAST API uses it.

>That seems not great.

About Toasters deletion - we forbid dropping Toasters because if Toaster is
dropped
the data TOASTed with it is lost, and as was mentioned before, we think
that there
won't be a lot of custom Toasters, likely seems to be less then a dozen.

>That move the whole list around! On a cache hit. Tthis would likely
already be
>slower than syscache.

Thank you for the remark, it is questionable approach. I've changed this in
current iteration
(patch in attach) to keep Toaster list appended-only if Toaster was not
found, and leave
Toaster cache as a straight list - first element in is the head of the list.

Also, documentation on TOAST API is provided in README.toastapi in the
first patch -
I'd be grateful for comments on it.

Thanks for the feedback!

On Tue, Aug 2, 2022 at 6:37 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2022-08-02 09:15:12 +0300, Nikita Malakhov wrote:
> > Attach includes:
> > v11-0002-toaster-interface.patch - contains TOAST API with default
> Toaster
> > left as-is (reference implementation) and Dummy toaster as an example
> > (will be removed later as a part of refactoring?).
> >
> > v11-0003-toaster-default.patch - implements reference TOAST as Default
> > Toaster
> > via TOAST API, so Heap AM calls Toast only via API, and does not have
> direct
> > calls to Toast functionality.
> >
> > v11-0004-toaster-snapshot.patch - supports row versioning for TOASTed
> values
> > and some refactoring.
>
> I'm a bit confused by the patch numbering - why isn't there a patch 0001
> and
> 0005?
>
> I think 0002 needs to be split further - the relevant part isn't that it
> introduces the "dummy toaster" module, it's a large change doing lots of
> things, the addition of the contrib module is irrelevant comparatively.
>
> As is the patches unfortunately are close to unreviewable. Lots of code
> gets
> moved around in one patch, then again in the next patch, then again in the
> next.
>
>
> Unfortunately, scanning through these patches, it seems this is a lot of
> complexity, with a (for me) comensurate benefit. There's a lot of more
> general
> improvements to toasting and the json type that we can do, that are a lot
> less
> complex than this.
>
>
> > From 6b35d6091248e120d2361cf0a806dbfb161421cf Mon Sep 17 00:00:00 2001
> > From: Nikita Malakhov <n(dot)malakhov(at)postgrespro(dot)ru>
> > Date: Tue, 12 Apr 2022 18:37:21 +0300
> > Subject: [PATCH] Pluggable TOAST API interface with dummy_toaster contrib
> > module
> >
> > Pluggable TOAST API is introduced with implemented contrib example
> > module.
> > Pluggable TOAST API consists of 4 parts:
> > 1) SQL syntax supports manipulations with toasters - CREATE TABLE ...
> > (column type STORAGE storage_type TOASTER toaster), ALTER TABLE ALTER
> > COLUMN column SET TOASTER toaster and Toaster definition.
> > TOAST API requires earlier patch with CREATE TABLE SET STORAGE clause;
> > New column atttoaster is added to pg_attribute.
> > Toaster drop is not allowed for not to lose already toasted data;
> > 2) New VARATT_CUSTOM data structure with fixed header and variable
> > tail to store custom toasted data, with according macros set;
>
> That's adding overhead to every toast interaction, independent of any new
> infrastructure being used.
>
>
>
> > 4) Dummy toaster implemented via new TOAST API to be used as sample.
> > In this patch regular (default) TOAST function is left as-is and not
> > yet implemented via new API.
> > TOAST API syntax and code explanation provided in additional docs patch.
>
> I'd make this a separate commit.
>
>
>
> > @@ -445,6 +447,8 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc
> tupdesc2)
> > return false;
> > if (attr1->attstorage != attr2->attstorage)
> > return false;
> > + if (attr1->atttoaster != attr2->atttoaster)
> > + return false;
>
> So we're increasing pg_attribute - often already the largest catalog table
> in
> a database.
>
> Am I just missing something, or is atttoaster not actually used in this
> patch?
> So most of the contrib module added is unreachable code?
>
>
> > +/*
> > + * Toasters is very often called so syscache lookup and TsrRoutine
> allocation are
> > + * expensive and we need to cache them.
>
> Ugh.
>
> > + * We believe what there are only a few toasters and there is high
> chance that
> > + * only one or only two of them are heavy used, so most used toasters
> should be
> > + * found as easy as possible. So, let us use a simple list, in future
> it could
> > + * be changed to other structure. For now it will be stored in
> TopCacheContext
> > + * and never destroed in backend life cycle - toasters are never
> deleted.
> > + */
>
> That seems not great.
>
>
> > +typedef struct ToasterCacheEntry
> > +{
> > + Oid toasterOid;
> > + TsrRoutine *routine;
> > +} ToasterCacheEntry;
> > +
> > +static List *ToasterCache = NIL;
> > +
> > +/*
> > + * SearchTsrCache - get cached toaster routine, emits an error if
> toaster
> > + * doesn't exist
> > + */
> > +TsrRoutine*
> > +SearchTsrCache(Oid toasterOid)
> > +{
> > + ListCell *lc;
> > + ToasterCacheEntry *entry;
> > + MemoryContext ctx;
> > +
> > + if (list_length(ToasterCache) > 0)
> > + {
> > + /* fast path */
> > + entry = (ToasterCacheEntry*)linitial(ToasterCache);
> > + if (entry->toasterOid == toasterOid)
> > + return entry->routine;
> > + }
> > +
> > + /* didn't find in first position */
> > + ctx = MemoryContextSwitchTo(CacheMemoryContext);
> > +
> > + for_each_from(lc, ToasterCache, 0)
> > + {
> > + entry = (ToasterCacheEntry*)lfirst(lc);
> > +
> > + if (entry->toasterOid == toasterOid)
> > + {
> > + /* remove entry from list, it will be added in a
> head of list below */
> > + foreach_delete_current(ToasterCache, lc);
>
> That needs to move later list elements!
>
>
> > + goto out;
> > + }
> > + }
> > +
> > + /* did not find entry, make a new one */
> > + entry = palloc(sizeof(*entry));
> > +
> > + entry->toasterOid = toasterOid;
> > + entry->routine = GetTsrRoutineByOid(toasterOid, false);
> > +
> > +out:
> > + ToasterCache = lcons(entry, ToasterCache);
>
> That move the whole list around! On a cache hit. Tthis would likely
> already be
> slower than syscache.
>
>
> > diff --git a/contrib/dummy_toaster/dummy_toaster.c
> b/contrib/dummy_toaster/dummy_toaster.c
> > index 0d261f6042..02f49052b7 100644
> > --- a/contrib/dummy_toaster/dummy_toaster.c
> > +++ b/contrib/dummy_toaster/dummy_toaster.c
>
> So this is just changing around the code added in the prior commit. Why
> was it
> then included before?
>
>
> > +++ b/src/include/access/generic_toaster.h
>
> > +HeapTuple
> > +heap_toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple
> oldtup,
> > + int options);
>
> The generic toast API has heap_* in its name?
>
>
>
>
> > From 4112cd70b05dda39020d576050a98ca3cdcf2860 Mon Sep 17 00:00:00 2001
> > From: Nikita Malakhov <n(dot)malakhov(at)postgrespro(dot)ru>
> > Date: Tue, 12 Apr 2022 22:57:21 +0300
> > Subject: [PATCH] Versioned rows in TOASTed values for Default Toaster
> support
> >
> > Original TOAST mechanics does not support rows versioning
> > for TOASTed values.
> > Toaster snapshot - refactored generic toaster, implements
> > rows versions check in toasted values to share common parts
> > of toasted values between different versions of rows.
>
> This misses explaining *WHY* this is changed.
>
>
>
> > diff --git a/src/backend/access/common/detoast.c
> b/src/backend/access/common/detoast.c
> > deleted file mode 100644
> > index aff8042166..0000000000
> > --- a/src/backend/access/common/detoast.c
> > +++ /dev/null
>
> These patches really move things around in a largely random way.
>
>
> > -static bool toastrel_valueid_exists(Relation toastrel, Oid valueid);
> > -static bool toastid_valueid_exists(Oid toastrelid, Oid valueid);
> > -
> > +static void
> > +toast_extract_chunk_fields(Relation toastrel, TupleDesc toasttupDesc,
> > + Oid valueid, HeapTuple
> ttup, int32 *seqno,
> > + char **chunkdata, int
> *chunksize);
> > +
> > +static void
> > +toast_write_slice(Relation toastrel, Relation *toastidxs,
> > + int num_indexes, int validIndex,
> > + Oid valueid, int32 value_size, int32
> slice_offset,
> > + int32 slice_length, char *slice_data,
> > + int options,
> > + void *chunk_header, int
> chunk_header_size,
> > + ToastChunkVisibilityCheck
> visibility_check,
> > + void *visibility_cxt);
>
> What do all these changes have to do with "Versioned rows in TOASTed
> values for Default Toaster support"?
>
>
> > +static void *
> > +toast_fetch_old_chunk(Relation toastrel, SysScanDesc toastscan, Oid
> valueid,
> > + int32 expected_chunk_seq, int32
> last_old_chunk_seq,
> > + ToastChunkVisibilityCheck
> visibility_check,
> > + void *visibility_cxt,
> > + int32 *p_old_chunk_size,
> ItemPointer old_tid)
> > +{
> > + for (;;)
> > + {
> > + HeapTuple old_toasttup;
> > + char *old_chunk_data;
> > + int32 old_chunk_seq;
> > + int32 old_chunk_data_size;
> > +
> > + old_toasttup = systable_getnext_ordered(toastscan,
> ForwardScanDirection);
> > +
> > + if (old_toasttup)
> > + {
> > + /* Skip aborted chunks */
> > + if
> (!HeapTupleHeaderXminCommitted(old_toasttup->t_data))
> > + {
> > + TransactionId xmin =
> HeapTupleHeaderGetXmin(old_toasttup->t_data);
> > +
> > +
> Assert(!HeapTupleHeaderXminInvalid(old_toasttup->t_data));
> > +
> > + if (TransactionIdDidAbort(xmin))
> > + continue;
> > + }
>
> Why is there visibility logic in quite random places? Also, it's not
> "legal"
> to call TransactionIdDidAbort() without having checked
> TransactionIdIsInProgress() first. And what does this this have to do with
> snapshots - it's pretty clearly not implementing snapshot logic.
>
>
> Greetings,
>
> Andres Freund
>

--
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/

Attachment Content-Type Size
v13-0002-toaster-default.patch.gz application/x-gzip 29.3 KB
v13-0001-toaster-interface.patch.gz application/x-gzip 48.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2022-08-04 20:47:07 Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Previous Message Robert Haas 2022-08-04 20:07:01 Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints