Re: [proposal] de-TOAST'ing using a iterator

From: Binguo Bao <djydewang(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Владимир Лесков <vladimirlesk(at)yandex-team(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [proposal] de-TOAST'ing using a iterator
Date: 2019-09-23 13:55:24
Message-ID: CAL-OGktjn9wHLHhEem58qJXMBUAVWa8-VrDoCmceAxXGTUM_BA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> 于2019年9月17日周二 上午5:51写道:

> On 2019-Sep-10, Binguo Bao wrote:
>
> > +/*
> > + * Support for de-TOASTing toasted value iteratively. "need" is a
> pointer
> > + * between the beginning and end of iterator's ToastBuffer. The marco
> > + * de-TOAST all bytes before "need" into iterator's ToastBuffer.
> > + */
> > +#define PG_DETOAST_ITERATE(iter, need)
> \
> > + do {
>
> \
> > + Assert((need) >= (iter)->buf->buf && (need) <=
> (iter)->buf->capacity); \
> > + while (!(iter)->done && (need) >= (iter)->buf->limit) {
> \
> > + detoast_iterate(iter);
> \
> > + }
>
> \
> > + } while (0)
> > /* WARNING -- unaligned pointer */
> > #define PG_DETOAST_DATUM_PACKED(datum) \
> > pg_detoast_datum_packed((struct varlena *) DatumGetPointer(datum))
>
> In broad terms this patch looks pretty good to me. I only have a small
> quibble with this API definition in fmgr.h -- namely that it forces us
> to export the definition of all the structs (that could otherwise be
> private to toast_internals.h) in order to satisfy callers of this macro.
> I am wondering if it would be possible to change detoast_iterate and
> PG_DETOAST_ITERATE in a way that those details remain hidden -- I am
> thinking something like "if this returns NULL, then iteration has
> finished"; and relieve the macro from doing the "->buf->buf" and
> "->buf->limit" checks. I think changing that would require a change in
> how the rest of the code is structured around this (the textpos internal
> function), but seems like it would be better overall.
>
> (AFAICS that would enable us to expose much less about the
> iterator-related structs to detoast.h -- you should be able to move the
> struct defs to toast_internals.h)
>
> Then again, it might be just wishful thinking, but it seems worth
> considering at least.
>
> --
> Álvaro Herrera https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

I've tied to hide the details of the struct in patch v11 with checking
"need" pointer
inside detoast_iterate function. I also compared the performance of the two
versions.

patch v10 patch v11
comp. beg. 1413ms 1489ms
comp. end 24327ms 28011ms
uncomp. beg. 1439ms 1432ms
uncomp. end 25019ms 29007ms

We can see that v11 is about 15% slower than v10 on suffix queries since
this involves
the complete de-TOASTing and detoast_iterate() function is called
frequently in v11.

Personally, I prefer patch v10. Its performance is superior, although it
exposes some struct details.

--
Best regards,
Binguo Bao

Attachment Content-Type Size
0001-de-TOASTing-using-a-iterator-11.patch text/x-patch 21.6 KB
init-test.sh application/x-shellscript 704 bytes
iterator-test.sh application/x-shellscript 649 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-09-23 15:12:13 Re: Commit fest 2019-09
Previous Message John Bester 2019-09-23 13:43:34 Proposal: Better query optimization for "NOT IN" clause