Re: [proposal] de-TOAST'ing using a iterator

From: John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Binguo Bao <djydewang(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Владимир Лесков <vladimirlesk(at)yandex-team(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [proposal] de-TOAST'ing using a iterator
Date: 2020-03-25 10:04:53
Message-ID: CACPNZCvJszzRvQ-no9dhbqg2JprURMZ58ipjHiQX4XZGeRJD6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Mar 13, 2020 at 10:19 PM Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
>
> So let's move forward with v10 (submitted on Sept 10th).

In the attached v12, based on v10, I've made some progress to address
some of the remaining issues. There's still some work to be done, in
particular to think about how to hide the struct details better, as
mentioned by you and Tomas back in September, but wanted to put this
much out there to keep things moving.

> Rather than cross-including header files, it seems better
> to expose some struct definitions after all and let the main iterator
> interface (detoast_iterate) be a "static inline" function in detoast.h.

The cross-include is gone, and detoast_iterate is now static inline.

> Looking at that version, I don't think the function protos that were put
> in toast_internals.h should be there at all; I think they should be in
> detoast.h so that they can be used.

Done.

> But I don't like the fact that
> detoast.h now has to include genam.h; that seems pretty bogus. I think
> this can be fixed by moving the FetchDatumIteratorData struct definition
> (but not its typedef) to toast_internals.h.

I took a stab at this, but I ended up playing whack-a-mole with
compiler warnings. I'll have to step back and try again later.

> OTOH we've recently seen the TOAST interface (and header files) heavily
> reworked because of table-AM considerations, so probably this needs even
> more changes to avoid parts of it becoming heapam-dependant again.

Haven't thought about this.

> create_toast_buffer() doing two pallocs seems a waste. It could be a
> single one,
> + buf = (ToastBuffer *) palloc0(MAXALIGN(sizeof(ToastBuffer)) + size);
> + buf->buf = buf + MAXALIGN(sizeof(ToastBuffer));
> (I'm not sure that the MAXALIGNs are strictly necessary there; I think
> we access the buf as four-byte aligned stuff somewhere in the toast
> innards, but maybe I'm wrong about that.)

I tried this briefly and got backend crashes, and didn't try to analyze further.

In addition, I brought in the memcpy() and comment changes in
c60e520f6e from common/pg_lzcompress.c to pglz_decompress_iterate(). I
also made a typo correction in the former, which could be extracted
into a separate patch if this one is not ready in time.

For this comment back in [1]:

> This arrangement where there are two ToastBuffers and they sometimes are
> the same is cute, but I think we need a better way to know when each
> needs to be freed afterwards; the proposed coding is confusing. And it
> certainly it needs more than zero comments about what's going on there.

One idea is to test if the pointers are equal via a macro, rather than
setting and testing a member bool var. And I agree commentary could be
improved in this area.

[1] https://www.postgresql.org/message-id/20190903201226.GA16197%40alvherre.pgsql

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
v12-detoasting-using-an-iterator.patch application/octet-stream 21.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2020-03-25 10:12:29 Re: error context for vacuum to include block number
Previous Message Dean Rasheed 2020-03-25 08:57:31 Re: Some improvements to numeric sqrt() and ln()