Re: Setting pd_lower in GIN metapage

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Setting pd_lower in GIN metapage
Date: 2017-09-26 02:34:31
Message-ID: CAA4eK1JsUDe64Uiq_uigKKwu_pig=0KHQ12xS4Uxe4voPQuSNQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 25, 2017 at 12:18 PM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Hi.
>
> Trying to catch up.
>
> On 2017/09/25 13:43, Michael Paquier wrote:
>> On Sun, Sep 24, 2017 at 2:25 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> Added and updated the comments for both btree and hash index patches.
>>
>> I don't have real complaints about this patch, this looks fine to me.
>>
>> + * Currently, the advantage of setting pd_lower is in limited cases like
>> + * during wal_consistency_checking or while logging for unlogged relation
>> + * as for all other purposes, we initialize the metapage. Note, it also
>> + * helps in page masking by allowing to mask unused space.
>> I would have reworked this comment a bit, say like that:
>> Setting pd_lower is useful for two cases which make use of WAL
>> compressibility even if the meta page is initialized at replay:
>> - Logging of init forks for unlogged relations.
>> - wal_consistency_checking logs extra full-page writes, and this
>> allows masking of the unused space of the page.
>>
>> Now I often get complains that I suck at this exercise ;)
>
> So, I think I understand the discussion so far and the arguments about
> what we should write to explain why we're setting pd_lower to the correct
> value.
>
> Just to remind, I didn't actually start this thread [1] to address the
> issue that the FPWs of meta pages written to WAL are not getting
> compressed. An external backup tool relies on pd_lower to give the
> correct starting offset of the hole to compress, provided the page's other
> fields suggest it has the standard page layout. Since we discovered that
> pd_lower is not set correctly in gin, brin, and spgist meta pages, I
> created patches to do the same. You (Michael) pointed out that that
> actually helps compress their FPW images in WAL as well, so we began
> considering that. Also, you pointed out that WAL checking code masks
> pages based on the respective masking functions' assumptions about the
> page's layout properties, which the original patches forgot to consider.
> So, we updated the patches to change the respective masking functions to
> mask meta pages as pages with standard page layout, too.
>
> But as Tom pointed out [2], WAL compressibility benefit cannot be obtained
> unless we change how the meta page is passed to xlog.c to be written to
> WAL. So, we found out all the places that write/register the meta page to
> WAL and changed the respective XLogRegisterBuffer calls to include the
> REGBUG_STANDARD flag. Some of these calls already passed
> REGBUF_WILL_INIT, which would result in no FPW image to be written to the
> WAL and so there would be no question of compressibility. But, passing
> REGBUF_STANDARD would still help the case where WAL checking is enabled,
> because FPW image *is* written in that case.
>
> So, ISTM, comments that the patches add should all say that setting the
> meta pages' pd_lower to the correct value helps to pass those pages to
> xlog.c as compressible standard layout pages, regardless of whether they
> are actually passed that way. Even if the patches do take care of the
> latter as well.
>
> Did I miss something?
>
> Looking at Amit K's updated patches for btree and hash, it seems that he
> updated the comment to say that setting pd_lower to the correct value is
> *essential*, because those pages are passed as REGBUF_STANDARD pages and
> hence will be compressed. That seems to contradict what I wrote above,
> but I think that's not wrong, too.
>

I think you are missing that there are many cases where we use only
REGBUF_STANDARD for meta-pages (cf. hash index). For btree, where the
advantage is in fewer cases, I have explicitly stated those cases.

Do you still have confusion?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2017-09-26 02:38:17 Re: Built-in plugin for logical decoding output
Previous Message Michael Paquier 2017-09-26 02:14:15 Re: Shaky coding for vacuuming partitioned relations