Re: Setting pd_lower in GIN metapage

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Setting pd_lower in GIN metapage
Date: 2017-06-20 07:53:14
Message-ID: fb626119-94bd-cbfa-fc6b-9ca260df714f@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/06/19 22:59, Amit Kapila wrote:
> On Mon, Jun 19, 2017 at 11:37 AM, Amit Langote
> <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> What are some arguments against setting pd_lower in the GIN metapage as
>> follows?
>>
>> GinMetaPageData *metad = GinPageGetMeta(page);
>>
>> ((PageHeader) page)->pd_lower =
>> ((char *) metad + sizeof(GinMetaPageData)) - (char *) page;
>>
>> I saw that _bt_initmetapage() does it, so was wondering why doesn't GIN.
>>
>
> Actually, hash index also has similar code (See _hash_init_metabuffer)
> and I see no harm in doing this at similar other places. This helps
> in compressing the hole in metapage during WAL writing. I think that
> in itself might not be an argument in favor of doing this because
> there is only one meta page for index and saving ~7K WAL is not huge
> but OTOH I don't see a reason for not doing it.

I agree. Will write a patch.

>> How about porting such a change to the back-branches if we do this at all?
>> I couldn't find any discussion in the archives about this. I read in
>> comments that server versions older than 9.4 didn't set pd_lower correctly
>> in any of GIN index pages, so relying on pd_lower value of GIN pages is
>> unreliable in general.
>>
>> The reason I'm asking is that a certain backup tool relies on pd_lower
>> values of data pages (disk blocks in relation files that are known to have
>> a valid PageHeaderData) to be correct to discard the portion of every page
>> that supposedly does not contain any useful information. The assumption
>> doesn't hold in the case of GIN metapage, so any GIN indexes contain
>> corrupted metapage after recovery (metadata overwritten with zeros).
>>
>
> Why can't you do a special check for metapage identification? It
> should be easy to add such a check. I have seen one of such tools
> (pg_filedump) has similar check to skip metapage in certain code
> paths.

I tried to add a metapage check, but getting such code to compile requires
it to include headers like gin_private.h (in PG versions < 10), which in
turn includes other headers that are forbidden to be included in what's
supposed to be FRONTEND code. (thread at [1] seems relevant in this
regard.) Another way to fix this might be to copy the GinMetaPageData
struct definition and a few other symbols into the tool's code and make
necessary checks using the same, instead of including gin_private.h.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/CA%2BTgmoZ%3DF%3DGkxV0YEv-A8tb%2BAEGy_Qa7GSiJ8deBKFATnzfEug%40mail.gmail.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-06-20 08:20:31 Re: Setting pd_lower in GIN metapage
Previous Message Amit Khandekar 2017-06-20 06:54:19 Re: UPDATE of partition key