Re: Write Ahead Logging for Hash Indexes

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Write Ahead Logging for Hash Indexes
Date: 2016-09-25 12:00:55
Message-ID: CAE9k0PmWnbYaqT02s1Y7axiGMh0C=HQdzFFy=D-pOW16Gwo1UQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi All,

> I forgot to mention that Ashutosh has tested this patch for a day
> using Jeff's tool and he didn't found any problem. Also, he has found
> a way to easily reproduce the problem. Ashutosh, can you share your
> changes to the script using which you have reproduce the problem?

I made slight changes in Jeff Janes patch (crash_REL10.patch) to
reproduce the issue reported by him earlier [1]. I changed the
crash_REL10.patch such that a torn page write happens only for a
bitmap page in hash index. As described by Amit in [2] & [3] we were
not logging full page image for bitmap page and that's the reason we
were not be able to restore bitmap page in case it is a torn page
which would eventually result in below error.

38422 01000 2016-09-19 16:25:50.061 PDT:WARNING: page verification
failed, calculated checksum 65067 but expected 21260
38422 01000 2016-09-19 16:25:50.061 PDT:CONTEXT: xlog redo at
3F/22053B50 for Hash/ADD_OVFL_PAGE: bmsize 4096, bmpage_found T
38422 XX001 2016-09-19 16:25:50.071 PDT:FATAL: invalid page in block
9 of relation base/16384/17334

Jeff Jane's original patch had following conditions for a torn page write:

if (JJ_torn_page > 0 && counter++ > JJ_torn_page && !RecoveryInProgress())
nbytes = FileWrite(v->mdfd_vfd, buffer, BLCKSZ/3);

Ashutosh has added some more condition's in above if( ) to ensure that
the issue associated with bitmap page gets reproduced easily:

if (JJ_torn_page > 0 && counter++ > JJ_torn_page &&
!RecoveryInProgress() && bucket_opaque->hasho_page_id == HASHO_PAGE_ID
&& bucket_opaque->hasho_flag & LH_BITMAP_PAGE)
nbytes = FileWrite(v->mdfd_vfd, buffer, 24);

Also, please note that i am allowing only 24 bytes of data i.e. just a
page header to be written into a disk file so that even if a single
overflow page is added into hash index table the issue will be
observed.

Note: You can find Jeff Jane's patch for torn page write at - [4].

[1] - https://www.postgresql.org/message-id/CAMkU%3D1zGcfNTWWxnud8j_p0vb1ENGcngkwqZgPUEnwZmAN%2BXQw%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAA4eK1LmQZGnYhSHXDDCOsSb_0U-gsxReEmSDRgCZr%3DAdKbTEg%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAA4eK1%2Bk9tGPw7vOACRo%2B4h5n%3DutHnSuGgMoJrLANybAjNBW9w%40mail.gmail.com
[4] - https://www.postgresql.org/message-id/CAMkU%3D1xRt8jBBB7g_7K41W00%3Dbr9UrxMVn_rhWhKPLaHfEdM5A%40mail.gmail.com

With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mithun Cy 2016-09-25 12:01:53 Re: Patch: Implement failover on libpq connect level.
Previous Message Mark Kirkwood 2016-09-25 06:09:35 Re: Hash Indexes