Re: Problem while setting the fpw with SIGHUP

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, hlinnaka <hlinnaka(at)iki(dot)fi>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Problem while setting the fpw with SIGHUP
Date: 2018-07-28 13:35:22
Message-ID: CAA4eK1JvKxsHfM6GCHoKNas-7vDSniLgaAm=cG_OaQaOYRNc3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 20, 2018 at 6:06 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Fri, Apr 20, 2018 at 11:40 AM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> By the way, I think I found a bug of FPW.
>>
>> The following steps yields INSERT record that doesn't have a FPI
>> after a checkpoint.
>>
>> (Start server with full_page_writes = off)
>> CREATE TABLE t (a int);
>> CHECKPOINT;
>> INSERT INTO t VALUES (1);
>> ALTER SYSTEM SET full_page_writes TO on;
>> SELECT pg_reload_conf();
>> CHECKPOINT;
>> INSERT INTO t VALUES (1);
>>
>> The last insert is expected to write a record with FPI but it
>> doesn't actually. No FPI will be written for the page after that.
>>
>> It seems that the reason is that XLogInsertRecord is forgetting
>> to check doPageWrites' update.
>>
>> In the failure case, fpw_lsn has been set by XLogRecordAssemble
>> but doPageWrite is false at the time and it considers that no FPI
>> is required then it sets fpw_lsn to InvalidXLogRecPtr.
>>
>> After that, XLogInsertRecord receives the record but it thinks
>> that doPageWrites is true since it looks the shared value.
>>
>>> if (fpw_lsn != InvalidXLogRecPtr && fpw_lsn <= RedoRecPtr && doPageWrites)
>>
>> So this line thinks that "no FPI is omitted in this record" but
>> actually the record is just forgotten to attach them.
>>
>> The attached patch lets XLogInsertRecord check if doPageWrites
>> has been turned on after the last call and cause recomputation in
>> the case.
>>
>
> I think you have correctly spotted the problem and your fix looks good
> to me. As this is a separate problem and fix is different from what
> we are discussing here, I think this can be committed it separately.
>

AFAICS, this problem has been introduced by commit
2c03216d831160bedd72d45f712601b6f7d03f1c, so we should backpatch till
9.5. Please find attached the slightly modified patch for this bug.
I have modified one of the comments in the patch and the proposed
commit message. Can you please once cross-verify if the problem exits
till 9.5 and that this patch fixes it? Also, I don't see an easy way
to write a test for it, do you have anything in mind?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
Correctly-attach-FPI-to-the-first-record.2.patch application/octet-stream 3.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2018-07-28 13:40:24 Re: Problem while setting the fpw with SIGHUP
Previous Message Quan Tran 2018-07-28 09:31:05 Re: Segfault logical replication PG 10.4