Re: when set track_commit_timestamp on, database system abort startup

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: 李海龙 <hailong(dot)li(at)qunar(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: when set track_commit_timestamp on, database system abort startup
Date: 2018-09-15 10:26:39
Message-ID: CAD21AoAxSNorp3TjvJhrOAk+8q5yshSnW-n8buwz4bdU7qOtPA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Sep 15, 2018 at 12:29 AM, Alvaro Herrera
<alvherre(at)2ndquadrant(dot)com> wrote:
> On 2018-Sep-15, Masahiko Sawada wrote:
>
>> On Fri, Sep 14, 2018 at 4:27 PM, 李海龙 <hailong(dot)li(at)qunar(dot)com> wrote:
>
>> > When I enable the parameter track_commit_timestamp in postgresql.conf of a
>> > Base Backup (making a Base Backup from a standby and the
>> > track_commit_timestamp is off on it),
>>
>> In addition to the above operation, I've reproduced this issue by
>> replaying a commit WAL record that sets the timestamp to a new page
>> during the crash recovery (or from restart).
>>
>> It seems to me that the cause of this is that we could not extend
>> commitTs page since the COMMIT_TS_ZEROPAGE WAL wasn't generated at the
>> standby server whose track_commit_timestamp is off. So during
>> replaying the commit WAL record the startup process fails since the
>> corresponding commitTs page doesn't exist.
>
> Hmm, wow. I wonder if it's possible to detect the config difference
> early enough that the zeropage WAL records are emitted, instead. But
> even this might not work, since some transactions need to have their
> commitTS in pages that will not have been zeroed anyway, because the
> page threshold was crossed in the old primary.
>
>> To fix that maybe we can disable commitTs if
>> controlFile->track_commit_timestamp == false and the
>> track_commit_timestamp == true even in crash recovery.
>
> Hmm, so keep it off while crash recovery runs, and once it's out of that
> then enable it automatically?

Yes. The attached patch changes it to check
controlFile->track_commit_timestamp even in the crash recovery case. If
track_commit_timestamp is set to true in the config file, commitTs is
enabled at the end of recovery.

> That might work -- by definition we don't
> care about the commit TSs of the transaction replayed during crash
> recovery, since they were executed in the primary that didn't have
> commitTS enable anyway.
>
> It seems like the first thing we need is TAP cases that reproduce these
> two crash scenarios.

I've attached a TAP test that reproduces this issue. It can be reproduced
even with a single server: consume some transactions, enable
track_commit_timestamp, and then make postgres replay a commit WAL record
during crash recovery.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
fix_committs_recovery.patch text/x-patch 838 bytes
add_committs_recovery_test.patch text/x-patch 1.5 KB
