Re: when set track_commit_timestamp on, database system abort startup

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: 李海龙 <hailong(dot)li(at)qunar(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: when set track_commit_timestamp on, database system abort startup
Date: 2018-09-14 15:29:38
Message-ID: 20180914152938.yqgztwotlu4ogxsk@alvherre.pgsql
Lists: pgsql-hackers

On 2018-Sep-15, Masahiko Sawada wrote:

> On Fri, Sep 14, 2018 at 4:27 PM, 李海龙 <hailong(dot)li(at)qunar(dot)com> wrote:

> > When I enable the parameter track_commit_timestamp in postgresql.conf of a
> > Base Backup (making a Base Backup from a standby and the
> > track_commit_timestamp is off on it),
>
> In addition to the above operation, I've reproduced this issue by
> replaying a commit WAL record that sets the timestamp to a new page
> during the crash recovery (or from restart).
>
> It seems to me that the cause of this is that we could not extend
> commitTs page since the COMMIT_TS_ZEROPAGE WAL wasn't generated at the
> standby server whose track_commit_timestamp is off. So during
> replaying the commit WAL record the startup process fails since the
> corresponding commitTs page doesn't exist.

Hmm, wow. I wonder if it's possible to detect the config difference
early enough that the zeropage WAL records are emitted, instead. But
even this might not work, since some transactions need to have their
commitTS in pages that will not have been zeroed anyway, because the
page threshold was crossed in the old primary.
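
A rough sketch of the page arithmetic behind that concern, written as a small Python model rather than PostgreSQL's own C code. It assumes the default 8 kB block size and a 10-byte commit-timestamp entry (8-byte timestamp plus 2-byte origin id), ignores XID wraparound and the special-casing of the first normal XID, and uses helper names that are hypothetical:

```python
# Simplified model of how a transaction ID maps to a commit-timestamp page.
# Assumptions (not taken from the thread): 8 kB blocks, 10-byte entries.
BLCKSZ = 8192
ENTRY_SIZE = 10
XACTS_PER_PAGE = BLCKSZ // ENTRY_SIZE  # 819 entries per page


def commit_ts_page(xid: int) -> int:
    """Page that would hold the commit timestamp of xid."""
    return xid // XACTS_PER_PAGE


def starts_new_page(xid: int) -> bool:
    """True when xid is the first entry on its page, i.e. the point
    at which a zeropage record would be emitted for that page."""
    return xid % XACTS_PER_PAGE == 0


def first_page_zeroed_after(enable_xid: int) -> int:
    """First page that gets zeroed if tracking only starts at enable_xid.
    Any earlier page, including the one enable_xid itself lands on when
    it is mid-page, had its threshold crossed before tracking began."""
    page = commit_ts_page(enable_xid)
    return page if starts_new_page(enable_xid) else page + 1
```

Under these assumptions, enabling tracking at XID 1000 puts that transaction on page 1, but the first page that would be zeroed afterwards is page 2, so the entry for XID 1000 has nowhere to go.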

> To fix that maybe we can disable commitTs if
> controlFile->track_commit_timestamp == false and the
> track_commit_timestamp == true even in crash recovery.

Hmm, so keep it off while crash recovery runs, and once it's out of that
then enable it automatically? That might work -- by definition we don't
care about the commit TSs of the transactions replayed during crash
recovery, since they were executed in the primary that didn't have
commitTS enabled anyway.
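
A minimal sketch of that rule as a Python truth function, only to make the proposed behaviour concrete. The function and parameter names are hypothetical and do not correspond to anything in the PostgreSQL source:

```python
def commit_ts_active(control_file_setting: bool,
                     guc_setting: bool,
                     in_crash_recovery: bool) -> bool:
    """Model of the proposal: whether commit timestamps are recorded.

    control_file_setting: value stored in the control file, i.e. what was
        in effect when the WAL being replayed was generated.
    guc_setting: track_commit_timestamp as currently configured.
    in_crash_recovery: True while the startup process is replaying WAL.
    """
    # WAL written with tracking off carries no zeropage records, so
    # keep the module off until replay has finished.
    if in_crash_recovery and not control_file_setting:
        return False
    # Otherwise follow the configured value; this is where tracking
    # would be switched on automatically once recovery is over.
    return guc_setting
```

The case reported in this thread is `commit_ts_active(False, True, True)`, which stays off during replay and turns on afterwards.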

It seems like the first thing we need is TAP cases that reproduce these
two crash scenarios.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
