Re: Reading timeline from pg_control on replication slave

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reading timeline from pg_control on replication slave
Date: 2017-10-28 09:25:16
Message-ID: F468EAFB-55B8-48D1-8006-1755786D3EDC@yandex-team.ru
Lists: pgsql-hackers

Hi, Michael!

Thank you very much for these comments!

> On 28 Oct 2017, at 3:09, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
>
> ThisTimeLineID is not something you can rely on for standby backends
> as it is not set during recovery. That's the reason behind
> pg_walfile_name disabled during recovery. There are three things
> popping on top of my mind that one could think about:
> 1) Backups cannot be completed when started on a standby in recovery
> and when stopped after the standby has been promoted, meaning that its
> timeline has changed.
> 2) After a standby has been promoted, by using pg_start_backup, you
> issue a checkpoint which makes sure that the control file gets flushed
> with the new information, so when pg_start_backup returns to the
> caller you should have the correct timeline number because the outer
> function gets evaluated last.
> 3) Backups taken from cascading standbys, where a direct parent has
> been promoted.
>
> 1) and 2) are actually not problems per the restrictions I am giving
> above, but 3) is. If I recall correctly, when a streaming standby does
> a timeline jump, a restart point is not immediately generated, so you
> could have the timeline on the control file not updated to the latest
> timeline value, meaning that you could have the WAL file name you use
> here referring to a previous timeline and not the newest one.
>
> In short, yes, what you are doing is definitely risky in my opinion,
> particularly for complex cascading setups.

We use the TimeLineID from pg_control only to name the backup. A slightly stale timeline ID will not cause significant problems as long as pg_control is picked up after backup finalization.

But from what you say, I see that the safest option is to check the timeline in pg_control both after backup start and after backup stop. If the two timelines differ, invalidate the backup entirely. That does not seem too strict a condition for invalidation, does it?
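
To make the proposed check concrete, here is a minimal sketch, not our actual implementation. It assumes the caller has captured the text output of pg_controldata once after backup start and once after backup stop, and that the output contains the usual "Latest checkpoint's TimeLineID:" line. The function names parse_timeline and backup_is_valid are hypothetical, chosen only for illustration.

```python
import re

# Matches the timeline line in pg_controldata output, e.g.
# "Latest checkpoint's TimeLineID:       3".
# It does not match the "PrevTimeLineID" line, which has a different label.
_TLI_RE = re.compile(r"^Latest checkpoint's TimeLineID:\s+(\d+)\s*$", re.MULTILINE)


def parse_timeline(controldata_output: str) -> int:
    """Extract the latest checkpoint's timeline ID from pg_controldata text."""
    match = _TLI_RE.search(controldata_output)
    if match is None:
        raise ValueError("no TimeLineID line found in pg_controldata output")
    return int(match.group(1))


def backup_is_valid(output_after_start: str, output_after_stop: str) -> bool:
    """Return True only if the timeline is the same at backup start and stop.

    Hypothetical helper: a False result means the backup should be
    invalidated entirely, as proposed above.
    """
    return parse_timeline(output_after_start) == parse_timeline(output_after_stop)
```

Note that this only detects a timeline change that has already reached pg_control. As you point out, on a cascading standby the control file may lag behind until the next restart point, so the check narrows the risk rather than removing it.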

Best regards, Andrey Borodin.
