Re: BUG #14230: Wrong timeline returned by pg_stop_backup on a standby

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, francesco(dot)canovai(at)2ndquadrant(dot)it, Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>
Subject: Re: BUG #14230: Wrong timeline returned by pg_stop_backup on a standby
Date: 2016-07-09 16:54:39
Message-ID: CABUevEwpjzSuJ2zQpRBE=UgXm3PcoUA7mdGffSPw-MQ-KRADHg@mail.gmail.com
Lists: pgsql-bugs pgsql-hackers

On Jul 9, 2016 4:52 AM, "Noah Misch" <noah(at)leadboat(dot)com> wrote:
>
> On Thu, Jul 07, 2016 at 03:38:26PM +0900, Michael Paquier wrote:
> > On Thu, Jul 7, 2016 at 12:57 AM, Marco Nenciarini
> > <marco(dot)nenciarini(at)2ndquadrant(dot)it> wrote:
> > > After further analysis, the issue is that we retrieve the starttli from
> > > the ControlFile structure, but it was using ThisTimeLineID when writing
> > > the backup label.
> > >
> > > I've attached a very simple patch that fixes it.
> >
> > ThisTimeLineID is always set at 0 on purpose on a standby, so we
> > cannot rely on it (well it is set temporarily when recycling old
> > segments). At recovery when parsing the backup_label file there is no
> > actual use of the start segment name, so that's only a cosmetic
> > change. But surely it would be better to get that fixed, because
> > that's useful for debugging.
> >
> > While looking at your patch, I thought that it would have been
> > tempting to use GetXLogReplayRecPtr() to get the timeline ID when in
> > recovery, but what we really want to know here is the timeline of the
> > last REDO pointer, which is starttli, and that's more consistent with
> > the fact that we use startpoint when writing the backup_label file. In
> > short, +1 for this fix.
> >
> > I am adding that in the list of open items, adding Magnus in CC whose
> > commit for non-exclusive backups is at the origin of this defect.
>
> [Action required within 72 hours. This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 9.6 open item. Magnus,
> since you committed the patch believed to have created it, you own this open
> item. If some other commit is more relevant or if this does not belong as a
> 9.6 open item, please let us know. Otherwise, please observe the policy on
> open item ownership[1] and send a status update within 72 hours of this
> message. Include a date for your subsequent status update. Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping 9.6rc1. Consequently, I will appreciate your
> efforts toward speedy resolution. Thanks.
>
> [1] http://www.postgresql.org/message-id/20160527025039.GA447393@tornado.leadboat.com

I'll take a look at this on Monday when I'm back home from Russia. It looks
like people have it under control, so hopefully that just means committing
the available solution, in which case it'll be finished by then.

/Magnus
