Re: Updated backup APIs for non-exclusive backups

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: David Steele <david(at)pgmasters(dot)net>
Cc: Marco Nenciarini <marco(dot)nenciarini(at)2ndquadrant(dot)it>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Updated backup APIs for non-exclusive backups
Date: 2016-03-29 18:09:41
Message-ID: CABUevEypV2P3GnG-K7y+oYm65OJ1dyTBGL2vKo57Y8wb6Lw53g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 29, 2016 at 6:40 PM, Magnus Hagander <magnus(at)hagander(dot)net>
wrote:

>
>
> On Tue, Mar 29, 2016 at 4:36 PM, David Steele <david(at)pgmasters(dot)net> wrote:
>
>> On 3/22/16 12:31 PM, Magnus Hagander wrote:
>>
>> On Tue, Mar 22, 2016 at 5:27 PM, David Steele <david(at)pgmasters(dot)net
>>> <mailto:david(at)pgmasters(dot)net>> wrote:
>>>
>> >
>>
>>> > Adding the stop time column should be a simple addition and I
>>> don't see
>>> > a problem with that. I think I misunderstood your original request
>>> on
>>> > that. Because you are just talking about returning a timestamptz
>>> with
>>> > the "right now" value for when you called pg_stop_backup()? Or to
>>> be
>>> > specific, just before pg_Stop_backup *finished*. Or do you mean
>>> when
>>> > pg_stop_backup() started?
>>>
>>> What would be ideal is the minimum time that could be used for
>>> PITR. In
>>> an exclusive backup that's the time the end-of-backup record is
>>> written
>>> to WAL. In a non-exlusive backup I'm not quite sure how that works.
>>>
>>
>> I guess I was hoping that you would know. I fine with just getting the
>> current timestamp as is currently done in do_pg_stop_backup(). It's not
>> perfect but it will be pretty close.
>>
>> I thought some more about putting STOP_WAL_LOCATION into the backup label
>> and I think this is an important step. Without that the recovery process
>> needs to use pg_control to determine when the database has reach
>> consistency and that will only work if pg_control was copied last.
>>
>> In summary, I think the patch should be updated to include stop_time as a
>> column and add STOP_WAL_LOCATION and STOP_TIME to backup_label. The
>> recovery process should read STOP_WAL_LOCATION from backup_label and use
>> that to decide when the database has become consistent.
>>
>
> Is that the only reason we need pg_control to be copied last? I thought
> there were other reasons for that.
>
> I was chatting with Stephen about this earlier, and one thing we
> considered was that maybe we should return pg_control as a bytea field from
> pg_stop_backup() thereby strongly hinting to tools that they should write
> it out from there rather than copy it from the data directory (we can't
> stop them from copying it from the data directory like we do for
> backup_label of course, but if we return the contents and document that's
> how it should be done, it's pretty clear).
>
> But if we can actually remove the whole requirement on the order of the
> copy of pg_control by doing that it certainly seems worthwhile.
>
>
> So - I can definitely see the argument for returning the stop wal
> *location*. But I'm still not sure what the definition of the time would
> be? We can't return it before we know what it means...
>

I had a chat with Heikki, and here's another suggestion:

1. We don't touch the current exclusive backups at all, as previously
discussed, other than deprecating their use. For backwards compat.

2. For new backups, we return the contents of pg_control as a bytea from
pg_stop_backup(). We tell backup programs they are supposed to write this
out as pg_control.backup, *not* as pg_control.

3a. On recovery, if it's an exlcusive backup, we do as we did before.

3b. on recovery, in non-exclusive backups (determined from backup_label),
we check that pg_control.backup exists *and* that pg_control does *not*
exist. That guards us reasonably against backup programs that do the wrong
thing, and we know we get the correct version of pg_control.

4. (we can still add the stop location to the backup_label file in case
backup programs find it useful, but we don't use it in recovery)

Thoughts about this approach?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-03-29 18:09:42 Re: Move PinBuffer and UnpinBuffer to atomics
Previous Message Robert Haas 2016-03-29 18:09:17 Re: incorrect docs for pgbench / skipped transactions