From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | Stephen Frost <sfrost(at)snowman(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_stop_backup(wait_for_archive := true) on standby server |
Date: | 2017-07-11 00:28:40 |
Message-ID: | CAD21AoCAYwM_zSZdUQ59BjhPGJFSZxA0srAEgi9C=Hv=yjAB_A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jul 10, 2017 at 11:56 AM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Sat, Jul 8, 2017 at 12:50 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> On Fri, Jul 7, 2017 at 3:48 PM, Michael Paquier
>> <michael(dot)paquier(at)gmail(dot)com> wrote:
>>> So I would suggest the following things to address this issue:
>>> 1) Generate a backup history file for backups taken at recovery as well.
>>> 2) Archive it if archive_mode = always.
>>> 3) Wait for both the segment of the stop point and the backup history
>>> files to be archived before returning back to the caller of
>>> pg_stop_backup.
>>>
>>> It would be nice to get all that addressed in 10. Thoughts?
>>
>> Yeah, I agree. Attached patch makes it works and deals with the
>> history file issue.
>
> I had a look at the proposed patch. Here are some comments.
Thank you for reviewing the patch!
>
> @@ -11002,10 +11000,10 @@ do_pg_stop_backup(char *labelfile, bool
> waitforarchive, TimeLineID *stoptli_p)
> if (waitforarchive && XLogArchivingActive())
> {
> XLByteToPrevSeg(stoppoint, _logSegNo);
> - XLogFileName(lastxlogfilename, ThisTimeLineID, _logSegNo);
> + XLogFileName(lastxlogfilename, stoptli, _logSegNo);
>
> On a standby the wait phase should not happen if archive_mode = on,
> but only if archive_mode = always. So I would suggest to change this
> upper condition a bit, and shuffle a bit the code to make the wait
> phase happen last:
> 1) stoptli_p first.
> 2) Check for XLogArchivingActive or XLogArchivingAlways, then with the
> NOTICE message.
> 3) Do the actual wait.
Thank you for the suggestion. Fixed.
> This way the code doing the wait does not need to be in its long
> lengthy if() branch. I think that we should replace the pg_usleep()
> call with a latch to make this more responsive. That should be a
> future patch.
Yeah, I agree.
> In backup history files generated in standbys, the STOP TIME is not
> set and this results in garbage in the file.
Fixed.
> + If second parameter is true and on standby, <function>pg_stop_backup</>
> + waits for WAL to be archived without forcibly switching WAL on standby.
> + So enforcing manually a WAL switch on primary needs to happen.
> Here is a reformulation:
> If the second parameter wait_for_archive is true and the backup is
> taken on a standby, pg_stop_backup waits for WAL to be archived when
> archive_mode = always. Enforcing manually a WAL segment switch to
> happen with for example pg_switch_wal() may be necessary if the
> primary has low activity to allow the backup to complete. Using
> statement_timeout to limit the amount of time to wait or switching
> wait_for_archive to false will control the wait time, though all the
> WAL segments necessary to recover into a consistent state from the
> backup taken may not be archived at the time pg_stop_backup returns
> its status to the caller.
Thanks, fixed.
> The errhint() for the wait phase should be reworked for a standby. I
> would suggest for errmsg() the same thing, aka:
> pg_stop_backup still waiting for all required WAL segments to be
> archived (%d seconds elapsed)
> But the following for a backup started in recovery. That's long but
> things need to be really clear to the user:
> Backup has been taken from a standby, check if the WAL segments needed
> for this backup have been completed, in which case a forced segment
> switch may can be needed on the primary. If a segment switch has
> already happened check that your archive_command is executing
> properly. pg_stop_backup can be canceled safely, but the database
> backup will not be usable without all the WAL segments.
Fixed.
Attached updated version patch. Please review it.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
pg_stop_backup_on_standby_v3.patch | application/octet-stream | 7.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2017-07-11 00:59:00 | Re: New partitioning - some feedback |
Previous Message | Masahiko Sawada | 2017-07-11 00:21:13 | Re: pgsql: Allow multiple hostaddrs to go with multiple hostnames. |