Re: Function to promote standby servers

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Function to promote standby servers
Date: 2018-10-22 09:04:34
Message-ID: CAD21AoCpOL7QEL=txF8GsF7=uvSwQjRHfZY+07X4_ddjPsgaeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 22, 2018 at 3:01 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Sat, Oct 20, 2018 at 01:48:56PM +0900, Michael Paquier wrote:
> > On Sat, Oct 20, 2018 at 06:24:28AM +0200, Laurenz Albe wrote:
> >> Here is another version, with a fix in pg_proc.dat, an improved comment
> >> and "wait_seconds" exercised in the regression test.
> >
> > Thanks for the new version. This looks pretty good to me. I'll see if
> > I can review it once and then commit.
> >
> >> - WAIT_EVENT_SYNC_REP
> >> + WAIT_EVENT_SYNC_REP,
> >> + WAIT_EVENT_PROMOTE
> >> } WaitEventIPC;
> >
> > Those are kept in alphabetical order. Individual wait events are also
> > documented with a description.
>
> Regarding the documentation, wouldn't it be more adapted to list the new
> function under the section "Recovery Control Functions"? Not only does
> the new function signal the postmaster, but it also creates the
> promotion trigger file.
>
> Anyway, I have been looking in depth at the patch, and I finish with the
> attached. Here are some notes:
> - pg_ctl returns an error if it cannot create the promote trigger file
> and if it doesn't close it. pg_promote should do the same.
> - WL_TIMEOUT could have been reached, leading to no further retries
> (remove for example the part related to the trigger file creation and
> try to trigger the promotion, the wait time is incorrect).
> - Documentation has been reworked.
> - The new wait event is documented.
> - a WARNING is returned if the signal to the postmaster is not sent,
> which is something I think makes most sense as we do the same for other
> signals.
> - position including unistd.h was not correct in xlogfuncs.c.
> - Timeouts for the tests are made a tad longer. Some buildfarm machines
> don't like a promotion wait of 10s.
> - a catalog version bump is included, just a personal reminder..
> - Indentatio has been run.
>

Thank you for workig on this. There is one review comment for the latest patch.

+ if (FreeFile(promote_file))
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not write file \"%s\": %m",
+ PROMOTE_SIGNAL_FILE)));

Maybe we should unlink PROMOTE_SIGNAL_FILE before erroring.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Adelino Silva 2018-10-22 09:06:12 Re: WAL archive (archive_mode = always) ?
Previous Message Kyotaro HORIGUCHI 2018-10-22 08:12:54 Re: View to get all the extension control file details