Re: A WalSnd issue related to state WALSNDSTATE_STOPPING

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Paul Guo <pguo(at)pivotal(dot)io>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: A WalSnd issue related to state WALSNDSTATE_STOPPING
Date: 2018-11-22 05:29:44
Message-ID: 20181122052944.GI3369@paquier.xyz
Lists: pgsql-hackers

On Wed, Nov 21, 2018 at 04:09:41PM +0900, Michael Paquier wrote:
> The checkpointer initializes a shutdown checkpoint where it tells all
> the WAL senders to stop once all the child processes are gone, so it
> seems to me that there is little point in processing
> SyncRepReleaseWaiters() when a WAL sender is in WALSNDSTATE_STOPPING
> state, as at this stage syncrep makes little sense. It is still
> necessary to process standby messages at this stage so that the primary
> can shut down when it is sure that all the standbys have flushed the
> shutdown checkpoint record of the primary.

I just refreshed my memory with c6c33343, which is actually at the origin
of the issue, and my previous argument is flawed. Once a WAL sender has
reached WALSNDSTATE_STOPPING, no regular backends are around anymore, but
a WAL sender could still commit a transaction in parallel. That commit
may need to wait until its record is flushed and synced, so the waiters
still have to be released correctly. It is therefore necessary to patch
up SyncRepGetSyncStandbysPriority and SyncRepGetSyncStandbysQuorum as
mentioned upthread, and adding a comment where MyWalSnd->state is
checked looks appropriate. Paul, would you like to write a patch?
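
To make the idea concrete, here is a minimal standalone sketch of the kind
of state check I have in mind. It is not the actual syncrep.c code: the
DemoWalSnd struct and the demo_* functions are hypothetical, and the enum
is redefined locally so that the example compiles on its own. The only
point is that a WAL sender in WALSNDSTATE_STOPPING is still treated as a
candidate, the same way as one in WALSNDSTATE_STREAMING.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local copy of the WAL sender states, so the sketch stands alone. */
typedef enum
{
    WALSNDSTATE_STARTUP = 0,
    WALSNDSTATE_BACKUP,
    WALSNDSTATE_CATCHUP,
    WALSNDSTATE_STREAMING,
    WALSNDSTATE_STOPPING
} WalSndState;

/* Hypothetical, trimmed-down stand-in for a WAL sender slot. */
typedef struct
{
    int         pid;                    /* 0 means the slot is unused */
    WalSndState state;
    int         sync_standby_priority;  /* 0 means not a sync standby */
} DemoWalSnd;

/*
 * Decide whether a slot can be considered as a synchronous standby
 * candidate.  A stopping WAL sender still counts, because a commit
 * running in parallel may be waiting on it to be released.
 */
bool
demo_is_sync_candidate(const DemoWalSnd *walsnd)
{
    /* Skip unused slots. */
    if (walsnd->pid == 0)
        return false;

    /* Must be streaming or stopping. */
    if (walsnd->state != WALSNDSTATE_STREAMING &&
        walsnd->state != WALSNDSTATE_STOPPING)
        return false;

    /* Must be configured as a synchronous standby. */
    if (walsnd->sync_standby_priority == 0)
        return false;

    return true;
}

/* Count the candidates in an array of slots. */
size_t
demo_count_sync_candidates(const DemoWalSnd *slots, size_t nslots)
{
    size_t      count = 0;

    for (size_t i = 0; i < nslots; i++)
    {
        if (demo_is_sync_candidate(&slots[i]))
            count++;
    }
    return count;
}
```

With a check that only accepted WALSNDSTATE_STREAMING, a stopping WAL
sender would drop out of the candidate list and a waiter depending on it
might never be released.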
--
Michael
