Re: Problem while setting the fpw with SIGHUP

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: amit(dot)kapila16(at)gmail(dot)com, robertmhaas(at)gmail(dot)com, hlinnaka(at)iki(dot)fi, dilipbalaut(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem while setting the fpw with SIGHUP
Date: 2018-09-14 07:26:55
Message-ID: 20180914072655.GI29332@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 06, 2018 at 04:37:28PM -0700, Michael Paquier wrote:
> /*
> * Properly accept or ignore signals the postmaster might send us.
> */
> - pqsignal(SIGHUP, StartupProcSigHupHandler); /* reload config file */
> + pqsignal(SIGHUP, SIG_IGN); /* ignore reload config */
>
> I am finally coming back to this patch set, and that's one of the first
> things I am going to help moving on for this CF. And this bit from the
> last patch series is not acceptable as there are some parameters which
> are used by the startup process which can be reloaded. One of them is
> wal_retrieve_retry_interval for tuning when fetching WAL at recovery.

So, I have been working on this problem again and I have reviewed the
thread, and there have been many things discussed in the last couple of
months:
1) We do not want to initialize XLogInsert stuff unconditionally for all
processes at the moment recovery begins, but we just want to initialize
it once WAL write is open for business.
2) Both the checkpointer and the startup process can call
UpdateFullPageWrites() which can cause Insert->fullPageWrites to get
incorrect values.
3) We do not want a palloc() in a critical section because of
RecoveryinProgress being called.

And the root issue here is 2), because the checkpointer tries to update
Insert->fullPageWrites but it does not need to do so until recovery has
been finished. So in order to fix the original issue I am proposing a
simple fix: let's make sure that the checkpointer does not update
Insert->fullPageWrites until recovery finishes, and let's have the
startup process do the first update once it finishes recovery and
inserts by itself the XLOG_PARAMETER_CHANGE. This way the order of
events is purely sequential and we don't need to worry about having the
checkpointer and the startup process eat on each other's plate because
the checkpointer would only try to work on updating the shared memory
value of full_page_writes once SharedRecoveryInProgress is switched to
true, and that happens after the startup process does its initial call
to UpdateFullPageWrites(). I have improved as well all the comments
around to make clear the behavior wanted.

Thoughts?
--
Michael

Attachment Content-Type Size
fix_updatefullpagewrites_concurrency_v2.patch text/x-diff 2.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 李海龙 2018-09-14 07:27:34 when set track_commit_timestamp on, database system abort startup
Previous Message Michael Paquier 2018-09-14 07:04:48 Re: Problem while setting the fpw with SIGHUP