Re: pgsql: Improve handling of parameter differences in physical replication

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Improve handling of parameter differences in physical replication
Date: 2020-03-30 10:41:43
Message-ID: 1f9fb47c-eacc-dc42-3267-20c387fa0a2f@oss.nttdata.com
Lists: pgsql-committers pgsql-hackers

On 2020/03/30 16:58, Peter Eisentraut wrote:
> Improve handling of parameter differences in physical replication
>
> When certain parameters are changed on a physical replication primary,
> this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
> record. The standby then checks whether its own settings are at least
> as big as the ones on the primary. If not, the standby shuts down
> with a fatal error.
>
> The correspondence of settings between primary and standby is required
> because those settings influence certain shared memory sizings that
> are required for processing WAL records that the primary might send.
> For example, if the primary sends a prepared transaction, the standby
> must have had max_prepared_transactions set appropriately or it won't
> be able to process those WAL records.
>
> However, fatally shutting down the standby immediately upon receipt of
> the parameter change record might be a bit of an overreaction. The
> resources related to those settings are not required immediately at
> that point, and might never be required if the activity on the primary
> does not exhaust all those resources. If we just let the standby roll
> on with recovery, it will eventually produce an appropriate error when
> those resources are used.
>
> So this patch relaxes this a bit. Upon receipt of
> XLOG_PARAMETER_CHANGE, we still check the settings but only issue a
> warning and set a global flag if there is a problem. Then when we
> actually hit the resource issue and the flag was set, we issue another
> warning message with relevant information. At that point we pause
> recovery, so a hot standby remains usable. We also repeat the last
> warning message once a minute so it is harder to miss or ignore.
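
For reference, if I read the patch correctly, the settings checked this
way are max_connections, max_worker_processes, max_wal_senders,
max_prepared_transactions and max_locks_per_transaction; they can be
compared on the master and the standby with a query like:

    SELECT name, setting
      FROM pg_settings
     WHERE name IN ('max_connections', 'max_worker_processes',
                    'max_wal_senders', 'max_prepared_transactions',
                    'max_locks_per_transaction');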

I encountered a problem that may be related to this commit.

First, I set up the master and the standby with max_connections = 100
(the default value). Then I decreased max_connections to 1 only on the
standby and restarted the server. Thanks to this commit, I saw the
following warning message on the standby.

WARNING: insufficient setting for parameter max_connections
DETAIL: max_connections = 1 is a lower setting than on the master server (where its value was 100).
HINT: Change parameters and restart the server, or there may be resource exhaustion errors sooner or later.
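
The standby-side change amounts to the following (shown with
ALTER SYSTEM only as a sketch; any way of lowering the GUC and
restarting the standby reproduces it):

    ALTER SYSTEM SET max_connections = 1;   -- on the standby
    -- then restart the standby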

Then I wrote a script that inserts 1,000,000 rows in a single
transaction and ran 30 instances of it concurrently. That is, 30
transactions, each inserting a large number of rows, were running at
the same time.
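
Roughly, after creating a table once on the master (say, t (i int);
the table and script are simplified here), each of the 30 concurrent
sessions ran something like:

    BEGIN;
    INSERT INTO t SELECT generate_series(1, 1000000);
    COMMIT;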

I confirmed that the master contained the expected number of rows, but,
unexpectedly, the standby contained zero rows. I also suspected that
this happened because recovery was paused, but pg_is_wal_replay_paused()
returned false on the standby.
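
Concretely, the checks on the standby were along these lines (using the
illustrative table name from above):

    SELECT count(*) FROM t;            -- 0 rows, unexpectedly
    SELECT pg_is_wal_replay_paused();  -- false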

Isn't this a problem related to this commit?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
