Re: High CPU consumption in cascade replication with large number of walsenders

From: Alexey Makhmutov <a(dot)makhmutov(at)postgrespro(dot)ru>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: High CPU consumption in cascade replication with large number of walsenders
Date: 2025-11-06 23:17:07
Message-ID: 26d020da-430c-4b68-96f1-45a86de1b250@postgrespro.ru
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Alexander!

Thank you again for the attention to this patch!

> Could you, please, also comment change from check for
AllowCascadeReplication() to StandbyWithCascadeReplication()? Do you
think this is beneficial and saves us from sending the notifications
when they are useless?

The original intention of this check was to avoid enabling all the
timer-based machinery for cases, when we definitely don't need to send
notifications. So, we try to avoid potential impact of the patch on such
cases even if new options are enabled. For example, this may happen if
we restore server from backup and apply WAL archive on it (as
PerformWalRecovery will be invoked in this case as well). Both
'hot_standby' and 'wal_senders' parameters are enabled by default, so
even primary server may pass the the 'AllowCascadeReplication' condition
in such case. So, we want to be sure that we the 'StandbyMode' is
actually set and we are part of Startup process.

However, the change may be also reasonable by itself, to avoid calling
WalSndWakeup at all in such cases (thus avoiding acquiring/releasing CV
mutex), although I do not think that it will provide measurable
improvements.

> Also, could you comment this condition.
> if (cascadeReplicationMaxBatchSize <= 1 && appliedRecords == 0)
> Does this mean that if batching was disabled in config then enforced
by SIGHUP, we will still wait for the current batch to be completed?
Would it be better to stop batching immediately?

Sure, if either of 'cascade_replication_batch_size' or
'cascade_replication_batch_delay' is changed, then we need to flush
current batch (send notification) and then decide whether we need to
perform batching for next records. This is why we set
'replicationNotificationPending' flag in
'assign_cascade_replication_batch_values' (and disable timer if it is
set), so it could be processed in final part
'ProcessStartupProcInterrupts'. So, technically we should not find
ourselves in the situation when we have 'cascadeReplicationMaxBatchSize'
set to 1 and appliedRecords set to non-zero value. This check for
'appliedRecords' value seems to be a remnant of an my intermediate patch
version, where these values were already runtime modifiable, but not yet
processed 'assign_cascade_replication_batch_values'. I think it could be
safely removed to avoid confusion. Thank you for noticing this!

> Also, this patch lacks documentation. I would especially like to see
combinations of GUCs described (cascade_replication_batch_size is
enabled, but cascade_replication_batch_delay disabled, and vise versa).

I've added the documentation for these two parameters in the new version
of the patch (config.sgml). The new patch version also contains the
change for minimal parameter value for 'cascade_replication_batch_size'
- now minimal value is 1 to indicate disabled batching. In previous
version both '0' and '1' values were valid options to disable batching,
but this was looking ambiguous in the documentation. So, I've decided to
leave only '1' as valid value to make it simpler to describe. I've also
rebased the patch on top of current master to fix failures during the
build checks.

Thanks,
Alexey

Attachment Content-Type Size
v3-0001-Implement-batching-for-WAL-records-notification-duri.patch text/x-patch 15.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Treat 2025-11-06 23:29:28 Re: alter check constraint enforceability
Previous Message David Rowley 2025-11-06 23:05:43 Re: another autovacuum scheduling thread