| From: | Hüseyin Demir <huseyin(dot)d3r(at)gmail(dot)com> |
|---|---|
| To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Cc: | Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com> |
| Subject: | Re: Don't keep closed WAL segment in page cache after replay |
| Date: | 2026-03-02 08:14:51 |
| Message-ID: | 177243929182.626.15849688898446231987.pgcf@coridan.postgresql.org |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
The idea looks good and efficient albeit I have some feedback. The first one is about logical replication slot.
Inside the patch, it checks if there is an active walsender process. Is it possible to create a replication slot and wait until a subscriber will connect it. During this time due to patch PostgreSQL will close the WAL segments on the memory and once the subscriber connects it has to read the WAL files from disk. But it's a trade-off and can be decided by others too.
+/*
+ * Return true if there's at least one active walsender process
+ */
+bool
+WalSndRunning(void)
+{
+ int i;
+
+ for (i = 0; i < max_wal_senders; i++)
+ {
+ WalSnd *walsnd = &WalSndCtl->walsnds[i];
+
+ SpinLockAcquire(&walsnd->mutex);
+ if (walsnd->pid > 0)
+ {
+ SpinLockRelease(&walsnd->mutex);
+ return true;
+ }
+ SpinLockRelease(&walsnd->mutex);
+ }
+ return false;
+}
+
Secondly, when it comes to using spinlock to check the running walsender processes it can lead inefficient recovery process. Because assuming that a database with max_wal_sender set to 20+ and producing more than 4-5TB WAL in a day it can cause additional +100-200 spinlocks each second on walreceiver. Put simply, WalSndRunning() scans every walsender slot with spinlocks on every segment switch, contending with all active walsenders updating their own slots. On high-throughput standbys this creates unnecessary cross-process spinlock contention in the recovery hot path — the
exact path that should be as lean as possible for fast replay. Maybe you can implement a single pg_atomic_uint32 counter in WalSndCtlData and achieve the same result with zero contention.
Regards.
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Joel Jacobson | 2026-03-02 08:39:15 | Re: [BUG?] estimate_hash_bucket_stats uses wrong ndistinct for avgfreq |
| Previous Message | Jakub Wartak | 2026-03-02 08:01:05 | Re: pg_stat_io_histogram |