pgsql: Fix race conditions and missed wakeups in syncrep worker signali

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Fix race conditions and missed wakeups in syncrep worker signali
Date: 2017-06-30 18:57:24
Message-ID: E1dR16a-0005zy-Gr@gemulon.postgresql.org
Lists: pgsql-committers

Fix race conditions and missed wakeups in syncrep worker signaling.

When a sync worker is waiting for the associated apply worker to notice
that it's in SYNCWAIT state, wait_for_worker_state_change() would just
patiently wait for that to happen. This generally required waiting for
the 1-second timeout in LogicalRepApplyLoop to elapse. Kicking the worker
via its latch makes things significantly snappier.
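
To illustrate why kicking the latch beats waiting out a timeout, here is a
minimal sketch built on POSIX threads. It is not the PostgreSQL latch
implementation: the Latch struct and the latch_init(), latch_set() and
latch_wait() helpers are hypothetical stand-ins. A waiter that nobody signals
returns only when its timeout expires, while a waiter whose latch has been set
returns at once.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a latch; not the PostgreSQL Latch type. */
typedef struct Latch
{
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            is_set;
} Latch;

void
latch_init(Latch *l)
{
    pthread_mutex_init(&l->mutex, NULL);
    pthread_cond_init(&l->cond, NULL);
    l->is_set = false;
}

/* Kick the waiter: mark the latch set and wake anyone sleeping on it. */
void
latch_set(Latch *l)
{
    pthread_mutex_lock(&l->mutex);
    l->is_set = true;
    pthread_cond_signal(&l->cond);
    pthread_mutex_unlock(&l->mutex);
}

/*
 * Sleep until the latch is set or timeout_ms elapses, whichever comes first.
 * Returns true if the latch was set (and resets it), false on timeout.
 */
bool
latch_wait(Latch *l, long timeout_ms)
{
    struct timespec deadline;
    int             rc = 0;
    bool            was_set;

    /* Turn the relative timeout into an absolute deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L)
    {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&l->mutex);
    /* Loop to tolerate spurious wakeups; stop on set or on timeout. */
    while (!l->is_set && rc == 0)
        rc = pthread_cond_timedwait(&l->cond, &l->mutex, &deadline);
    was_set = l->is_set;
    l->is_set = false;
    pthread_mutex_unlock(&l->mutex);

    return was_set;
}
```

In this sketch, a caller that loops on latch_wait(&latch, 1000) and whose peer
never calls latch_set() notices a state change only after up to a second,
which mirrors the delay described above.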

While at it, fix race conditions that could potentially result in crashes:
we can *not* call logicalrep_worker_wakeup_ptr() once we've released the
LogicalRepWorkerLock, because worker->proc might've been reset to NULL
after we do that (indeed, there's no really solid reason to believe that
the LogicalRepWorker slot even belongs to the same worker anymore).
In logicalrep_worker_wakeup(), we can just move the wakeup inside the
lock scope. In process_syncing_tables_for_apply(), a bit more code
rearrangement is needed.
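
The locking rule above can be sketched in a few lines of C. This is an
illustrative model only, not the code from launcher.c: Proc, WorkerSlot,
worker_lock, attach_worker(), detach_worker() and wakeup_worker() are
hypothetical names, and a pthread mutex stands in for LogicalRepWorkerLock.
The point is that the slot lookup and the dereference of its proc pointer
happen while the same lock is held.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the PostgreSQL structures. */
typedef struct Proc
{
    bool latch_set;
} Proc;

typedef struct WorkerSlot
{
    int   subid;
    Proc *proc;             /* NULL when the slot is free */
} WorkerSlot;

#define MAX_WORKERS 4

static pthread_mutex_t worker_lock = PTHREAD_MUTEX_INITIALIZER;
static WorkerSlot workers[MAX_WORKERS];

/* Caller must hold worker_lock. */
static WorkerSlot *
find_worker_locked(int subid)
{
    for (int i = 0; i < MAX_WORKERS; i++)
    {
        if (workers[i].proc != NULL && workers[i].subid == subid)
            return &workers[i];
    }
    return NULL;
}

/* Claim a free slot for a worker; returns false if none is free. */
bool
attach_worker(int subid, Proc *proc)
{
    bool attached = false;

    pthread_mutex_lock(&worker_lock);
    for (int i = 0; i < MAX_WORKERS && !attached; i++)
    {
        if (workers[i].proc == NULL)
        {
            workers[i].subid = subid;
            workers[i].proc = proc;
            attached = true;
        }
    }
    pthread_mutex_unlock(&worker_lock);
    return attached;
}

/* Release the slot; after this the slot may be reused by another worker. */
void
detach_worker(int subid)
{
    pthread_mutex_lock(&worker_lock);
    WorkerSlot *w = find_worker_locked(subid);
    if (w != NULL)
        w->proc = NULL;
    pthread_mutex_unlock(&worker_lock);
}

/*
 * Wake the worker for subid, if there is one.  The unsafe variant would
 * unlock first and then touch w->proc, by which time proc may be NULL or
 * the slot may belong to a different worker.
 */
bool
wakeup_worker(int subid)
{
    bool woke = false;

    pthread_mutex_lock(&worker_lock);
    WorkerSlot *w = find_worker_locked(subid);
    if (w != NULL)
    {
        w->proc->latch_set = true;  /* safe: lock still held */
        woke = true;
    }
    pthread_mutex_unlock(&worker_lock);
    return woke;
}
```

A caller that needs to do further work after the lookup, as the commit message
notes for process_syncing_tables_for_apply(), has to arrange for the wakeup to
happen before the lock is released rather than afterwards.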

Also improve some nearby comments.

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/1f201a818a5910a37530cc929bd345688f827942

Modified Files
--------------
src/backend/replication/logical/launcher.c | 12 ++-
src/backend/replication/logical/tablesync.c | 156 ++++++++++++++++------------
2 files changed, 100 insertions(+), 68 deletions(-)
