Re: [HACKERS] make async slave to wait for lsn to be replayed

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Kartyshov Ivan <i(dot)kartyshov(at)postgrespro(dot)ru>, dilipbalaut(at)gmail(dot)com, smithpb2250(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Date: 2024-03-11 10:44:53
Message-ID: CAPpHfdsZ8hJiYaRxTbnKqVee_Nmbb+PVd+QtWifoBYkX71f0dA@mail.gmail.com
Lists: pgsql-hackers

Hi!

I've decided to take this patch on.

On Thu, Mar 7, 2024 at 2:25 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> +1 for the second one not only because it avoids new words in grammar
> but also sounds to convey the meaning. I think you can explain in docs
> how this feature can be used basically how will one get the correct
> LSN value to specify.

I picked the second option and left only the AFTER clause for the
BEGIN statement. I think this should be enough for the beginning.

> As suggested previously also pick one of the approaches (I would
> advocate the second one) and keep an option for the second one by
> mentioning it in the commit message. I hope to see more
> reviews/discussions or usage like how will users get the LSN value to
> be specified on the core logic of the feature at this stage. IF
> possible, state, how real-world applications could leverage this
> feature.

I've added a paragraph to the docs about the usage. After making
some changes on the primary, run pg_current_wal_insert_lsn(). Then
connect to the replica and run 'BEGIN AFTER lsn' with the LSN just
obtained. You are then guaranteed to see the changes made on the primary.
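
As a rough illustration of the client-side handoff, here is a minimal
sketch in C. It only handles the text form of the LSN (two hexadecimal
numbers separated by a slash) and builds the statement to send to the
replica. The helper names parse_lsn() and build_begin_after() are
hypothetical, and quoting the LSN as a string literal is an assumption
about the patch's grammar, not something stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse an LSN in its text form "hi/lo" (hexadecimal), such as the value
 * returned by pg_current_wal_insert_lsn() on the primary.
 * Returns 1 on success, 0 if the text is not a well-formed LSN.
 * Hypothetical helper, not part of the patch.
 */
static int
parse_lsn(const char *text, uint64_t *lsn)
{
    unsigned int hi;
    unsigned int lo;
    char         extra;

    assert(text != NULL && lsn != NULL);

    /* Exactly two conversions must succeed; trailing junk makes it three. */
    if (sscanf(text, "%X/%X%c", &hi, &lo, &extra) != 2)
        return 0;

    *lsn = ((uint64_t) hi << 32) | lo;
    return 1;
}

/*
 * Build the statement to run on the replica.
 * ASSUMPTION: the LSN is passed as a quoted literal; the message above
 * only says 'BEGIN AFTER lsn'.
 * Returns 1 if the statement fit into buf, 0 otherwise.
 */
static int
build_begin_after(char *buf, size_t len, uint64_t lsn)
{
    int n = snprintf(buf, len, "BEGIN AFTER '%X/%X'",
                     (unsigned int) (lsn >> 32),
                     (unsigned int) lsn);

    return n > 0 && (size_t) n < len;
}
```

A client would read the LSN text from the primary, optionally validate
it with parse_lsn(), and send the string produced by
build_begin_after() as the first statement on the replica connection.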

Also, I've significantly reworked other aspects of the patch. The
most significant changes are:
1) Waiters are now stored in an array sorted by LSN. This saves us
from scanning the whole per-backend array.
2) Waiters are removed from the array immediately once their LSNs are
replayed. Otherwise, the WAL replayer will keep scanning the shared
memory array till waiters wake up.
3) To clean up after errors, we now call WaitLSNCleanup() on backend
shmem exit. I think this is preferable to the previous approach of
removing the waiter from the queue before ProcessInterrupts().
4) There is now a recheck of whether the LSN has been replayed after
adding the waiter to the shared memory array. This should protect
against race conditions.
5) I've renamed functions and files whose names were too generic.
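
To make points 1, 2 and 4 concrete, here is a minimal single-process
sketch in C of an LSN-sorted waiter array. It is not the code from the
patch: all type and function names (WaitQueue, queue_insert, and so on)
are hypothetical, there is no locking or shared memory, and a plain
integer stands in for the backend's latch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Lsn;

#define MAX_WAITERS 64

typedef struct
{
    Lsn wait_lsn;       /* LSN this backend waits for */
    int backend_id;     /* stand-in for the backend's latch */
} Waiter;

typedef struct
{
    Lsn    replayed;            /* last LSN reported as replayed */
    size_t count;
    Waiter items[MAX_WAITERS];  /* sorted by wait_lsn, ascending */
} WaitQueue;

/* Insert a waiter, keeping items[] sorted by wait_lsn. */
static void
queue_insert(WaitQueue *q, Lsn lsn, int backend_id)
{
    size_t pos = q->count;

    assert(q->count < MAX_WAITERS);
    while (pos > 0 && q->items[pos - 1].wait_lsn > lsn)
        pos--;
    memmove(&q->items[pos + 1], &q->items[pos],
            (q->count - pos) * sizeof(Waiter));
    q->items[pos].wait_lsn = lsn;
    q->items[pos].backend_id = backend_id;
    q->count++;
}

/* Remove one backend's entry (error cleanup path). Returns 1 if found. */
static int
queue_remove(WaitQueue *q, int backend_id)
{
    for (size_t i = 0; i < q->count; i++)
    {
        if (q->items[i].backend_id == backend_id)
        {
            memmove(&q->items[i], &q->items[i + 1],
                    (q->count - i - 1) * sizeof(Waiter));
            q->count--;
            return 1;
        }
    }
    return 0;
}

/*
 * Replay side: record progress and immediately drop every waiter whose
 * LSN has been reached. Because the array is sorted, only the leading
 * entries are examined. woken[] must have room for MAX_WAITERS ids.
 * Returns the number of waiters removed.
 */
static size_t
queue_advance(WaitQueue *q, Lsn replayed, int *woken)
{
    size_t n = 0;

    q->replayed = replayed;
    while (n < q->count && q->items[n].wait_lsn <= replayed)
    {
        woken[n] = q->items[n].backend_id;
        n++;
    }
    memmove(&q->items[0], &q->items[n], (q->count - n) * sizeof(Waiter));
    q->count -= n;
    return n;
}

/*
 * Waiter side. Returns 1 if the caller has to sleep, 0 if the LSN is
 * already replayed. The second check mirrors the recheck after adding
 * to the array; in this single-process sketch it can never fire, but
 * with a concurrent replayer it closes the window between the first
 * check and the insertion.
 */
static int
queue_wait_begin(WaitQueue *q, Lsn lsn, int backend_id)
{
    if (lsn <= q->replayed)
        return 0;
    queue_insert(q, lsn, backend_id);
    if (lsn <= q->replayed)
    {
        queue_remove(q, backend_id);
        return 0;
    }
    return 1;
}
```

With waiters at LSNs 200, 300 and 400, a call to queue_advance() with
300 looks at three entries, removes the first two and leaves the 400
waiter in place. queue_remove() plays the role that cleanup on backend
exit plays in point 3.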

------
Regards,
Alexander Korotkov

Attachment: v8-0001-Implement-AFTER-clause-for-BEGIN-command.patch (application/octet-stream, 22.9 KB)
