Re: [HACKERS] make async slave to wait for lsn to be replayed

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Kartyshov Ivan <i(dot)kartyshov(at)postgrespro(dot)ru>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, dilipbalaut(at)gmail(dot)com, smithpb2250(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Date: 2024-03-16 07:02:43
Message-ID: CALj2ACXcdm0DJbauuFiHAmhw7gQAZsvDhhVWHYYBQwhm6rR57w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 16, 2024 at 2:04 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>
> > Rebase and update patch.

Thanks for working on this. I took a quick look at v11 patch. Here are
some comments:

1.
+#include "utils/timestamp.h"
+#include "executor/spi.h"
+#include "utils/fmgrprotos.h"

Please place executor/spi.h in the alphabetical order. Also, look at
all other header files and place them in the order.

2. It seems like pgindent is not happy with
src/backend/access/transam/xlogrecovery.c and
src/backend/commands/waitlsn.c. Please run it to keep BF member koel
happy post commit.

3. This patch changes, SQL explicit transaction statement syntax, is
it (this deviation) okay from SQL standard perspective?

4. I think some more checks are needed for better input validations.

4.1 With invalid LSN succeeds, shouldn't it error out? Or at least,
add a fast path/quick exit to WaitForLSN()?
BEGIN AFTER '0/0';

4.2 With an unreasonably high future LSN, BEGIN command waits
unboundedly, shouldn't we check if the specified LSN is more than
pg_last_wal_receive_lsn() error out?
BEGIN AFTER '0/FFFFFFFF';
SELECT pg_last_wal_receive_lsn() + 1 AS future_receive_lsn \gset
BEGIN AFTER :'future_receive_lsn';

4.3 With an unreasonably high wait time, BEGIN command waits
unboundedly, shouldn't we restrict the wait time to some max value,
say a day or so?
SELECT pg_last_wal_receive_lsn() + 1 AS future_receive_lsn \gset
BEGIN AFTER :'future_receive_lsn' WITHIN 100000;

5.
+#include <float.h>
+#include <math.h>
+#include "postgres.h"
+#include "pgstat.h"

postgres.h must be included at the first, and then the system header
files, and then all postgres header files, just like below. See a very
recent commit https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=97d85be365443eb4bf84373a7468624762382059.

+#include "postgres.h"

+#include <float.h>
+#include <math.h>

+#include "access/transam.h"
+#include "access/xact.h"
+#include "access/xlog.h"

6.
+/* Set all latches in shared memory to signal that new LSN has been replayed */
+void
+WaitLSNSetLatches(XLogRecPtr curLSN)
+{

I see this patch is waking up all the waiters in the recovery path
after applying every WAL record, which IMO is a hot path. Is the
impact of this change on recovery measured, perhaps using
https://github.com/macdice/redo-bench or similar tools?

7. In continuation to comment #6, why not use Conditional Variables
instead of proc latches to sleep and wait for all the waiters in
WaitLSNSetLatches?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-03-16 10:24:46 Re: Introduce XID age and inactive timeout based replication slot invalidation
Previous Message Bharath Rupireddy 2024-03-16 04:25:53 Add TAP tests for backtrace functionality (was Re: Add test module for verifying backtrace functionality)