Re: [HACKERS] make async slave to wait for lsn to be replayed

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Kartyshov Ivan <i(dot)kartyshov(at)postgrespro(dot)ru>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, dilipbalaut(at)gmail(dot)com, smithpb2250(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Date: 2024-03-20 09:11:13
Message-ID: CAPpHfduLwN+vg7a4G8B-VXv9M0vrdvpr6C7-afPOR9=UrOTJew@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 20, 2024 at 12:34 AM Kartyshov Ivan <i(dot)kartyshov(at)postgrespro(dot)ru>
wrote:
> > 4.2 With an unreasonably high future LSN, BEGIN command waits
> > unboundedly, shouldn't we check if the specified LSN is more than
> > pg_last_wal_receive_lsn() error out?

I think limiting wait lsn by current received lsn would destroy the whole
value of this feature. The value is to wait till given LSN is replayed,
whether it's already received or not.

> > BEGIN AFTER '0/FFFFFFFF';
> > SELECT pg_last_wal_receive_lsn() + 1 AS future_receive_lsn \gset
> > BEGIN AFTER :'future_receive_lsn';
>
> This case will give ERROR cause '0/FFFFFFFF' + 1 is invalid pg_lsn

FWIW,

# SELECT '0/FFFFFFFF'::pg_lsn + 1;
?column?
----------
1/0
(1 row)

But I don't see a problem here. On the replica, it's out of our control to
check which lsn is good and which is not. We can't check whether the lsn,
which is in future for the replica, is already issued by primary.

For the case of wrong lsn, which could cause potentially infinite wait,
there is the timeout and the manual query cancel.

> > 4.3 With an unreasonably high wait time, BEGIN command waits
> > unboundedly, shouldn't we restrict the wait time to some max value,
> > say a day or so?
> > SELECT pg_last_wal_receive_lsn() + 1 AS future_receive_lsn \gset
> > BEGIN AFTER :'future_receive_lsn' WITHIN 100000;
>
> Good idea, I put it 1 day. But this limit we should to discuss.

Do you think that specifying timeout in milliseconds is suitable? I would
prefer to switch to seconds (with ability to specify fraction of second).
This was expressed before by Alexander Lakhin.

> > 6.
> > +/* Set all latches in shared memory to signal that new LSN has been
> > replayed */
> > +void
> > +WaitLSNSetLatches(XLogRecPtr curLSN)
> > +{
> >
> > I see this patch is waking up all the waiters in the recovery path
> > after applying every WAL record, which IMO is a hot path. Is the
> > impact of this change on recovery measured, perhaps using
> > https://github.com/macdice/redo-bench or similar tools?

Ivan, could you do this?

> > 7. In continuation to comment #6, why not use Conditional Variables
> > instead of proc latches to sleep and wait for all the waiters in
> > WaitLSNSetLatches?
>
> Waiters are stored in the array sorted by LSN. This help us to wake
> only PIDs with replayed LSN. This saves us from scanning of whole
> array. So it`s not so hot path.

+1
This saves us from ConditionVariableBroadcast() every time we replay the
WAL record.

> Add some fixes
>
> 1) make waiting timeont more simple (as pg_terminate_backend())
> 2) removed the 1 minute wait because INTERRUPTS don’t arrive for a
> long time, changed it to 0.5 seconds

I don't see this change in the patch. Normally if a process gets a signal,
that causes WaitLatch() to exit immediately. It also exists immediately on
query cancel. IIRC, this 1 minute timeout is needed to handle some extreme
cases when an interrupt is missing. Other places have it equal to 1
minute. I don't see why we should have it different.

> 3) add more tests
> 4) added and expanded sections in the documentation

I don't see this in the patch. I see only a short description
in func.sgml, which is definitely not sufficient. We need at least
everything we have in the docs before to be adjusted with the current
approach of procedure.

> 5) add default variant of timeout
> pg_wait_lsn(trg_lsn pg_lsn, delay int8 DEFAULT 0)
> example: pg_wait_lsn('0/31B1B60') equal pg_wait_lsn('0/31B1B60', 0)

Does zero here mean no timeout? I think this should be documented. Also,
I would prefer to see the timeout by default. Probably one minute would be
good for default.

> 6) now big timeout will be restricted to 1 day (86400000ms)
> CALL pg_wait_lsn('0/34FB5A1',10000000000);
> WARNING: Timeout for pg_wait_lsn() restricted to 1 day

I don't think we need to mention individuals, who made proposals, in the
source code comments. Otherwise, our source code would be a crazy mess of
names. Also, if this is the restriction, it has to be an error. And it
should be a proper full ereport().

------
Regards,
Alexander Korotkov

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-03-20 09:25:11 Re: REVOKE FROM warning on grantor
Previous Message Pavel Stehule 2024-03-20 08:44:56 Re: Schema variables - new implementation for Postgres 15