Re: Unnecessary confirm work on logical replication

From: Emre Hasegeli <emre(at)hasegeli(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Unnecessary confirm work on logical replication
Date: 2023-04-11 12:58:05
Message-ID: CAE2gYzx6H4TaoagPOCghD+Na=uKcD95PXSqvLZ5ECQUyGf=xRA@mail.gmail.com
Lists: pgsql-hackers

> In fact, the WAL sender always starts reading WAL from restart_lsn,
> which in turn is always <= confirmed_flush_lsn. While reading WAL, WAL
> sender may read XLOG_RUNNING_XACTS WAL record with lsn <=
> confirmed_flush_lsn. While processing XLOG_RUNNING_XACTS record it may
> update its restart_lsn and catalog_xmin with current_lsn = lsn of
> XLOG_RUNNING_XACTS record. In this situation current_lsn <=
> confirmed_flush_lsn.

This can only happen when the WAL sender is restarted. However, in
that case, the restart_lsn and catalog_xmin should already have been
persisted by the previous run of the WAL sender.

I still doubt these calls are necessary. I think there is a
complicated chicken-and-egg problem here. Here is my logic:

1) LogicalConfirmReceivedLocation() is called explicitly when
confirmed_flush is sent by the replication client.

2) LogicalConfirmReceivedLocation() is the only place that updates
confirmed_flush.

3) The replication client can only send a confirmed_flush for a
current_lsn it has already received.

4) These two functions have already run for any current_lsn the
replication client has received.

5) These two functions call LogicalConfirmReceivedLocation() only if
current_lsn <= confirmed_flush.
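
To make the ordering argument easier to check, here is a toy model of the
steps above. It is only a sketch under loud assumptions: LSNs are plain
integers, ToySlot and run_sender are hypothetical names that do not exist in
PostgreSQL, and the client is assumed to confirm every record as soon as it
receives it (the most eager client possible). It does not reproduce the real
code paths; it only shows when current_lsn <= confirmed_flush can hold.

```python
from dataclasses import dataclass


@dataclass
class ToySlot:
    """Hypothetical stand-in for a replication slot; LSNs are plain ints."""
    restart_lsn: int
    confirmed_flush: int


def run_sender(slot, running_xacts_lsns, end_lsn):
    """Simulate one WAL sender run over LSNs restart_lsn..end_lsn.

    Returns the running-xacts LSNs for which current_lsn <= confirmed_flush
    held at the moment the record was processed.
    """
    hits = []
    for current_lsn in range(slot.restart_lsn, end_lsn + 1):
        if current_lsn in running_xacts_lsns and current_lsn <= slot.confirmed_flush:
            hits.append(current_lsn)
        # The record is sent only after it has been processed, and the toy
        # client confirms it immediately. It cannot confirm anything it has
        # not received yet (step 3).
        slot.confirmed_flush = max(slot.confirmed_flush, current_lsn)
    return hits


running_xacts = {3, 6, 9}

# First run: the client has confirmed nothing beyond what it was sent,
# so the condition never holds while the record is being processed.
slot = ToySlot(restart_lsn=1, confirmed_flush=0)
first_run = run_sender(slot, running_xacts, end_lsn=10)

# Restart: decoding begins again at restart_lsn, behind confirmed_flush.
slot.restart_lsn = 5
after_restart = run_sender(slot, running_xacts, end_lsn=12)

print(first_run)      # []
print(after_restart)  # [6, 9]
```

In this model the condition is only ever true for records that are re-read
after a restart, which matches the case discussed above.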

Thank you for your patience.
