Re: Question about StartLogicalReplication() error path

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <rhaas(at)postgresql(dot)org>
Subject: Re: Question about StartLogicalReplication() error path
Date: 2021-06-11 06:22:53
Message-ID: e22a4606333ce1032e29fe2fb1aa9036e6f0ca98.camel@j-davis.com
Lists: pgsql-hackers

On Fri, 2021-06-11 at 10:13 +0530, Amit Kapila wrote:
> Because sometimes clients don't have to do anything for xlog records.
> One example is WAL for DDL where logical decoding didn't produce
> anything for the client but later with keepalive we send the LSN of
> WAL where DDL has finished and the client just responds with the
> position sent by the server as it doesn't have any other pending
> transactions.

If I understand correctly, in this situation it avoids the cost of a
write on the client just to update its stored LSN progress value when
there's no real data to be written. In that case the client would need
to rely on the server's confirmed_flush_lsn instead of its own stored
LSN progress value.

That's a reasonable thing for the *client* to do explicitly, e.g. by
just reading the slot's confirmed_flush_lsn and comparing it to its own
stored LSN. But I don't think it's reasonable for the server to just
skip over data requested by the client because it thinks it knows best.
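
To make the client-side version concrete, here is a minimal sketch of
that comparison. The function names (parse_lsn, client_resume_point) are
made up for illustration, and it assumes the client has already read
confirmed_flush_lsn for its slot, for example from pg_replication_slots.
It only relies on the textual LSN format being two hex numbers separated
by a slash.

```python
def parse_lsn(text: str) -> int:
    """Convert a textual LSN ('XXX/XXX', high and low 32 bits in hex) to an int."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def client_resume_point(stored_lsn: str, confirmed_flush_lsn: str) -> str:
    """Pick the LSN the client will request when it restarts streaming.

    Hypothetical client policy: if the slot's confirmed_flush_lsn is ahead
    of the client's own stored progress, the client explicitly chooses to
    trust it; otherwise it keeps its own value.
    """
    # Compare numerically: comparing the strings would order 'FF' after '100'.
    if parse_lsn(confirmed_flush_lsn) > parse_lsn(stored_lsn):
        return confirmed_flush_lsn
    return stored_lsn


if __name__ == "__main__":
    # Example values only; a real client would read these from its own
    # storage and from the slot.
    print(client_resume_point("0/FF", "0/100"))
```

The point is that the decision to move forward is made by the client,
which knows whether it acknowledged LSNs it never wrote down.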

> I think because there is no need to process the WAL that has been
> confirmed by the client. Do you see any problems with this scheme?

Several:

* Replication setups are complex, and it can be easy to misconfigure
something or have a bug in some control code. An error is valuable to
detect the problem closer to the source.

* There are plausible configurations where things could go badly wrong.
For instance, suppose you are storing the decoded data in another postgres
server with synchronous_commit=off, and acknowledging LSNs before they
are durable. A crash of the destination system would leave it consistent,
but missing some data earlier than the confirmed_flush_lsn. The
client would then request the data starting at its stored LSN progress
value, but the server would skip ahead to the confirmed_flush_lsn,
silently skipping the missing data.

* It's contradicted by the docs: "Instructs server to start streaming
WAL for logical replication, starting at WAL location XXX/XXX."

* The comment acknowledges that a user might expect an error in that
case, but doesn't really address why the user would expect an error
or why it's OK to violate that expectation.
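
As a rough illustration of the difference between the two behaviors,
here is a toy model. It is not the actual StartLogicalReplication()
code: LSNs are plain integers, each WAL entry stands for one decodable
change, and the names (effective_start, changes_sent,
StartPositionError, the strict flag) are all invented for this sketch.

```python
class StartPositionError(Exception):
    """Raised by the strict policy when the requested start precedes confirmed_flush."""


def effective_start(requested: int, confirmed_flush: int, strict: bool) -> int:
    """Return the position streaming would actually begin at in this toy model."""
    if requested >= confirmed_flush:
        return requested
    if strict:
        # Surface the mismatch close to its source.
        raise StartPositionError(
            f"requested start {requested} is before confirmed flush {confirmed_flush}"
        )
    # Lenient policy: silently skip ahead to the confirmed position.
    return confirmed_flush


def changes_sent(wal, requested, confirmed_flush, strict):
    """List the toy changes (by LSN) the client would receive."""
    start = effective_start(requested, confirmed_flush, strict)
    return [lsn for lsn in wal if lsn >= start]


if __name__ == "__main__":
    wal = [100, 200, 300]
    # The destination acknowledged up to 300 but, after its crash, only has
    # data before 200, so it asks to restart from 200.
    print(changes_sent(wal, requested=200, confirmed_flush=300, strict=False))
    try:
        changes_sent(wal, requested=200, confirmed_flush=300, strict=True)
    except StartPositionError as exc:
        print("error:", exc)
```

Under the lenient policy the change at 200 is never delivered and
nothing tells the client; under the strict policy the client finds out
immediately that its stored position and the slot disagree.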

Regards,
Jeff Davis
