Re: pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_replication_slot_advance to return NULL instead of 0/0 if slot not advanced
Date: 2018-06-05 11:00:30
Message-ID: e02bd75a-025f-2484-b461-3889ae069704@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 05/06/18 06:28, Michael Paquier wrote:
> On Mon, Jun 04, 2018 at 11:51:35AM +0200, Petr Jelinek wrote:
>> On 01/06/18 21:13, Michael Paquier wrote:
>>> - startlsn =3D MyReplicationSlot->data.confirmed_flush;
>>> + if (OidIsValid(MyReplicationSlot->data.database))
>>> + startlsn =3D MyReplicationSlot->data.confirmed_flush;
>>> + else
>>> + startlsn =3D MyReplicationSlot->data.restart_lsn;
>>> +
>>> if (moveto < startlsn)
>>> {
>>> ReplicationSlotRelease();
>>
>> This part looks correct for the checking that we are not moving
>> backwards. However, there is another existing issue with this code
>> which
>> is that we are later using the confirmed_flush (via startlsn) as start
>> point of logical decoding (XLogReadRecord parameter in
>> pg_logical_replication_slot_advance) which is not correct. The
>> restart_lsn should be used for that. I think it would make sense to
>> fix
>> that as part of this patch as well.
>
> I am not sure I understand what you are coming at here. Could you
> explain your point a bit more please as 9c7d06d is yours?

As I said, it's an existing issue. I just had chance to reread the code
when looking and your patch.

> When creating
> the decoding context, all other code paths use the confirmed LSN as a
> start point if not explicitely defined by the caller of
> CreateDecodingContext, as it points to the last LSN where a transaction
> has been committed and decoded.

I didn't say anything about CreateDecodingContext though. I am talking
about the fact that we then use the same variable as input to
XLogReadRecord later in the logical slot code path. The whole point of
having restart_lsn for logical slot is to have correct start point for
XLogReadRecord so using the confirmed_flush there is wrong (and has been
wrong since the original commit).

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Khandekar 2018-06-05 11:02:59 AtEOXact_ApplyLauncher() and subtransactions
Previous Message Amit Langote 2018-06-05 10:31:29 Re: why partition pruning doesn't work?