Re: PATCH: 9.5 replication origins fix for logical decoding

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Pallavi Sontakke <pallavi(dot)sontakke(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Subject: Re: PATCH: 9.5 replication origins fix for logical decoding
Date: 2015-10-16 03:51:24
Message-ID: CAMsr+YFuS5hrAZRQmZJM9+uuy5mZvsTNL+4twPALw3CjNp_itQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 15 October 2015 at 20:55, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2015-10-15 20:52:41 +0800, Craig Ringer wrote:
>> You'll note that the tests fail. When the replication origin is reset
>> and set again with pg_replication_origin_xact_setup mid-xact, the
>> origin identity exposed to the decoding plugin callbacks for all
>> records (including those created before the origin change) is the
>> latter origin, the one active at COMMIT time.
>>
>> Is that the intended behaviour? That the session identifier is
>> determined by what was active at commit time, and only the lsn and
>> timestamp vary throughout the xact? It looks like it from the code.
>
> Uh. Isn't that just because you looked at txn->origin_id instead of the
> change's origin_id?

Yes, it is. I didn't realise that the individual changes had their own
origins, rather than changing the origin in the txn, though I can see
that now that I know to look.

Either I'm confused (likely) or the concept behind allowing this is
critically flawed.

Say some client code does

set session origin=1
begin
set xact lsn=0/123, ts=13:00
do some inserts
set session origin=2
set xact lsn=0/199, ts=14:00
do some more inserts
commit

it seems to be decoded as:

begin origin=2 lsn=0/199
inserts origin=1 lsn=0/199
more inserts origin=2 lsn=0/199
commit origin=2 lsn=0/199

i.e. the begin and commit have the final session origin. Individual
changes have the session origin in effect at the time the change was
created. The last-set origin commit timestamp and origin lsn override
all prior ones; they aren't recorded per-change, only on the commit.

This means you have change records with a change->origin_id that's
from a completely different node, which makes no sense at all with the
txn->origin_lsn . It matches the txn->origin_id, which is the same
throughout, but then why even have the change->origin_id?

I find the idea of each change having its own origin node - but not
its own origin LSN - very confusing. For one thing the origin filter
callback can't know about that, and can only filter based on the txn's
origin. I guess that's the output plugin's problem - if it wants to
cope with arbitrary mixed-origin tx's it can't use the origin filter
and has to check each message.

I really don't see how it makes any sense to allow the origin_id to
change mid-tx. I can see how sending the origin_id for each change
could make sense to allow future support for transaction streaming
where decoding starts before we receive the commit record, but
changing the origin_id within the tx doesn't make any sense.

IMO changing the origin should be disallowed within a tx. Otherwise
there needs to be some way to record the origin lsn and commit
timestamp changing within the tx too.

I was going to just send a patch to disallow changing the origin
mid-tx, but I'm not sure I see a good way to do that since the origin
is a session-level global, not part of the xact info.

Document it as a "don't do that, if you do it you get to keep the pieces"?

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-10-16 03:51:40 Re: plpython is broken for recursive use
Previous Message Haribabu Kommi 2015-10-16 03:38:44 Re: Parallel Seq Scan