Re: Stopping logical replication protocol

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Vladimir Gordiychuk <folyga(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Stopping logical replication protocol
Date: 2016-05-11 04:25:26
Message-ID: CAMsr+YHt=PMq-c+Xp-PaUCUL9KF-tjsy+gwhPOcD0tDBBjpd8g@mail.gmail.com
Lists: pgsql-hackers

On 11 May 2016 at 06:47, Vladimir Gordiychuk <folyga(at)gmail(dot)com> wrote:

>> Same thread, I just think these are two somewhat separate changes. One is
>> just in the walsender and allows return to command mode during waiting for
>> WAL. The other is more intrusive into the reorder buffer etc and allows
>> aborting decoding during commit processing. So two separate patches make
>> sense here IMO, one on top of the other.
>
>
> About the second part of the patch: what is the reason to decode and send
> the whole transaction? Why can't we process logical decoding via
> WalSndLoop, LSN by LSN, as it works in physical replication? For example,
> if a transaction spans more than one LSN, first we decode and send "begin"
> and "part of the data from the current LSN", then return to WalSndLoop,
> and on the next iteration we send "another part of the data" and "commit".
> I haven't researched this approach, because I think it would be a big
> change compared with a callback that stops sending.
>

There are two parts to that. First, why do we reorder at all, accumulating
a whole transaction in a reorder buffer until we see a commit then sending
it all at once? Second, when sending, why don't we return control to the
walsender between messages?

For the first: reordering xacts server-side lets the client not worry about
replay order. It just applies them as it receives them. It means the server
can omit uncommitted transactions from the stream entirely and clients can
be kept simple(r). IIRC there are also some issues around relcache
invalidation handling and time travel that make it desirable to wait until
commit before building a snapshot and decoding, but I haven't dug into the
details. Andres is the person who knows that area best.
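
To make the reordering idea concrete, here is a toy sketch in C. It is
not PostgreSQL's reorder buffer: every type and function name below
(ToyRecord, ToyReorderBuffer, toy_process and so on) is hypothetical, and
it assumes fixed-size arrays and nonzero xids. It only shows the shape of
the behaviour: interleaved changes are accumulated per xid, a commit
emits that transaction as one unit, and an abort drops it silently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy types; none of these are PostgreSQL's real structures. */
#define MAX_XACTS   8
#define MAX_CHANGES 16

typedef enum { REC_CHANGE, REC_COMMIT, REC_ABORT } ToyRecordKind;

typedef struct {
    ToyRecordKind kind;
    int           xid;      /* assumed nonzero; 0 marks a free slot below */
    const char   *data;     /* only used for REC_CHANGE */
} ToyRecord;

typedef struct {
    int         xid;        /* 0 means the slot is free */
    int         nchanges;
    const char *changes[MAX_CHANGES];
} ToyXact;

typedef struct {
    ToyXact xacts[MAX_XACTS];
} ToyReorderBuffer;

/* Find the slot for xid, or claim a free one; NULL if the toy buffer is full. */
static ToyXact *toy_get_xact(ToyReorderBuffer *rb, int xid)
{
    ToyXact *free_slot = NULL;

    for (int i = 0; i < MAX_XACTS; i++) {
        if (rb->xacts[i].xid == xid)
            return &rb->xacts[i];
        if (rb->xacts[i].xid == 0 && free_slot == NULL)
            free_slot = &rb->xacts[i];
    }
    if (free_slot != NULL) {
        free_slot->xid = xid;
        free_slot->nchanges = 0;
    }
    return free_slot;
}

/* Append "msg;" to the output string, standing in for sending to a client. */
static void toy_emit(char *out, size_t outlen, const char *msg)
{
    size_t used = strlen(out);

    if (used < outlen)
        snprintf(out + used, outlen - used, "%s;", msg);
}

/* Feed one record in log order; only a commit produces any output. */
static void toy_process(ToyReorderBuffer *rb, ToyRecord rec,
                        char *out, size_t outlen)
{
    ToyXact *txn = toy_get_xact(rb, rec.xid);

    assert(txn != NULL);            /* toy buffer has a fixed capacity */
    switch (rec.kind) {
    case REC_CHANGE:                /* just remember it for later */
        assert(txn->nchanges < MAX_CHANGES);
        txn->changes[txn->nchanges++] = rec.data;
        break;
    case REC_COMMIT:                /* replay the whole xact at once */
        toy_emit(out, outlen, "BEGIN");
        for (int i = 0; i < txn->nchanges; i++)
            toy_emit(out, outlen, txn->changes[i]);
        toy_emit(out, outlen, "COMMIT");
        txn->xid = 0;
        break;
    case REC_ABORT:                 /* client never hears about it */
        txn->xid = 0;
        break;
    }
}

/* Run a whole record stream through a fresh buffer. */
static void toy_decode(const ToyRecord *recs, int n, char *out, size_t outlen)
{
    ToyReorderBuffer rb;

    memset(&rb, 0, sizeof(rb));
    out[0] = '\0';
    for (int i = 0; i < n; i++)
        toy_process(&rb, recs[i], out, outlen);
}
```

Feeding it changes from xid 1 and xid 2 interleaved, then an abort of 2
and a commit of 1, yields only xid 1's changes, wrapped in BEGIN/COMMIT.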

As for why we don't return control to the walsender between change
callbacks when processing a reorder buffer at commit time, I'm not really
sure but suspect it's mostly down to ease of API design and development.
If control returned to the walsender between each change we'd need an
async API for the reorder buffer where you test to see if there's more
unprocessed work and call back into the reorder buffer again if there is.
So the reorder buffer has to keep state for the progress of replaying a
commit in a separate struct, handle repeated calls to process work, etc.
Also, since many individual changes are very small, that could lead to a
fair bit of overhead; it'd probably want to process a batch of changes then
return. Which makes it even more complicated.

If it returned control to the caller between changes then each caller would
also need to have the logic to test for more work and call back into the
reorder buffer. Both the walsender and SQL interface would need it. The way
it is, the loop is just in one place.
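
A rough sketch of what that resumable, batched interface might look like,
to show where the extra complexity comes from. This is purely
illustrative: ToyReplayState, toy_replay_some and toy_caller_loop are
made-up names, not anything in the tree, and the "changes" are just
strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cursor: progress through one committed xact's changes. */
typedef struct {
    const char *const *changes;
    size_t             nchanges;
    size_t             next;        /* index of the next change to send */
} ToyReplayState;

typedef void (*ToyChangeCb)(const char *change, void *arg);

/*
 * Process at most batch_size changes, then hand control back.
 * Returns true if there is still unprocessed work for this xact.
 */
static bool toy_replay_some(ToyReplayState *st, size_t batch_size,
                            ToyChangeCb cb, void *arg)
{
    size_t done = 0;

    assert(batch_size > 0);         /* otherwise no progress is possible */
    while (st->next < st->nchanges && done < batch_size) {
        cb(st->changes[st->next++], arg);
        done++;
    }
    return st->next < st->nchanges;
}

typedef struct {
    size_t sent;                    /* changes delivered so far */
    size_t returns_to_caller;       /* times control came back to us */
} ToyCallerStats;

static void toy_count_change(const char *change, void *arg)
{
    (void) change;
    ((ToyCallerStats *) arg)->sent++;
}

/*
 * The loop every caller (walsender, SQL interface) would have to carry:
 * call in, check for more work, do its own housekeeping, call in again.
 */
static ToyCallerStats toy_caller_loop(const char *const *changes, size_t n,
                                      size_t batch_size)
{
    ToyReplayState st = { changes, n, 0 };
    ToyCallerStats stats = { 0, 0 };
    bool           more;

    do {
        more = toy_replay_some(&st, batch_size, toy_count_change, &stats);
        stats.returns_to_caller++;  /* caller could poll client input here */
    } while (more);
    return stats;
}
```

With five changes and a batch size of two, control comes back to the
caller three times; with a batch size of one it comes back five times,
which is the per-change overhead mentioned above.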

It probably makes more sense to have a callback that can test state and
abort processing, like you've introduced. The callback could probably even
periodically check to see if there's client input to process and consume it.
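
For comparison, a minimal sketch of the callback approach, where the loop
stays in one place and the caller only supplies a "should I keep going?"
test. Again the names (toy_replay_commit, ToyContinueCb, ToyClient) are
hypothetical and this is not the code from the patch; checking before
every change rather than every N changes is just an assumption to keep
the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types; not the signatures used by the patch. */
typedef void (*ToySendCb)(const char *change, void *arg);
typedef bool (*ToyContinueCb)(void *arg);

/*
 * Replay a committed xact's changes in a single loop. Before each change,
 * ask the caller whether to continue; stop early if it says no.
 * Returns the number of changes actually sent.
 */
static size_t toy_replay_commit(const char *const *changes, size_t n,
                                ToySendCb send, ToyContinueCb keep_going,
                                void *arg)
{
    size_t sent = 0;

    for (size_t i = 0; i < n; i++) {
        if (keep_going != NULL && !keep_going(arg))
            break;                  /* caller asked us to abort replay */
        send(changes[i], arg);
        sent++;
    }
    return sent;
}

/* Toy caller state: pretend the client asks to stop after stop_after changes. */
typedef struct {
    size_t sent;
    size_t stop_after;
} ToyClient;

static void toy_send(const char *change, void *arg)
{
    (void) change;
    ((ToyClient *) arg)->sent++;
}

/* A real caller could instead look for pending client input here. */
static bool toy_client_still_wants_data(void *arg)
{
    ToyClient *client = arg;

    return client->sent < client->stop_after;
}
```

The reorder buffer keeps no resumable state here, and a caller that
never wants to stop can simply pass NULL for the callback.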

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
