Re: Reviving the "Stopping logical replication protocol" patch from Vladimir Gordichuk

From: Dave Cramer <davecramer(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Vladimir Gordiychuk <folyga(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Reviving the "Stopping logical replication protocol" patch from Vladimir Gordichuk
Date: 2019-01-20 16:40:19
Message-ID: CADK3HHLVvu_BMWNUEeFYWZ1G_4ia3pOGjG4qw5kZZa3wqOJmhA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dave Cramer

On Tue, 15 Jan 2019 at 07:53, Dave Cramer <davecramer(at)gmail(dot)com> wrote:

>
> Dave Cramer
>
>
> On Sun, 13 Jan 2019 at 23:19, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
>
>> On Mon, 3 Dec 2018 at 19:38, Dave Cramer <davecramer(at)gmail(dot)com> wrote:
>>
>>> Dmitry,
>>>
>>> Please see attached rebased patches
>>>
>>
>> I'm fine with patch 0001, though I find this comment a bit hard to follow:
>>
>> + * The send_data callback must enqueue complete CopyData messages to
>> libpq
>> + * using pq_putmessage_noblock or similar, since the walsender loop may
>> send
>> + * CopyDone then exit and return to command mode in response to a client
>> + * CopyDone between calls to send_data.
>>
>>
> I think it needs splitting up into a couple of sentences.
>>
>> Fair point, remember it was originally written by a non-english speaker
>

Thoughts on below ?

+/*
+ * Main loop of walsender process that streams the WAL over Copy messages.
+ *
+ * The send_data callback must enqueue complete CopyData messages to the
client
+ * using pq_putmessage_noblock or similar
+ * In order to preserve the protocol it is necessary to send all of the
existing
+ * messages still in the buffer as the WalSender loop may send
+ * CopyDone then exit and return to command mode in response to a client
+ * CopyDone between calls to send_data.
+ */

>
>> In patch 0002, stopping during a txn. It's important that once we skip
>> any action, we continue skipping. In patch 0002 I'd like it to be clearer
>> that we will *always* skip the rb->commit callback
>> if rb->continue_decoding_cb() returned false and aborted the while loop. I
>> suggest storing the result of (rb->continue_decoding_cb == NULL ||
>> rb->continue_decoding_cb()) in an assignment within the while loop, and
>> testing that result later.
>>
>> e.g.
>>
>> (continue_decoding = (rb->continue_decoding_cb == NULL ||
>> rb->continue_decoding_cb()))
>>
>> and later
>>
>> if (continue_decoding) {
>> rb->commit(rb, txn, commit_lsn);
>> }
>>
>> Will do
>

Hmmm... I don't actually see how this is any different than what we have in
the patch now where below would the test occur?

> I don't actually think it's necessary to re-test the continue_decoding
>> callback and skip commit here. If we've sent all the of the txn
>> except the commit, we might as well send the commit, it's tiny and might
>> save some hassle later.
>>
>>
>> I think a comment on the skipped commit would be good too:
>>
>> /*
>> * If we skipped any changes due to a client CopyDone we must not send a
>> commit
>> * otherwise the client would incorrectly think it received a complete
>> transaction.
>> */
>>
>> I notice that the fast-path logic in WalSndWriteData is removed by this
>> patch. It isn't moved; there's no comparison of last_reply_timestamp
>> and wal_sender_timeout now. What's the rationale there? You want to ensure
>> that we reach ProcessRepliesIfAny() ? Can we cheaply test for a readable
>> client socket then still take the fast-path if it's not readable?
>>
>
> This may have been a mistake as I am taking this over from a very old
> patch that I did not originally write. I'll have a look at this
>

OK, I'm trying to decipher the original intent of this patch as well as I
didn't write it.

There are some hints here
https://www.postgresql.org/message-id/CAFgjRd1LgVbtH%3D9O9_xvKQjvUP7aRF-edxqwKfaNs9hMFW_4gw%40mail.gmail.com

As to why the fast path logic was removed. Does it make sense to you?

Dave

>
>>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2019-01-20 16:55:41 Re: PostgreSQL vs SQL/XML Standards
Previous Message Tomas Vondra 2019-01-20 16:19:55 Re: could not access status of transaction