Re: Stopping logical replication protocol

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Vladimir Gordiychuk <folyga(at)gmail(dot)com>
Cc: Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Stopping logical replication protocol
Date: 2016-09-26 02:32:18
Message-ID: CAMsr+YHygZgCtpLz5EgCYNsJUjTWkgB7TS7-cJaNx1_tfVBR6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 26 Sep. 2016 02:16, "Vladimir Gordiychuk" <folyga(at)gmail(dot)com> wrote:
>
> >It looks like you didn't import my updated patches, so I've rebased your
new patches on top of them.
> Yes, i forgot it, sorry. Thanks for you fixes.
>
> >I did't see you explain why this was removed:
>
> - /* fast path */
> - /* Try to flush pending output to the client */
> - if (pq_flush_if_writable() != 0)
> - WalSndShutdown();
> -
> - if (!pq_is_send_pending())
> - return;
>
> I remove it, because during decode long transaction code, we always exist
from function by fast path and not receive messages from client. Now we
always include in 'for' loop and executes
> /* Check for input from the client */ ProcessRepliesIfAny();

Makes sense.
I
> >Some of the changes to pg_recvlogical look to be copied from
> >receivelog.c, most notably the functions ProcessKeepalive and
> >ProcessXLogData .
>
> During refactoring huge function I also notice It, but decide not include
additional changes in already huge patch. I only split method that do
everything to few small functions.

Good decision. This improves things already and makes future refactoring
easier.

> >I was evaluating the tests and I don't think they actually demonstrate
> >that the patch works. There's nothing done to check that
> >pg_recvlogical exited because of client-initiated CopyDone. While
> >looking into that I found that it also never actually hits
> >ProcessCopyDone(...) because libpq handles the CopyDone reply from the
> >server its self and treats it as end-of-stream. So the loop in
> >StreamLogicalLog will just end and ProcessCopyDone() is dead code.
>
> The main idea was about fast stop logical replication as it available,
because if stop replication gets more them 1 seconds it takes more pain.

You should rely on time I tests as little as possible. Some of the test
hosts are VERY slow. If possible something deterministic should be used.

> Increase this timeout to 3 or 5 second hide problems that this PR try
fix, that's why I think this lines should be restore to state from previous
patch.

Per above I don't agree.

>
> ```
> ok($spendTime <= 5, # allow extra time for slow machines
> "pg_recvlogical exited promptly on signal when decoding");
>
> ok((time() - $cancelTime) <= 3, # allow extra time for slow machines
> "pg_recvlogical exited promptly on sigint when idle"
> );
>
> ```
>
> Also I do not understand why we do
>
> $proc->pump while $proc->pumpable;
>
> after send signal to process, why we can not just wait stop replication
slot?

Because verbose output can produce a lot of writes. If we don't pump the
buffer pg_recvlogical could block writing on its socket. Also it lets us
capture stderr which we need to observe that pg_recvlogical actually ended
copy on user command rather than just receiving all the input available.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tsunakawa, Takayuki 2016-09-26 02:37:15 Re: [RFC] Change the default of update_process_title to off
Previous Message Michael Paquier 2016-09-26 02:18:54 Re: pgsql: pg_ctl: Detect current standby state from pg_control