Re: Stopping logical replication protocol

From: Vladimir Gordiychuk <folyga(at)gmail(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Álvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Stopping logical replication protocol
Date: 2016-09-26 13:52:01
Message-ID: CAFgjRd1fa0Tizj+snG_pDtoo-GF73gsnVrA25TV82HYVQTqZOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>You should rely on time I tests as little as possible. Some of the test
hosts are VERY slow. If possible something deterministic should be used.

That's why this changes difficult to verify. Maybe in that case we should
write benchmark but not integration test?
In that case we can say, before this changes stopping logical replication
gets N ms but after apply changes it gets NN ms where NN ms less than N ms.
Is it available add this kind of benchmark to postgresql? I will be grateful
for links.

2016-09-26 5:32 GMT+03:00 Craig Ringer <craig(at)2ndquadrant(dot)com>:

> On 26 Sep. 2016 02:16, "Vladimir Gordiychuk" <folyga(at)gmail(dot)com> wrote:
> >
> > >It looks like you didn't import my updated patches, so I've
> rebased your new patches on top of them.
> > Yes, i forgot it, sorry. Thanks for you fixes.
> >
> > >I did't see you explain why this was removed:
> >
> > - /* fast path */
> > - /* Try to flush pending output to the client */
> > - if (pq_flush_if_writable() != 0)
> > - WalSndShutdown();
> > -
> > - if (!pq_is_send_pending())
> > - return;
> >
> > I remove it, because during decode long transaction code, we always
> exist from function by fast path and not receive messages from client. Now
> we always include in 'for' loop and executes
> > /* Check for input from the client */ ProcessRepliesIfAny();
>
> Makes sense.
> I
> > >Some of the changes to pg_recvlogical look to be copied from
> > >receivelog.c, most notably the functions ProcessKeepalive and
> > >ProcessXLogData .
> >
> > During refactoring huge function I also notice It, but decide not
> include additional changes in already huge patch. I only split method that
> do everything to few small functions.
>
> Good decision. This improves things already and makes future refactoring
> easier.
>
> > >I was evaluating the tests and I don't think they actually demonstrate
> > >that the patch works. There's nothing done to check that
> > >pg_recvlogical exited because of client-initiated CopyDone. While
> > >looking into that I found that it also never actually hits
> > >ProcessCopyDone(...) because libpq handles the CopyDone reply from the
> > >server its self and treats it as end-of-stream. So the loop in
> > >StreamLogicalLog will just end and ProcessCopyDone() is dead code.
> >
> > The main idea was about fast stop logical replication as it available,
> because if stop replication gets more them 1 seconds it takes more pain.
>
> You should rely on time I tests as little as possible. Some of the test
> hosts are VERY slow. If possible something deterministic should be used.
>
> > Increase this timeout to 3 or 5 second hide problems that this PR try
> fix, that's why I think this lines should be restore to state from previous
> patch.
>
> Per above I don't agree.
>
> >
> > ```
> > ok($spendTime <= 5, # allow extra time for slow machines
> > "pg_recvlogical exited promptly on signal when decoding");
> >
> > ok((time() - $cancelTime) <= 3, # allow extra time for slow machines
> > "pg_recvlogical exited promptly on sigint when idle"
> > );
> >
> > ```
> >
> > Also I do not understand why we do
> >
> > $proc->pump while $proc->pumpable;
> >
> > after send signal to process, why we can not just wait stop replication
> slot?
>
> Because verbose output can produce a lot of writes. If we don't pump the
> buffer pg_recvlogical could block writing on its socket. Also it lets us
> capture stderr which we need to observe that pg_recvlogical actually ended
> copy on user command rather than just receiving all the input available.
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-09-26 13:59:15 Re: Small race in pg_xlogdump --follow
Previous Message Magnus Hagander 2016-09-26 13:48:59 Re: Small race in pg_xlogdump --follow