Re: walsender termination error messages worse in v10

From: Andres Freund <andres(at)anarazel(dot)de>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Peter Eisentraut <peter_e(at)gmx(dot)net>
Subject: Re: walsender termination error messages worse in v10
Date: 2017-06-08 21:57:19
Message-ID: 20170608215719.loangt3x4kimfh25@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-06-03 00:55:22 +0200, Petr Jelinek wrote:
> On 02/06/17 23:45, Andres Freund wrote:
> > Hi Petr,
> >
> > On 2017-06-02 22:57:37 +0200, Petr Jelinek wrote:
> >> On 02/06/17 20:51, Andres Freund wrote:
> >>> I don't understand why the new block is there, nor does the commit
> >>> message explain it.
> >>>
> >>
> >> Hmm, that particular change can actually be reverted. It was needed for
> >> one those custom replication commands which were replaced by normal
> >> query support. I have missed it during the rewrite.
> >
> > Doesn't appear to be quite that simple, I get regression test failures
> > in that case.
> >
>
> Hmm, looks like we still use it for normal COPY handling. So basically
> the problem is that if we run COPY TO STDOUT and then consume it using
> the libpqrcv_receive it will end with normal PGRES_COMMAND_OK but we
> need to call PQgetResult() in that case otherwise libpq thinks the
> command is still active and any following command will fail, but if we
> call PQgetResult on dead connection we get that error you complained about.

Should this possibly handled at the caller level? This is a bit too
much magic for my taste.

Looking at the callers, the new code isn't super-obvious either:

len = walrcv_receive(wrconn, &buf, &fd);

if (len != 0)
{
/* Process the data */
for (;;)
{
CHECK_FOR_INTERRUPTS();

if (len == 0)
{
break;
}
else if (len < 0)
{
ereport(LOG,
(errmsg("data stream from publisher has ended")));
endofstream = true;
break;
}

The len < 0, hidden inside a len != 0, which in the loop again chcks if
len == 0 (because it's decremented in loop)? And there's two different[5~
len = walrcv_receive(wrconn, &buf, &fd);
statements?

> I guess it would make sense to do conditional exit on
> (PQstatus(streamConn) == CONNECTION_BAD) like libpqrcv_PQexec does. It's
> quite ugly code-wise though.

I think that's fine for now. It'd imo be a good idea to improve matters
here a bit, but for now I've just applied your patch.

- Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2017-06-08 22:08:04 Re: PG10 transition tables, wCTEs and multiple operations on the same table
Previous Message Tom Lane 2017-06-08 21:18:05 Re: PG10 transition tables, wCTEs and multiple operations on the same table