Re: FATAL: could not send end-of-streaming message to primary: no COPY in progress

From: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FATAL: could not send end-of-streaming message to primary: no COPY in progress
Date: 2016-04-20 07:16:40
Message-ID: CAHGQGwHvzV2J0QodA8x1xCx3CbaBmJTveQeoLFzX8hq5G25jEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 31, 2016 at 9:15 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> Hi hackers,
>
> If you shut down a primary server, a standby that is streaming from it says54:
>
> LOG: replication terminated by primary server
> DETAIL: End of WAL reached on timeline 1 at 0/14F4B68.
> FATAL: could not send end-of-streaming message to primary: no COPY in progress
>
> Isn't that FATAL ereport a bug?

ISTM that the cause is that walsender exits and replication connection is
closed just after "COPY 0" is sent. That is, then after receiving "COPY 0",
walreceiver tries to send an end-of-copy message to the primary, but fails
because the connection has been already closed.

> How is clean server shutdown supposed to work?

One option is to make walsender wait for end-of-copy message from walreceiver
before it closes the connection and exits, after sending "COPY 0" message.
But one question is; how should walsender behave when walreceiver gets stuck
and cannot reply an end-of-copy message to walsender? Probably we need
the timeout (maybe we can use wal_sender_timeout here but not sure yet
if it's appropriate or not).

Another option is to prevent walreceiver from sending an end-of-copy message.
If "COPY 0" always means the exit of walsender and the termination of
the connection, there seems to be no need to send back an end-of-copy message.
I've not checked yet how this interferes with other replication logics, though.

Regards,

--
Fujii Masao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2016-04-20 07:20:43 Re: Description of ForeignPath
Previous Message Kyotaro HORIGUCHI 2016-04-20 07:16:37 Re: Support for N synchronous standby servers - take 2