Re: Inconsistent error handling in START_REPLICATION command

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inconsistent error handling in START_REPLICATION command
Date: 2016-01-20 07:28:20
Message-ID: CAMsr+YEb19wo0HvN0FXnN9FYxMaEUL7x-VtuDvj1Uu3nK5n8nQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5 January 2016 at 18:35, Shulgin, Oleksandr <oleksandr(dot)shulgin(at)zalando(dot)de
> wrote:

> psycopg2test=# START_REPLICATION SLOT "test1" LOGICAL 0/0 ("invalid"
>> 'value');
>> unexpected PQresultStatus: 8
>>
>
I think the point here is that START_REPLICATION issues useful errors and
returns to the normal protocol up until the logical decoding startup
callback is invoked.

It enters COPY BOTH mode before it invokes the startup callback. The client
has no way to unilaterally terminate COPY BOTH mode and return to the
normal walsender protocol. The server doesn't do it when there's an ERROR.
So the only option the client has for recovery is to disconnect and
reconnect.

> I recall Craig Rigner mentioning this issue in context of the
>> pglogical_output plugin, but I thought that was something to do with the
>> startup packet. The behavior above doesn't strike me as very consistent,
>> we should be able to detect and report errors during output plugin startup
>> and let the client retry with the corrected command as we do for syntax or
>> other errors.
>>
>
I agree. This would also let clients have some limited facility for
negotiating options with the output plugin, albeit via the very clumsy
channel of libpq protocol error message records and their text.

> Looks like the attached patch fixes it for me:
>>
>
For those who didn't read it, the patch moves the start of COPY BOTH mode
to after the logical decoding startup callback is invoked.

This certainly fixes the immediate issue. It also forecloses something else
I'd really like though, which is the ability for plugins to send output to
clients from within their decoding startup hook.

For that reason I'd actually like to enter COPY BOTH mode before the
startup callback, as is currently the case. So I'd like to wrap the
decoding startup callback in a PG_TRY that catches an ERROR raised by the
startup callback (if any) and exits COPY BOTH mode before re-throwing.

Sound reasonable?

I'm looking toward a future where the decoding startup callback can emit a
structured, plugin-specific CopyData-wrapped message with information the
downstream might need to negotiate a connection. It could then exit back to
walsender mode by throwing an ERROR. Or, better, by setting some new
decoding context field that's checked after each callback to see if the
plugin wants to exit COPY BOTH mode and return to walsender protocol mode
without spamming the logs with an ERROR. That way we can finally do two-way
negotiation between downstream and upstream.

(Related to this, I also want to add a new decoding callback that lets the
downstream send arbitrary CopyData back up the wire to the upstream where
it's passed to the decoding callback. This would give the downstream a
backchannel to change settings, request things from the output plugin, etc,
w/o lots of disconnects and reconnects.)

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2016-01-20 07:34:59 Re: Logical decoding on standby
Previous Message Craig Ringer 2016-01-20 07:11:06 Re: Logical decoding on standby