Inconsistent error handling in START_REPLICATION command

From: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
To: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Inconsistent error handling in START_REPLICATION command
Date: 2016-01-05 09:39:49
Message-ID: CACACo5QVAEOo0n6HFkuX_0_RO6oDLEbg9qk1CREyxz0i-zuVDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hackers,

It looks like there's an inconsistency in error handling during
START_REPLICATION command of replication protocol:

$ psql postgres://localhost/psycopg2test?replication=database
psql (9.6devel)
Type "help" for help.

psycopg2test=# IDENTIFY_SYSTEM;
systemid | timeline | xlogpos | dbname
---------------------+----------+-----------+--------------
6235978519197579707 | 1 | 0/2CE0F78 | psycopg2test
(1 row)

psycopg2test=# START_REPLICATION SLOT "TEST1" LOGICAL 0/0 ("invalid"
"value");
ERROR: syntax error

1) syntax errors are reported and client can retry with corrected command:

psycopg2test=# START_REPLICATION SLOT "TEST1" LOGICAL 0/0 ("invalid"
'value');
ERROR: replication slot name "TEST1" contains invalid character
HINT: Replication slot names may only contain lower case letters, numbers,
and the underscore character.

2) further errors are reported and we can retry:

psycopg2test=# START_REPLICATION SLOT "test1" LOGICAL 0/0 ("invalid"
'value');
ERROR: replication slot "test1" does not exist

psycopg2test=# CREATE_REPLICATION_SLOT "test1" LOGICAL "test_decoding";
slot_name | consistent_point | snapshot_name | output_plugin
-----------+------------------+---------------+---------------
test1 | 0/2CF5340 | 0000088C-1 | test_decoding
(1 row)

psycopg2test=# START_REPLICATION SLOT "test1" LOGICAL 0/0 ("invalid"
'value');
unexpected PQresultStatus: 8

The last command results in the following output sent to the server log:

ERROR: option "invalid" = "value" is unknown
CONTEXT: slot "test1", output plugin "test_decoding", in the startup
callback

But the client has no way to learn about the error, nor can it restart with
correct one (the server has entered COPY protocol mode):

psycopg2test=# START_REPLICATION SLOT "test1" LOGICAL 0/0;
PQexec not allowed during COPY BOTH

I recall Craig Rigner mentioning this issue in context of the
pglogical_output plugin, but I thought that was something to do with the
startup packet. The behavior above doesn't strike me as very consistent,
we should be able to detect and report errors during output plugin startup
and let the client retry with the corrected command as we do for syntax or
other errors.

I didn't look in the code yet, but if someone knows off top of the head the
reason to this behavior, I'd be glad to learn it.

Cheers.
--
*Oleksandr "Alex" Shulgin*
*Database Engineer*

Mobile: +49 160 84-90-639
Email: oleksandr(dot)shulgin(at)zalando(dot)de

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christoph Berg 2016-01-05 10:16:37 Re: Building pg_xlogdump reproducibly
Previous Message Tomas Vondra 2016-01-05 09:38:29 Re: GIN data corruption bug(s) in 9.6devel