Re: Replication protocol doc fix

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Replication protocol doc fix
Date: 2021-06-17 16:42:30
Message-ID: CA+Tgmobfr=8a-UngZm4p5TqQQQ84mKrXm6LqRmjH70BMP3m8pg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 16, 2021 at 5:15 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> * A normal command, where you know that you've sent everything that you
> will send. In this case, the client needs to send the Sync message in
> order to get the ReadyForQuery message.
>
> * A command that initiates CopyIn/CopyBoth, where you are going to send
> more data after the command. In this case, sending the Sync eagerly is
> wrong, and you can't pipeline more queries in the middle of
> CopyIn/CopyBoth mode. Instead, the client should send Sync after
> receiving an ErrorResponse, or after sending a CopyDone/CopyFail
> (right?).

Well, that's one view of it. I would argue that the protocol ought not
to be designed in such a way that the client has to guess what
response the server might send back. How is it supposed to know? If
the user says, hey, go run this via the extended query protocol, we
don't want libpq to have to try to parse the query text and figure out
whether it looks COPY-ish. That's expensive, hacky, and might create
cross-version compatibility hazards if, say, a new replication command
that uses the copy protocol is added. Nor do we want the user to have
to specify what it thinks the server is going to do. Right now, we
have this odd situation where the client indeed does not try to guess
what the server will do and always sends Sync, but the server acts as
if the client is doing what you propose here - only sending the
CopyDone/CopyFail at the end of everything associated with the
command.
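
To make the "client does not guess" point concrete, here is a rough Python sketch of the bytes a client might put on the wire for one extended-protocol query. It is not libpq's code: the helper names (msg, cstr, extended_query) are made up for illustration, the Describe step is left out, and only the unnamed statement and portal with no parameters are handled. What it shows is that the Sync is appended without ever looking at the query text, so a COPY gets exactly the same trailing Sync as a SELECT.

```python
import struct


def msg(type_byte: bytes, payload: bytes = b"") -> bytes:
    # One protocol message: a type byte, then an Int32 length that counts
    # itself and the payload (but not the type byte), then the payload.
    return type_byte + struct.pack("!i", len(payload) + 4) + payload


def cstr(s: str) -> bytes:
    # Strings on the wire are null-terminated.
    return s.encode("utf-8") + b"\x00"


def extended_query(sql: str) -> bytes:
    # Hypothetical helper: unnamed statement, unnamed portal, no parameters.
    parse = msg(b"P", cstr("") + cstr(sql) + struct.pack("!h", 0))
    bind = msg(b"B", cstr("") + cstr("") + struct.pack("!hhh", 0, 0, 0))
    execute = msg(b"E", cstr("") + struct.pack("!i", 0))  # 0 = no row limit
    sync = msg(b"S")
    # Sync is sent eagerly; the query text is never inspected.
    return parse + bind + execute + sync


if __name__ == "__main__":
    for sql in ("SELECT 1", "COPY t FROM STDIN"):
        print(sql, "->", extended_query(sql)[-5:])
```

A client that wanted to hold the Sync back for the COPY case would need some way to tell the two queries apart before the server has answered, which is the guessing described above.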

> One thing I don't fully understand is what would happen if the client
> issued the Sync as the *first* message in an extended-protocol series.

I don't think that will break anything, because I think you can send a
Sync message to try to reestablish protocol synchronization whenever
you want. But I don't think it will accomplish anything either,
because presumably you've already got protocol synchronization at the
beginning of the sequence. The tricky part is getting resynchronized
after you've done some stuff.
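
As a toy model of that resynchronization rule (my simplification, not the backend's code): after an error the server discards messages until it sees a Sync, and every Sync is answered with ReadyForQuery. The function name toy_server and the fails_at parameter are invented for this sketch, and COPY is deliberately ignored.

```python
# Normal replies in this toy model; the real protocol has more cases.
REPLIES = {
    "Parse": "ParseComplete",
    "Bind": "BindComplete",
    "Execute": "CommandComplete",
}


def toy_server(messages, fails_at=None):
    """Return the replies for a list of message names.

    fails_at is the index of a message that raises an error (hypothetical
    knob for the sketch); None means nothing fails.
    """
    responses = []
    skipping = False
    for i, name in enumerate(messages):
        if name == "Sync":
            # Sync always ends skip mode and yields ReadyForQuery.
            skipping = False
            responses.append("ReadyForQuery")
        elif skipping:
            # Discard everything between the error and the next Sync.
            continue
        elif i == fails_at:
            responses.append("ErrorResponse")
            skipping = True
        else:
            responses.append(REPLIES[name])
    return responses


if __name__ == "__main__":
    print(toy_server(["Sync"]))
    print(toy_server(["Parse", "Bind", "Execute", "Sync"], fails_at=0))
```

In this model a leading Sync just produces a ReadyForQuery and changes nothing else, while a Sync after a failed Parse is what gets the two sides lined up again.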

> I attached a doc patch that hopefully clarifies this point as well as
> the weirdness around CopyIn/CopyBoth and the extended protocol. I
> reorganized the sections, as well.

On a casual read-through this seems pretty reasonable, but it
essentially documents that libpq is doing the wrong thing by sending
Sync unconditionally. As I say above, I disagree with that from a
philosophical perspective. Then again, unless we're willing to
redefine the wire protocol, I don't have an alternative to offer.

--
Robert Haas
EDB: http://www.enterprisedb.com
