From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: exactly what is COPY BOTH mode supposed to do in case of an error? |
Date: | 2013-04-27 19:12:18 |
Message-ID: | CA+TgmoY7Gwy0b2Cdc4rf_0_nF24roROMPdAmEiDsmZO+dcxW-w@mail.gmail.com |
Lists: | pgsql-hackers |
On Sat, Apr 27, 2013 at 6:02 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 27 April 2013 03:22, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> It seems the backend and libpq don't agree. The backend makes no
>> special provision to wait for a CopyDone message if an error occurs
>> during copy-both. It simply sends an ErrorResponse and that's it.
>> libpq, on the other hand, treats either CopyDone or ErrorResponse as a
>> cue to transition to PGASYNC_COPY_IN (see pqGetCopyData3).
>
> Well spotted, and good detective work.

Thanks.

>> I'm attaching a patch which adopts the position that the backend is
>> right and libpq is wrong. The opposite approach is also possible, but
>> I haven't tried to implement it. Or maybe there's a third way which
>> is better still.
>
> I guess if we assume this only affects replication we could change it
> at either end, not sure about that.
>
> libpq updates are much harder to roll out, so it would be better to
> assume that it is correct and the backend is wrong if we want to
> backpatch the fix.
>
> Not sure if that is a lot of work?

My feeling is that it would be better not to back-patch this, but just
fix it in master. Given the present uses of COPY-BOTH mode, the
problems seem to be limited to bad error messages, so it's arguably
not a critical bug fix. Also, I think that no matter which way we fix
it, people who upgrade the master to a new point release, but not
pg_receivexlog, would in some unlikely cases actually experience a
regression in the quality of error messages. I would say we have to
live with that if the consequences were any worse than bad error
messages in the first place, but as far as I can tell they're not. If
someone can contrive a scenario where this causes outright breakage,
that would tip the balance for me, but I don't at present see such a
hazard.

On a practical level, the main thing I didn't like about trying to fix
the server was the same issue that Tom mentioned: we'd need code in
the server to track whether COPY-BOTH mode is active and skip client
messages until we hit a CopyDone or CopyFail message. And I suspect
that code would be somewhat fragile, because having sent an
ErrorResponse already, we'd have no straightforward way to report a
further error - we'd need to report follow-on errors via NOTICE or
FATAL. Now this is not a disaster, but it's not great, either,
because there's a lot of code (including, notably, palloc) which
assumes that it can throw an ERROR whenever it likes. And in this
case, it couldn't.
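
To make that server-side alternative concrete, here is a minimal sketch of the kind of skip loop it would need. It is not PostgreSQL code: skip_until_copy_end is a hypothetical helper, and it works on a plain array of message type bytes rather than a real connection. The only things taken from the protocol are the frontend type bytes for CopyData ('d'), CopyDone ('c') and CopyFail ('f').

```c
#include <assert.h>
#include <stddef.h>

/* Frontend message type bytes from the PostgreSQL v3 protocol. */
#define MSG_COPY_DATA 'd'
#define MSG_COPY_DONE 'c'
#define MSG_COPY_FAIL 'f'

/*
 * Hypothetical helper, not PostgreSQL code.
 *
 * "types" holds the type bytes of the messages the client sends after the
 * server has already issued an ErrorResponse during COPY-BOTH.  Return how
 * many messages the server would have to discard, counting the terminating
 * CopyDone or CopyFail, before it could resume normal processing.
 * Return -1 if the stream ends before either one arrives.
 */
int
skip_until_copy_end(const char *types, size_t n)
{
    assert(types != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (types[i] == MSG_COPY_DONE || types[i] == MSG_COPY_FAIL)
            return (int) (i + 1);

        /*
         * Anything else (typically MSG_COPY_DATA) is silently dropped.
         * An ErrorResponse has already been sent, so this path could
         * not raise a second ERROR.
         */
    }
    return -1;
}
```

With the stream { 'd', 'd', 'c', 'Q' } the helper returns 3, leaving the trailing query for normal processing. The comment inside the loop is where the fragility comes from: real code on that path would have to avoid anything that can throw an ERROR.
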

The second thing I didn't like about that approach was that it would
make COPY-BOTH quite asymmetrical with both COPY-OUT and COPY-IN.
That didn't seem like a great idea, either.

A further point is that the problems in the back branches are less
serious anyway, because the timeline-switching code is the only thing
that ever tries to exit COPY-BOTH mode without closing the connection,
and that's new in 9.3.

So for all those reasons, my vote is for a client-side, master-only fix.
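
As a rough illustration of what the client-side change amounts to, here is a toy state transition function. It is a sketch under assumptions, not the patch itself: the CopyState enum and next_copy_state are hypothetical names only loosely modeled on libpq's PGASYNC_* states, and ST_BUSY merely stands in for "no longer in any copy mode". The backend type bytes for CopyDone ('c') and ErrorResponse ('E') are the real protocol values.

```c
#include <assert.h>

/* Hypothetical client-side states, loosely modeled on libpq's PGASYNC_*. */
typedef enum
{
    ST_COPY_BOTH,   /* both directions of the copy are open */
    ST_COPY_IN,     /* server is done sending; client may still send */
    ST_BUSY         /* copy is over; waiting for the command to finish */
} CopyState;

/*
 * Decide the next state when a backend message of type "msgtype" arrives.
 * Only the COPY-BOTH case is modeled here.
 */
CopyState
next_copy_state(CopyState cur, char msgtype)
{
    assert(cur == ST_COPY_BOTH || cur == ST_COPY_IN || cur == ST_BUSY);

    if (cur != ST_COPY_BOTH)
        return cur;             /* not modeled in this sketch */

    if (msgtype == 'c')         /* CopyDone: only the server side ends */
        return ST_COPY_IN;

    if (msgtype == 'E')         /* ErrorResponse: the whole copy is over */
        return ST_BUSY;

    return cur;                 /* e.g. CopyData: stay in COPY-BOTH */
}
```

The point of the sketch is the 'E' branch: the client matches the backend's existing behavior by treating an ErrorResponse as the end of the copy in both directions, instead of moving to a copy-in state and sending messages the server will never expect.
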

...Robert