Re: Hot Standy introduced problem with query cancel behavior

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Joachim Wieland <joe(at)mcknight(dot)de>, Kris Jurka <books(at)ejurka(dot)com>
Subject: Re: Hot Standy introduced problem with query cancel behavior
Date: 2010-01-12 05:30:17
Message-ID: 4B4C08E9.1070804@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 01/07/10 22:37, Andres Freund wrote:
> On Thursday 07 January 2010 22:28:46 Tom Lane wrote:
>> Andres Freund<andres(at)anarazel(dot)de> writes:
>>> I did not want to suggest using Simons code there. Sorry for the brevity.
>>> should have read as "revert to old code and add two step killing (thats
>>> likely around 10 lines or so)".
>>>
>>> "two step killing" meaning that we signal ERROR for a few times and if
>>> nothing happens that we like, we signal FATAL.
>>> As the code already loops around signaling anyway that should be easy to
>>> implement.
>> Ah. This loop happens in the process that's trying to send the cancel
>> signal, correct, not the one that needs to respond to it? That sounds
>> fairly sane to me.
> Yes.
>
>
>> There are some things we could do to make it more likely that a cancel
>> of this type is accepted --- for instance, give it a distinct SQLSTATE
>> code that *can not* be trapped by plpgsql EXCEPTION blocks --- but there
>> is no practical way to guarantee it except elog(FATAL). I'm not
>> entirely convinced that an untrappable error would be a good thing
>> anyway; it's hard to argue that that's much better than a FATAL.
> Well a session which is usable after a transaction abort is quite sensible -
> quite some software I know handles a failing transaction much more gracefully
> than a session abort (e.g. because it has to deal with serialization failures
> and such anyway).
>
> So making it cought by fewer places and degrading to FATAL sound sensible and
> relatively easy to me.
> Unless somebody disagrees I will give it a shot.
Ok, here is a stab at that:

1. Allow the signal multiplexing facility to transfer one sig_atomic_t
worth of data. This is usefull e.g. for making operations idempotent
or more precise.

In this the LocalBackendId is transported - that should make it
impossible to cancel the wrong transaction

Use the signal multiplexing facility.

2.

AbortTransactionAndAnySubtransactions is only used in the mainloops
error handler as it should be unproblematic there.

In the current CVS code ConditionalVirtualXactLockTableWait() in
ResolveRecoveryConflictWithVirtualXIDs does the wait for every try of
cancelling the other transaction.

I moved the retry logic into CancelVirtualTransaction(). If 50 times a
ERROR does nothing it degrades to FATAL

XXX: I temporarily do not use the skipExistingConflicts argument of
GetConflictingVirtualXIDs - I dont understand its purpose and a bit of
infrastructure is missing right now as the recoveryConflictMode is not
stored anymore anywhere. Can easily be added back though.

3.
Add a new error code ERRCODE_QUERY_CANCELED_HS for use with HS
indicating a failure that is more than a plain
ERRCODE_QUERY_CANCELED - namely it should not be caught from
various places like savepoints and in PLs.

Exemplarily I checked for that error code in plpgsql and make it
uncatcheable.

I am not sure how new errorcode codes get chosen though - and the name
is not that nice.

Opinions on that?

I copied quite a bit from Simons earlier patch...

---

Currently the patch does not yet do anything to avoid letting the
protocol out of sync. What do you think about adding a flag for error
codes not to communicate with the client (Similarly to COMERROR)?

So that one could do an elog(ERROR & ERROR_NO_SEND_CLIENT, .. or such?

Andres

PS: My current MUA suffers from a wronggone upgrade currently, so no
idea how that message will appear.

Attachment Content-Type Size
hs-query-cancel.patch text/x-diff 26.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2010-01-12 05:40:02 Re: Hot Standy introduced problem with query cancel behavior
Previous Message Fujii Masao 2010-01-12 05:16:23 Re: Streaming replication status