I tried two ways of resolving the global deadlock situation
during the PostgresForest development.
(1) Use lock_timeout to avoid a global deadlock.
The lock_timeout feature is a very simple way to avoid
the global deadlock situation.
I also disagree that "statement_timeout is the way to avoid global
deadlocks", because statement_timeout kills
healthy, long-running transactions when it expires.
Some developers (including me!) proposed the lock_timeout
I still believe the "lock timeout" feature could help
resolve a global deadlock in a cluster environment.
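To illustrate the difference between the two timeouts, here is a minimal Python sketch. It is not PostgreSQL code: threading.Lock stands in for a database lock, and the helper name with_lock_timeout and the timeout value are hypothetical. The point it tries to show is that only the time spent waiting for a lock is bounded, so a transaction that already holds its locks can run as long as it needs.

```python
import threading

def with_lock_timeout(locks, work, lock_timeout):
    """Run work() while holding all locks; give up if any lock wait is too long.

    Returns (True, result) on success, or (False, None) if waiting for a
    lock exceeded lock_timeout seconds. The running time of work() itself
    is never limited, unlike a statement-level timeout.
    """
    held = []
    try:
        for lock in locks:
            # Bound only the wait for this lock, not the whole operation.
            if not lock.acquire(timeout=lock_timeout):
                return (False, None)
            held.append(lock)
        return (True, work())
    finally:
        # Release whatever was acquired, so a waiter that gives up
        # also breaks any cycle it was part of.
        for lock in reversed(held):
            lock.release()
```

A caller that receives (False, None) would treat it as an aborted transaction and retry later; because the locks it held are released, the other transactions in a cycle can proceed.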
(2) Use the global wait-for graph to detect a global deadlock.
I had an experimental implementation that used the global wait-for
graph to prevent the global deadlock.
I used the node (server) identifiers and the pg_locks information
to build the global wait-for graph, and the kill signal
(or pg_cancel_backend()?) to abort a victim transaction causing
the deadlock.
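The idea can be sketched in Python as follows. This is an illustration only, not the PostgresForest code. It assumes that each node reports (waiter, holder) pairs derived from something like pg_locks, and that transactions carry a cluster-wide identifier so the same transaction can be recognized on different nodes. The function names and the "largest identifier loses" victim rule are hypothetical.

```python
from collections import defaultdict

def build_global_wait_for_graph(node_reports):
    """Merge per-node wait edges into one graph.

    node_reports maps a node identifier to a list of (waiter, holder)
    pairs, meaning transaction `waiter` is blocked by `holder` on that node.
    """
    graph = defaultdict(set)
    for edges in node_reports.values():
        for waiter, holder in edges:
            graph[waiter].add(holder)
    return graph

def find_cycle(graph):
    """Return one cycle as a list of transaction ids, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for nxt in sorted(graph.get(txn, ())):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle is closed.
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return None

    for txn in sorted(graph):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None

def choose_victim(cycle):
    """Pick the victim deterministically (assumed rule: largest id)."""
    return max(cycle)

# Neither node sees a cycle locally, but the merged graph has one.
reports = {
    "node1": [("T1", "T2")],   # on node1, T1 waits for T2
    "node2": [("T2", "T1")],   # on node2, T2 waits for T1
}
cycle = find_cycle(build_global_wait_for_graph(reports))
victim = choose_victim(cycle) if cycle else None
```

A deterministic victim rule matters because every node that evaluates the same graph should abort the same transaction.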
I don't think a callback function is needed to replace
the current deadlock resolution feature,
but I agree we need a consensus on how we could avoid
the global deadlock situation in the cluster.
On 2010/02/06 18:13, Markus Wanner wrote:
> I'd like to start a thread for discussion of the second item on the
> ClusterFeatures list: Global Deadlock Information.
> IIRC there are two aspects to this item: a) the plain notification of a
> deadlock and b) some way to control or intercept deadlock resolution.
> The problem this item seems to address is the potential for deadlocks
> between transactions on different nodes. Or put another way: between a
> local transaction and one that's to be applied from a remote node (or
> even between two remote ones - similar issue, though). To ensure
> congruency between nodes, they must take the same measures to resolve
> the deadlock, i.e. abort the same transaction(s).
> I certainly disagree with the statement on the wiki that the
> "statement_timeout is the way to avoid global deadlocks", because I
> don't want to have to wait that long until a deadlock gets resolved.
> Further it doesn't even guarantee congruency, depending on the
> implementation of your clustering solution.
> I fail to see how a plain notification API would help much. After all,
> this could result in one node notifying having aborted transaction A to
> resolve a deadlock while another node notifies having aborted
> transaction B. You'd end up having to abort two (or more) transactions
> instead of just one to resolve a conflict.
> It could get more useful, if enabling such a notification would turn off
> the existing deadlock resolver and leave the resolution of the deadlock
> to the clustering solution. I'd call that an interception.
> Such an interception API should IMO provide a way to register a
> callback, which replaces the current deadlock resolver. Upon detection
> of a deadlock, the callback should get a list of transaction ids that
> are part of the lock cycle. It's then up to that callback to choose one
> and abort that to resolve the conflict.
> And now, Greg's List:
> > 1) What feature does this help add from a user perspective?
> Preventing cluster-wide deadlocks (while maintaining congruency of
> > 2) Which replication projects would be expected to see an improvement
> > from this addition?
> I suspect all multi-master solutions are affected, certainly Postgres-R
> would benefit. Single-master ones certainly don't need it.
> > 3) What makes it difficult to implement?
> I don't see any real stumbling block. Deciding on an API needs consensus.
> > 4) Are there any other items on the list this depends on, or that it
> > is expected to have a significant positive/negative interaction with?
> Not that I know of.
> > 5) What replication projects include a feature like this already, or a
> > prototype of a similar one, that might be used as a proof of concept
> > or example implementation?
> Old Postgres-R versions once had such an interception, but it currently
> lacks a solution for this problem. I don't know of any other project
> that's already solved this.
> > 6) Who is already working on it/planning to work on it/needs it for
> > their related project?
> I'm not currently working on it and don't plan to do so (at least) until
> PgCon 2010.
> Cluster hackers, is this a good summary which covers your needs as well?
> Something missing?
> Markus Wanner
NAGAYASU Satoshi <satoshi(dot)nagayasu(at)gmail(dot)com>