Re: group locking: incomplete patch, just for discussion

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: group locking: incomplete patch, just for discussion
Date: 2014-11-23 23:59:24
Lists: pgsql-hackers

On 2014-11-21 08:31:01 -0500, Robert Haas wrote:
> On Thu, Nov 20, 2014 at 8:31 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> >> I can't follow that logic. Why do you think self-exclusive locks are
> >> the only ones that matter?
> >
> > All the others can be acquired by jumping in front of the lock's
> > waitqueue?
> That's true, as a practical matter, in many cases. But from the
> point of view of the lock manager, it's really not true. It is almost
> true if you assume that the lock level acquired by the parallel
> backend will be weaker than the lock level held by the user backend,
> but not quite, because RowExclusiveLock conflicts with ShareLock.

The assumption that a parallel backend will acquire a lock level <= the
user backend's lock imo is a pretty sane one - we obviously can't
predict things otherwise. And I don't think any of the other presented
approaches can avoid that assumption either.

I'm not really bothered by RowExclusiveLock vs. ShareLock. Unless I'm
missing something, that can only be problematic if the primary acquires
a ShareLock and the worker then tries a RowExclusiveLock. That just
can't be sane.

> The assumption that the parallel backend's lock level will always be
> weaker seems shaky. It's probably true for relation locks, but if we
> ever want to support parallel workers that can write any data
> whatsoever, they're going to need to be able to take relation
> extension locks.

I don't think extension locks are a good argument here. As you noted
upthread, they really need to be used consistently across processes. And
they shouldn't/wouldn't be granted by your suggested mechanism either.

> If one backend in a parallel group attempts to
> acquire the relation extension lock while another worker holds it, the
> rule that all group members should be regarded as waiting for each
> other will result in instantaneous deadlock. That's pretty
> undesirable, and suggests to me that the whole model is defective in
> some way.

Shouldn't the deadlock checker verify that all involved processes are
actually waiting for a lock? It seems a bit pointless to declare a
deadlock when one node is still making progress. Doing otherwise seems
like it'd be an absolute PITA for anything involving system tables and
such.

> >> > I don't buy the plan time/execution time argument. We'll need to be able
> >> > to adapt plans to the availability of bgworker slots *anyway*. Otherwise
> >> > we either need to set the number of bgworkers to "MaxConnections *
> >> > MaxParallelism" or live with errors as soon as too many processes use
> >> > parallelism. The former is clearly unrealistic given PG's per-backend
> >> > overhead and the latter will be a *much* bigger PITA than all the
> >> > deadlock scenarios talked about here. It'll constantly fail, even if
> >> > everything is ok.
> >>
> >> I certainly think that any parallel node needs to be able to cope with
> >> getting fewer workers than it would ideally like to have. But that's
> >> different from saying that when our own backend takes new locks, we
> >> have to invalidate plans. Those seem like largely unrelated problems.
> >
> > No need to invalidate them if you have the infrastructure to run with no
> > parallel workers - just use the path that's used if there's no workers
> > available.
> Well, it's OK for a query to run with less parallelism than the
> planner thought optimal. It's not OK for it to attempt to run with
> the requested degree of parallelism and deadlock. I think if the
> deadlock detector gets involved it's quite unlikely that you're going
> to be able to make things degrade smoothly to non-parallel operation.

That was in response to your argument about plan invalidation - which I
don't buy, because it'd only reuse the graceful degradation capabilities
we need anyway.

> >> The heavyweight locking issue is really killing me, though.
> >
> > I don't really understand why you're not content with just detecting
> > deadlocks for now. Everything else seems like bells and whistles for
> > later.
> I don't think it's OK for a parallel query to just randomly deadlock.

I think that should be the end goal, yes.

> It's OK for parallel query to excuse itself politely and retire from
> the field, but I don't think it's OK for it to say, oh, sure, I can
> parallelize that for you, and then fail by deadlocking with itself.

But I also think that right now it doesn't necessarily need to be the
immediate focus. This is a topic that will be much easier to approach
once some demonstration of actual parallel users is available. It
doesn't have to be committable, but I think it'll help shape how this
has to look. Unrecognized deadlocks will make things too annoying to
use, but I don't think it needs to be guaranteed deadlock-free.


Andres Freund
