Re: Delay locking partitions during query execution

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Delay locking partitions during query execution
Date: 2019-03-06 01:03:54
Message-ID: CAKJS1f_Zo7KGyAX33q9V+jMGFP5RNr3m22tD2avrY0uTVFaXRw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 6 Mar 2019 at 04:46, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> On 3/5/19 6:55 AM, David Rowley wrote:
> > The only way I can think to fix this is to just never lock partitions
> > at all, and if a lock is to be obtained on a partition, it must be
> > instead obtained on the top-level partitioned table. That's a rather
> > large change that could have large consequences, and I'm not even sure
> > it's possible since we'd need to find the top-level parent before
> > obtaining the lock, by then the hierarchy might have changed and we'd
> > need to recheck, which seems like quite a lot of effort just to obtain
> > a lock... Apart from that, it's not this patch, so looks like I'll
> > need to withdraw this one :-(
> >
>
> So you're saying we could
>
> 1) lookup the parent and lock it
> 2) repeat the lookup to verify it did not change
>
> I think that could still be a win, assuming that most hierarchies will
> be rather shallow (I'd say 2-3 levels will cover like 95% of cases, and
> 4 levels would be 100% in practice). And the second lookup should be
> fairly cheap thanks to syscache and the fact that the hierarchies do not
> change very often.

Actually, I'm not sure it could work at all. It does not seem very
safe to lookup a partition's parent without actually holding a lock on
the partition and we can't lock the partition and then lock each
parent in turn as that's the exact opposite locking order that we do
when querying a partitioned table, so could result in deadlocking.

Many moons ago in the 0002 patch in [1] I proposed to change the way
we store partition hierarchies which involved ditching bool
pg_class.relispartition in favour of Oid pg_class.relpartitionparent.
That would make parent lookups pretty fast, but... given the above it
seems like we couldn't do this at all.

[1] https://www.postgresql.org/message-id/CAKJS1f9QjUwQrio20Pi%3DyCHmnouf4z3SfN8sqXaAcwREG6k0zQ%40mail.gmail.com

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2019-03-06 01:09:35 Re: patch to allow disable of WAL recycling
Previous Message Alvaro Herrera 2019-03-06 01:03:38 Re: [HACKERS] CLUSTER command progress monitor