Re: ATTACH/DETACH PARTITION CONCURRENTLY

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: ATTACH/DETACH PARTITION CONCURRENTLY
Date: 2018-08-09 13:55:58
Message-ID: CAKJS1f9Kw-_Km_+2Bub3cHTXoLt=timDmWwrRT6_MOU4Xm519w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8 August 2018 at 01:29, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2018-08-08 01:23:51 +1200, David Rowley wrote:
>> I'm not proposing that sessions running older snapshots can't see that
>> there's a new partition. The code I have uses PartitionIsValid() to
>> test if the partition should be visible to the snapshot. The
>> PartitionDesc will always contain details for all partitions stored in
>> pg_partition whether they're valid to the current snapshot or not. I
>> did it this way as there's no way to invalidate the relcache based on
>> a point in transaction, only a point in time.
>
> I don't think that solves the problem that an arriving relcache
> invalidation would trigger a rebuild of rd_partdesc, while it actually
> is referenced by running code.
>
> You'd need to build infrastructure to prevent that.
>
> One approach would be to make sure that everything relying on
> rt_partdesc staying the same stores its value in a local variable, and
> then *not* free the old version of rt_partdesc (etc) when the refcount >
> 0, but delay that to the RelationClose() that makes refcount reach
> 0. That'd be the start of a framework for more such concurrenct
> handling.

I'm not so sure not freeing the partdesc until the refcount reaches 0
is safe. As you'd expect, we hold a lock on a partitioned table
between the planner and executor, but the relation has been closed and
the ref count returns to 0, which means when the relation is first
opened in the executor that the updated PartitionDesc is obtained. A
non-concurrent attach would have been blocked in this case due to the
lock being held by the planner. Instead of using refcount == 0,
perhaps we can release the original partdesc only when there are no
locks held by us on the relation.

It's late here now, so I'll look at that tomorrow.

I've attached what I was playing around with. I think I'll also need
to change RelationGetPartitionDesc() to have it return the original
partdesc, if it's non-NULL.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
dont_destroy_original_partdesc_on_rel_inval.patch application/octet-stream 3.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-08-09 14:03:43 Re: remove ancient pre-dlopen dynloader code
Previous Message Andrew Dunstan 2018-08-09 13:31:44 Re: buildfarm: could not read block 3 in file "base/16384/2662": read only 0 of 8192 bytes