Re: ATTACH/DETACH PARTITION CONCURRENTLY

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Sergei Kornilov <sk(at)zsrv(dot)org>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: ATTACH/DETACH PARTITION CONCURRENTLY
Date: 2019-01-25 21:18:15
Message-ID: 201901252118.zzpangcuddxz@alvherre.pgsql
Lists: pgsql-hackers

On 2019-Jan-25, Robert Haas wrote:

> I've finally gotten a little more time to work on this. It took me a
> while to understand that a PartitionedRelPruneInfo assumes that the
> indexes of partitions in the PartitionDesc don't change between
> planning and execution, because subplan_map[] and subpart_map[] are
> indexed by PartitionDesc offset.

Right, the planner/executor "disconnect" is one of the challenges, and
why I was trying to keep the old copy of the PartitionDesc around
instead of building updated ones as needed.

> I suppose the reason for this is so
> that we don't have to go to the expense of copying the partition
> bounds from the PartitionDesc into the final plan, but it somehow
> seems a little scary to me. Perhaps I am too easily frightened, but
> it's certainly a problem from the point of view of this project, which
> wants to let the PartitionDesc change concurrently.

Well, my definition of the problem started with the assumption that we
would keep the partition array indexes unchanged, so "change
concurrently" is what we needed to avoid. Yes, I realize that you've
opted to change that definition.

I may have forgotten some of your earlier emails on this, but one aspect
(possibly a key one) is that I'm not sure we really need to cope, other
than with an ERROR, with queries that continue to run across an
attach/detach -- more so in absurd scenarios such as the ones you
described where the detached table is later re-attached, possibly to a
different partitioned table. I mean, if we can just detect the case and
raise an error, and this lets us make it all work reasonably, that might
be better.

> I wrote a little patch that stores the relation OIDs of the partitions
> into the PartitionedRelPruneInfo and then, at execution time, does an
> Assert() that what it gets matches what existed at plan time. I
> figured that a good start would be to find a test case where this
> fails with concurrent DDL allowed, but I haven't so far succeeded in
> devising one. To make the Assert() fail, I need to come up with a
> case where concurrent DDL has caused the PartitionDesc to be rebuilt
> but without causing an update to the plan. If I use prepared queries
> inside of a transaction block, [...]

> I also had the idea of trying to use a cursor, because if I could
> start execution of a query, [...]

Those are the ways I thought of, and the reason for the shape of some of
those .spec tests. I wasn't able to hit the situation.

> Maybe if I try it with CLOBBER_CACHE_ALWAYS...

I didn't try this one.

> Anyway, I think this idea of passing a list of relation OIDs that we
> saw at planning time through to the executor and cross-checking them
> might have some legs. If we only allowed concurrently *adding*
> partitions and not concurrently *removing* them, then even if we find
> the case(s) where the PartitionDesc can change under us, we can
> probably just adjust subplan_map and subpart_map to compensate, since
> we can iterate through the old and new arrays of relation OIDs and
> just figure out which things have shifted to higher indexes in the
> PartitionDesc. This is all kind of hand-waving at the moment; tips
> appreciated.

I think detaching partitions concurrently is a necessary part of this
feature, so I would prefer not to go with a solution that works for
attaching partitions but not for detaching them. That said, I don't see
why it's impossible to adjust the partition maps in both cases. But I
don't have anything better than hand-waving ATM.
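
For illustration only, the OID-matching idea discussed above might be
sketched roughly as below. This is standalone C, not PostgreSQL code:
the function name remap_subplan_map, the use of -1 for "no subplan", and
the local Oid typedef are all assumptions made for the example. It
rebuilds a map indexed by the execution-time partition offsets from the
plan-time OID array, and reports how many plan-time partitions have
disappeared, so that a caller could either compensate or raise an error.

```c
#include <assert.h>

typedef unsigned int Oid;   /* stand-in for the real Oid type (assumption) */

/*
 * Hypothetical helper: rebuild a subplan map for the execution-time
 * partition array.
 *
 * old_oids/old_map describe the partitions as seen at plan time:
 * old_map[j] is the subplan index for the partition old_oids[j].
 * new_oids is the partition array seen at execution time.  On return,
 * new_map[i] holds the subplan index for new_oids[i], or -1 if that
 * partition was not known at plan time (i.e. attached since).
 *
 * Returns the number of plan-time partitions that no longer appear in
 * new_oids (i.e. detached since).  Assumes OIDs are unique within each
 * array.  The nested loop is O(nold * nnew), which is fine for a sketch.
 */
static int
remap_subplan_map(const Oid *old_oids, const int *old_map, int nold,
                  const Oid *new_oids, int *new_map, int nnew)
{
    int     found = 0;

    assert(nold >= 0 && nnew >= 0);

    for (int i = 0; i < nnew; i++)
    {
        new_map[i] = -1;        /* default: no subplan for this partition */

        for (int j = 0; j < nold; j++)
        {
            if (old_oids[j] == new_oids[i])
            {
                new_map[i] = old_map[j];
                found++;
                break;
            }
        }
    }

    return nold - found;
}
```

With plan-time OIDs {100, 200, 300} mapped to subplans {0, 1, 2} and
execution-time OIDs {100, 150, 300, 400}, this would fill new_map with
{0, -1, 2, -1} and return 1 for the vanished partition 200. What to do
with a nonzero return value (compensate or ERROR) is exactly the open
question here.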

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
