Re: ATTACH/DETACH PARTITION CONCURRENTLY

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Sergei Kornilov <sk(at)zsrv(dot)org>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: ATTACH/DETACH PARTITION CONCURRENTLY
Date: 2019-01-31 23:00:56
Message-ID: 201901312300.ddqydnme2hdw@alvherre.pgsql
Lists: pgsql-hackers

On 2019-Jan-31, Robert Haas wrote:

> OK, that seems to be pretty easy. New patch series attached. The
> patch with that new logic is 0004. I've consolidated some of the
> things I had as separate patches in my last post and rewritten the
> commit messages to explain more clearly the purpose of each patch.

Looks awesome.

> - For now, I haven't tried to handle the DETACH PARTITION case. I
> don't think there's anything preventing someone - possibly even me -
> from implementing the counter-based approach that I described in the
> previous message, but I think it would be good to have some more
> discussion first on whether it's acceptable to make concurrent queries
> error out. I think any queries that were already up and running would
> be fine, but any that were planned before the DETACH and tried to
> execute afterwards would get an ERROR. That's fairly low-probability,
> because normally the shared invalidation machinery would cause
> replanning, but there's a race condition, so we'd have to document
> something like: if you use this feature, it'll probably just work, but
> you might get some funny errors in other sessions if you're unlucky.
> That kinda sucks but maybe we should just suck it up. Possibly we
> should consider making the concurrent behavior optional, so that if
> you'd rather take blocking locks than risk errors, you have that
> option. Of course I guess you could also just let people do an
> explicit LOCK TABLE if that's what they want. Or we could try to
> actually make it work in that case, I guess by ignoring the detached
> partitions, but that seems a lot harder.

I think telling people to do LOCK TABLE beforehand if they care about
errors is sufficient. On the other hand, I do hope that we're only
going to cause queries to fail if they would affect the partition that's
being detached and not other partitions in the table. Or maybe because
of the replanning on invalidation this doesn't matter as much as I think
it does.
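
For illustration, a minimal sketch of that workaround. The table names
(measurements, measurements_2018) are hypothetical, and ACCESS EXCLUSIVE
is only an assumption about which lock mode a cautious user would pick;
nothing in this thread settles on one.

```sql
BEGIN;
-- Take a blocking lock on the parent first, so that concurrent queries
-- wait here instead of racing with the detach. (Assumed lock mode.)
LOCK TABLE measurements IN ACCESS EXCLUSIVE MODE;
-- Detach while the lock is held; it is released at COMMIT.
ALTER TABLE measurements DETACH PARTITION measurements_2018;
COMMIT;
```

The intent is that other sessions block on the lock and process the
invalidation before they execute, which trades away exactly the
concurrency the non-blocking path is meant to provide.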

> - 0003 doesn't have any handling for parallel query at this point, so
> even though within a single backend a single query execution will
> always get the same PartitionDesc for the same relation, the answers
> might not be consistent across the parallel group.

That doesn't sound good. I think the easiest fix would be to just
serialize the PartitionDesc and send it to the workers instead of having
them recompute it, but then I worry that this might perform badly when
the partition desc is large. (Or maybe sending bytes over pqmq is
faster than reading all those catalog entries, so this isn't a concern
anyway.)

> - 0003 also changes the order in which locks are acquired. I am not
> sure whether we care about this, especially in view of other pending
> changes.

Yeah, the drawbacks of the unpredictable locking order are worrisome,
but then the performance gain is hard to dismiss. That applies not only
to this patch but to the others too. If we're okay with the others
going in, I guess we don't have concerns about this one either.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
