Re: Adding support for Default partition in partitioning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Keith Fiske <keith(at)omniti(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding support for Default partition in partitioning
Date: 2017-04-05 15:19:31
Message-ID: CA+TgmoYGa2U3W5E25V7bUNsjaLYkOfrC0QiJ69DH3shJmCsJPg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 5, 2017 at 5:57 AM, Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:
>>Could you briefly elaborate why you think the lack global index support
>>would be a problem in this regard?
> I think following can happen if we allow rows satisfying the new partition
> to lie around in the
> default partition until background process moves it.
> Consider a scenario where partition key is a primary key and the data in the
> default partition is
> not yet moved into the newly added partition. If now, new data is added into
> the new partition
> which also exists(same key) in default partition there will be data
> duplication. If now
> we scan the partitioned table for that key(from both the default and new
> partition as we
> have not moved the rows) it will fetch the both rows.
> Unless we have global indexes for partitioned tables, there is chance of
> data duplication between
> child table added after default partition and the default partition.

Yes, I think it would be completely crazy to try to migrate the data
in the background:

- The migration might never complete because of a UNIQUE or CHECK
constraint on the partition to which rows are being migrated.

- Even if the migration eventually succeeded, such a design abandons
all hope of making INSERT .. ON CONFLICT DO NOTHING work sensibly
while the migration is in progress, unless the new partition has no
UNIQUE constraints.

- Partition-wise join and partition-wise aggregate would need to have
special case handling for the case of an unfinished migration, as
would any user code that accesses partitions directly.

- More generally, I think users expect that when a DDL command
finishes execution, it's done all of the work that there is to do (or
at the very least, that any remaining work has no user-visible
consequences, which would not be the case here).

IMV, the question of whether we have efficient ways to move data
around between partitions is somewhat separate from the question of
whether partitions can be defined in a certain way in the first place.
The problems that Keith refers to upthread already exist for
subpartitioning; you've got to detach the old partition, create a new
one, and then reinsert the data. And for partitioning an
unpartitioned table: create a replacement table, insert all the data,
substitute it for the original table. The fact that we have these
limitation is not good, but we're not going to rip out partitioning
entirely because we don't have clever ways of migrating the data in
those cases, and the proposed behavior here is not any worse.

Also, waiting for those problems to get fixed might be waiting for
Godot. I'm not really all that sanguine about our chances of coming
up with a really nice way of handling these cases. In a designed
based on table inheritance, you can leave it murky where certain data
is supposed to end up and migrate it on-line and you might get away
with that, but a major point of having declarative partitioning at all
is to remove that sort of murkiness. It's probably not that hard to
come up with a command that locks the parent and moves data around via
full table scans, but I'm not sure how far that really gets us; you
could do the same thing easily enough with a sequence of commands
generated via a script. And being able to do this in a general way
without a full table lock looks pretty hard - it doesn't seem
fundamentally different from trying to perform a table-rewriting
operation like CLUSTER without a full table lock, which we also don't
support. The executor is not built to cope with any aspect of the
table definition shifting under it, and that includes the set of child
tables with are partitions of the table mentioned in the query. Maybe
the executor can be taught to survive such definitional changes at
least in limited cases, but that's a much different project than
allowing default partitions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-04-05 15:25:46 Re: partitioned tables and contrib/sepgsql
Previous Message Stephen Frost 2017-04-05 15:13:33 Re: Rewriting the test of pg_upgrade as a TAP test