Re: Adding support for Default partition in partitioning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Keith Fiske <keith(at)omniti(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rahila Syed <rahilasyed90(at)gmail(dot)com>, Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding support for Default partition in partitioning
Date: 2017-04-06 05:18:51
Message-ID: eeda9687-1b0c-5f17-50a6-632d59d01259@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/04/06 13:08, Keith Fiske wrote:
> On Wed, Apr 5, 2017 at 2:51 PM, Keith Fiske wrote:
>> Only issue I see with this, and I'm not sure if it is an issue, is what
>> happens to that default constraint clause when 1000s of partitions start
>> getting added? From what I gather the default's constraint is built based
>> off the cumulative opposite of all other child constraints. I don't
>> understand the code well enough to see what it's actually doing, but if
>> there are no gaps, is the method used smart enough to aggregate all the
>> child constraints to make a simpler constraint that is simply outside the
>> current min/max boundaries? If so, for serial/time range partitioning this
>> should typically work out fine since there are rarely gaps. This actually
>> seems more of an issue for list partitioning where each child is a distinct
>> value or range of values that are completely arbitrary. Won't that check
>> and re-evaluation of the default's constraint just get worse and worse as
>> more children are added? Is there really even a need for the default to
>> have an opposite constraint like this? Not sure on how the planner works
>> with partitioning now, but wouldn't it be better to first check all
>> non-default children for a match the same as it does now without a default
>> and, failing that, then route to the default if one is declared? The
>> default should accept any data then so I don't see the need for the
>> constraint unless it's required for the current implementation. If that's
>> the case, could that be changed?

Unless I misread your last sentence, I think there might be some
confusion. Currently, the partition constraint (think of these as you
would of user-defined check constraints) is needed for two reasons: 1. to
prevent direct insertion of rows into the default partition for which a
non-default partition exists; no two partitions should ever have duplicate
rows. 2. so that planner can use the constraint to determine if the
default partition needs to be scanned for a query using constraint
exclusion; no need, for example, to scan the default partition if the
query requests only key=3 rows and a partition for the same exists (no
other partition should have key=3 rows by definition, not even the
default). As things stand today, planner needs to look at every partition
individually for using constraint exclusion to possibly exclude it, *even*
with declarative partitioning and that would include the default partition.

> Actually, thinking on this more, I realized this does again come back to
> the lack of a global index. Without the constraint, data could be put
> directly into the default that could technically conflict with the
> partition scheme elsewhere. Perhaps, instead of the constraint, inserts
> directly to the default could be prevented on the user level. Writing to
> valid children directly certainly has its place, but been thinking about
> it, and I can't see any reason why one would ever want to write directly to
> the default. It's use case seems to be around being a sort of temporary
> storage until that data can be moved to a valid location. Would still need
> to allow removal of data, though.

As mentioned above, the default partition will not allow directly
inserting a row whose key maps to some existing (non-default) partition.

As far as tuple-routing is concerned, it will choose the default partition
only if no other partition is found for the key. Tuple-routing doesn't
use the partition constraints directly per se, like one of the two things
mentioned above do. One could say that tuple-routing assigns the incoming
rows to partitions such that their individual partition constraints are
not violated.

Finally, we don't yet offer global guarantees for constraints like unique.
The only guarantee that's in place is that no two partitions can contain
the same partition key.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2017-04-06 05:19:58 Re: Multiple false-positive warnings from Valgrind
Previous Message Rushabh Lathia 2017-04-06 05:17:13 Re: Adding support for Default partition in partitioning