Re: Adding support for Default partition in partitioning

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Keith Fiske <keith(at)omniti(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding support for Default partition in partitioning
Date: 2017-04-06 05:17:13
Message-ID: CAGPqQf19qAKOyn5aJLgoi7UDFwPVe=PoHzUhdk2Aij56GK4qEQ@mail.gmail.com
Lists: pgsql-hackers

On 2017/04/06 0:19, Robert Haas wrote:
> On Wed, Apr 5, 2017 at 5:57 AM, Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:
>>> Could you briefly elaborate why you think the lack global index support
>>> would be a problem in this regard?
>> I think following can happen if we allow rows satisfying the new partition
>> to lie around in the default partition until background process moves it.
>> Consider a scenario where partition key is a primary key and the data in the
>> default partition is not yet moved into the newly added partition. If now,
>> new data is added into the new partition which also exists(same key) in
>> default partition there will be data duplication. If now we scan the
>> partitioned table for that key(from both the default and new partition as we
>> have not moved the rows) it will fetch the both rows.
>> Unless we have global indexes for partitioned tables, there is chance of
>> data duplication between child table added after default partition and the
>> default partition.
>
> Yes, I think it would be completely crazy to try to migrate the data
> in the background:
>
> - The migration might never complete because of a UNIQUE or CHECK
> constraint on the partition to which rows are being migrated.
>
> - Even if the migration eventually succeeded, such a design abandons
> all hope of making INSERT .. ON CONFLICT DO NOTHING work sensibly
> while the migration is in progress, unless the new partition has no
> UNIQUE constraints.
>
> - Partition-wise join and partition-wise aggregate would need to have
> special case handling for the case of an unfinished migration, as
> would any user code that accesses partitions directly.
>
> - More generally, I think users expect that when a DDL command
> finishes execution, it's done all of the work that there is to do (or
> at the very least, that any remaining work has no user-visible
> consequences, which would not be the case here).

Thanks, Robert, for this explanation. It makes clear why moving rows in the
background is not a sensible idea.
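
To make the duplication scenario concrete, here is a minimal Python sketch. It
is only a toy model, not PostgreSQL code: the Partition class, the scan()
helper and the partition names are hypothetical. It assumes each partition
enforces uniqueness only locally, which is all we get without global indexes.

```python
class Partition:
    """A partition with a local unique index on the key (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}  # key -> payload; uniqueness is enforced per partition only

    def insert(self, key, payload):
        if key in self.rows:
            raise ValueError(f"duplicate key {key} in {self.name}")
        self.rows[key] = payload


def scan(partitions, key):
    """Return every (partition name, payload) pair stored under key."""
    return [(p.name, p.rows[key]) for p in partitions if key in p.rows]


default = Partition("default")
default.insert(42, "old row")    # routed to default before the new partition existed

new_part = Partition("p_40_50")  # new partition added; key 42 not yet migrated
new_part.insert(42, "new row")   # the local index sees no conflict, so this succeeds

duplicates = scan([default, new_part], 42)
print(duplicates)
```

The scan returns two rows for the same primary key value, one from each
partition, because neither local index can see the other partition's row.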

On Thu, Apr 6, 2017 at 9:38 AM, Keith Fiske <keith(at)omniti(dot)com> wrote:

>
> On Wed, Apr 5, 2017 at 2:51 PM, Keith Fiske <keith(at)omniti(dot)com> wrote:
>
>>
>>
>> Only issue I see with this, and I'm not sure if it is an issue, is what
>> happens to that default constraint clause when 1000s of partitions start
>> getting added? From what I gather the default's constraint is built based
>> off the cumulative opposite of all other child constraints. I don't
>> understand the code well enough to see what it's actually doing, but if
>> there are no gaps, is the method used smart enough to aggregate all the
>> child constraints to make a simpler constraint that is simply outside the
>> current min/max boundaries? If so, for serial/time range partitioning this
>> should typically work out fine since there are rarely gaps. This actually
>> seems more of an issue for list partitioning where each child is a distinct
>> value or range of values that are completely arbitrary. Won't that check
>> and re-evaluation of the default's constraint just get worse and worse as
>> more children are added? Is there really even a need for the default to
>> have an opposite constraint like this? Not sure on how the planner works
>> with partitioning now, but wouldn't it be better to first check all
>> non-default children for a match the same as it does now without a default
>> and, failing that, then route to the default if one is declared? The
>> default should accept any data then so I don't see the need for the
>> constraint unless it's required for the current implementation. If that's
>> the case, could that be changed?
>>
>> Keith
>>
>
> Actually, thinking on this more, I realized this does again come back to
> the lack of a global index. Without the constraint, data could be put
> directly into the default that could technically conflict with the
> partition scheme elsewhere. Perhaps, instead of the constraint, inserts
> directly to the default could be prevented on the user level. Writing to
> valid children directly certainly has its place, but been thinking about
> it, and I can't see any reason why one would ever want to write directly to
> the default. It's use case seems to be around being a sort of temporary
> storage until that data can be moved to a valid location. Would still need
> to allow removal of data, though.
>
> Not sure if that's even a workable solution. Just trying to think of ways
> around the current limitations and still allow this feature.
>
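
On the question of whether the default's constraint could be simplified when
the ranges have no gaps, the idea can be sketched as below. This is only an
illustration in Python, not what the patch does: merge_ranges() and
default_accepts() are hypothetical names, and the bounds are assumed to be
half-open (lower bound inclusive, upper bound exclusive).

```python
def merge_ranges(bounds):
    """Coalesce half-open [lo, hi) ranges that touch or overlap."""
    merged = []
    for lo, hi in sorted(bounds):
        if merged and lo <= merged[-1][1]:
            # This range starts where the previous one ends (or earlier): extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def default_accepts(key, bounds):
    """The default takes a key only if no sibling range covers it."""
    return not any(lo <= key < hi for lo, hi in merge_ranges(bounds))


gapless = [(0, 10), (10, 20), (20, 30)]
with_hole = [(0, 10), (20, 30)]
print(merge_ranges(gapless))
print(merge_ranges(with_hole))
```

With gapless ranges the negated constraint collapses to a single "outside the
min/max" test, however many partitions there are. For list partitioning with
arbitrary values no such merging is possible, so the check grows with the
number of children.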

I like the idea of having a DEFAULT partition for range partitioning. The way
partitioning is designed, a range-partitioned table can have holes between the
bounds of its partitions. I think a DEFAULT partition is a good idea for range
partitioning in general when the range has holes. When the range is serial, a
DEFAULT partition of course doesn't make much sense.
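
As a rough illustration of why holes make a DEFAULT partition useful, here is a
small Python sketch of tuple routing. It is not the actual routing code: the
partition names, the bounds and the route() function are made up for the
example, and the bounds are assumed to be half-open (lower inclusive, upper
exclusive).

```python
import bisect

# Hypothetical range partitions as (lo, hi, name), sorted by lo.
# There is a hole between 10 and 20.
PARTITIONS = [(0, 10, "p_0_10"), (20, 30, "p_20_30")]
LOWER_BOUNDS = [lo for lo, _, _ in PARTITIONS]


def route(key, has_default=True):
    """Return the name of the partition that should receive key."""
    # Find the last partition whose lower bound is <= key.
    i = bisect.bisect_right(LOWER_BOUNDS, key) - 1
    if i >= 0:
        lo, hi, name = PARTITIONS[i]
        if lo <= key < hi:
            return name
    # The key fell into a hole or outside all ranges.
    if has_default:
        return "p_default"
    raise ValueError(f"no partition found for key {key}")


print(route(5))
print(route(15))
```

A key such as 15 falls into the hole and goes to the default. Without a
default (has_default=False) the same insert fails, which is the case a DEFAULT
partition is meant to cover.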

Regards,

Rushabh Lathia
www.EnterpriseDB.com
