Re: Adding support for Default partition in partitioning

From: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Keith Fiske <keith(at)omniti(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding support for Default partition in partitioning
Date: 2017-04-05 05:41:50
Message-ID: CAGPqQf3Zb16C13MqHW4C3OjjTBtyWh1UPenZmYCj0TrxaeSTJQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 5, 2017 at 10:59 AM, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp
> wrote:

> On 2017/04/05 6:22, Keith Fiske wrote:
> > On Tue, Apr 4, 2017 at 9:30 AM, Rahila Syed wrote:
> >> Please find attached an updated patch.
> >> Following has been accomplished in this update:
> >>
> >> 1. A new partition can be added after default partition if there are no
> >> conflicting rows in default partition.
> >> 2. Solved the crash reported earlier.
> >
> > Installed and compiled against commit
> > 60a0b2ec8943451186dfa22907f88334d97cb2e0 (Date: Tue Apr 4 12:36:15 2017
> > -0400) without any issue
> >
> > However, running your original example, I'm getting this error
> >
> > keith(at)keith=# CREATE TABLE list_partitioned (
> > keith(# a int
> > keith(# ) PARTITION BY LIST (a);
> > CREATE TABLE
> > Time: 4.933 ms
> > keith(at)keith=# CREATE TABLE part_default PARTITION OF list_partitioned
> FOR
> > VALUES IN (DEFAULT);
> > CREATE TABLE
> > Time: 3.492 ms
> > keith(at)keith=# CREATE TABLE part_1 PARTITION OF list_partitioned FOR
> VALUES
> > IN (4,5);
> > ERROR: unrecognized node type: 216
>
> It seems like the new ExecPrepareCheck should be used in the place of
> ExecPrepareExpr in the code added in check_new_partition_bound().
>
> > Also, I'm still of the opinion that denying future partitions of values
> in
> > the default would be a cause of confusion. In order to move the data out
> of
> > the default and into a proper child it would require first removing that
> > data from the default, storing it elsewhere, creating the child, then
> > moving it back. If it's only a small amount of data it may not be a big
> > deal, but if it's a large amount, that could cause quite a lot of
> > contention if done in a single transaction. Either that or the user would
> > have to deal with data existing in the table, disappearing, then
> > reappearing again.
> >
> > This also makes it harder to migrate an existing table easily.
> Essentially
> > no child tables for a large, existing data set could ever be created
> > without going through one of the two situations above.
>
> I thought of the following possible way to allow future partitions when
> the default partition exists which might contain rows that belong to the
> newly created partition (unfortunately, nothing that we could implement at
> this point for v10):
>
> Suppose you want to add a new partition which will accept key=3 rows.
>
> 1. If no default partition exists, we're done; no key=3 rows would have
> been accepted by any of the table's existing partitions, so no need to
> move any rows
>
> 2. Default partition exists which might contain key=3 rows, which we'll
> need to move. If we do this in the same transaction, as you say, it
> might result in unnecessary unavailability of table's data. So, it's
> better to delegate that responsibility to a background process. The
> current transaction will only add the new key=3 partition, so any key=3
> rows will be routed to the new partition from this point on. But we
> haven't updated the default partition's constraint yet to say that it
> no longer contains key=3 rows (constraint that the planner consumes),
> so it will continue to be scanned for queries that request key=3 rows
> (there should be some metadata to indicate that the default partition's
> constraint is invalid), along with the new partition.
>
> 3. A background process receives a "work item" requesting it to move all
> key=3 rows from the default partition heap to the new partition's heap.
> The movement occurs without causing the table to become unavailable;
> once all rows have been moved, we momentarily lock the table to update
> the default partition's constraint to mark it valid, so that it will
> no longer be accessed by queries that want to see key=3 rows.
>
> Regarding 2, there is a question of whether it should not be possible for
> the row movement to occur in the same transaction. Somebody may want that
> to happen because they chose to run the command during a maintenance
> window, when the table's becoming unavailable is not an issue. In that
> case, we need to think of the interface more carefully.
>
> Regarding 3, I think the new autovacuum work items infrastructure added by
> the following commit looks very promising:
>
> * BRIN auto-summarization *
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=
> 7526e10224f0792201e99631567bbe44492bbde4
>
> > However, thinking through this, I'm not sure I can see a solution without
> > the global index support. If this restriction is removed, there's still
> an
> > issue of data duplication after the necessary child table is added. So
> > guess it's a matter of deciding which user experience is better for the
> > moment?
>
> I'm not sure about the fate of this particular patch for v10, but until we
> implement a solution to move rows and design appropriate interface for the
> same, we could error out if moving rows is required at all, like the patch
> does.
>
>
+1

I agree about the future plan about the row movement, how that is I am
not quite sure at this stage.

I was thinking that CREATE new partition is the DDL command, so even
if row-movement works with holding the lock on the new partition table,
that should be fine. I am not quire sure, why row movement should be
happen in the back-ground process.

Of-course, one idea is that if someone don't want feature of row-movement,
then we might add that under some GUC or may be as another option into
the CREATE partition table.

Could you briefly elaborate why you think the lack global index support
> would be a problem in this regard?
>
> I agree that some design is required here to implement a solution
> redistribution of rows; not only in the context of supporting the notion
> of default partitions, but also to allow the feature to split/merge range
> (only?) partitions. I'd like to work on the latter for v11 for which I
> would like to post a proposal soon; if anyone would like to collaborate
> (ideas, code, review), I look forward to. (sorry for hijacking this
> thread.)
>
> Thanks,
> Amit
>
>
>

--
Rushabh Lathia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2017-04-05 05:48:57 Re: Statement timeout behavior in extended queries
Previous Message Emre Hasegeli 2017-04-05 05:34:29 Re: BRIN cost estimate