Re: On partitioning

From: "Amit Langote" <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: "'Amit Kapila'" <amit(dot)kapila16(at)gmail(dot)com>
Cc: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>, "'Andres Freund'" <andres(at)2ndquadrant(dot)com>, "'Alvaro Herrera'" <alvherre(at)2ndquadrant(dot)com>, "'Bruce Momjian'" <bruce(at)momjian(dot)us>, "'Pg Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On partitioning
Date: 2014-12-09 07:19:11
Message-ID: 004a01d01380$6e3038a0$4a90a9e0$@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Tue, Dec 9, 2014 at 12:59 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Tue, Dec 9, 2014 at 8:08 AM, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
> wrote:
>> > From: Robert Haas [mailto:robertmhaas(at)gmail(dot)com]
>> > I don't understand. If you want to range partition on columns (a, b),
>> > you say that, say, tuples with (a, b) values less than (100, 200) go
>> > here and the rest go elsewhere. For list partitioning, you say that,
>> > say, tuples with (a, b) values of EXACTLY (100, 200) go here and the
>> > rest go elsewhere. I'm not sure how useful that is but it's not
>> > illogical.
>> >
>>
>> In case of list partitioning, 100 and 200 would respectively be one of the
>> values in lists of allowed values for a and b. I thought his concern is
>> whether this "list of values for each column in partkey" is as convenient to
>> store and manipulate as range partvalues.
>>
>
> Yeah and also how would user specify the values, as an example
> assume that table is partitioned on monthly_salary, so partition
> definition would look:
>
> PARTITION BY LIST(monthly_salary)
> (
> PARTITION salary_less_than_thousand VALUES(300, 900),
> PARTITION salary_less_than_two_thousand VALUES (500,1000,1500),
> ...
> )
>
> Now if user wants to define multi-column Partition based on
> monthly_salary and annual_salary, how do we want him to
> specify the values. Basically how to distinguish which values
> belong to first column key and which one's belong to second
> column key.
>

Amit, in one of my earlier replies to your question of why we may not want to implement multi-column list partitioning (lack of user interest in the feature or possible complexity of the code), I tried to explain how that may work if we do choose to go that way. Basically, something we may call PartitionColumnValue should be such that above issue can be suitably sorted out.

For example, a partition defining/bounding value would be a pg_node_tree representation of List of one of the (say) following parse nodes as appropriate -

typedef struct PartitionColumnValue
{
NodeTag type,
Oid *partitionid,
char *partcolname,
char partkind,
Node *partrangelower,
Node *partrangeupper,
List *partlistvalues
};

OR separately,

typedef struct RangePartitionColumnValue
{
NodeTag type,
Oid *partitionid,
char *partcolname,
Node *partrangelower,
Node *partrangeupper
};

&

typedef struct ListPartitionColumnValue
{
NodeTag type,
Oid *partitionid,
char *partcolname,
List *partlistvalues
};

Where a partition definition would look like

typedef struct PartitionDef
{
NodeTag type,
RangeVar partition,
RangeVar parentrel,
char *kind,
Node *values,
List *options,
char *tablespacename
};

PartitionDef.values is an (ordered) List of PartitionColumnValue each of which corresponds to one column in the partition key in that order.

We should be able to devise a way to load the pg_node_tree representation of PartitionDef.values (on-disk pg_partition_def.partvalues) into relcache using a "suitable data structure" so that it becomes readily usable in variety of contexts that we are interested in using this information.

Regards,
Amit

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Guillaume Lelarge 2014-12-09 07:28:41 Re: Misunderstanding on the FSM README file
Previous Message Amit Kapila 2014-12-09 05:46:55 Re: Parallel Seq Scan