Re: Multi column range partition table

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, amul sul <sulamul(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multi column range partition table
Date: 2017-07-05 14:18:49
Message-ID: CAEZATCU00n6=9dFiW57Hi1wX815Vrek-v9njC29XgVKfMw7q7g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5 July 2017 at 10:43, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> So the more I think about this, the more I think that a cleaner design
>> would be as follows:
>>
>> 1). Don't allow UNBOUNDED, except in the first column, where it can
>> keep it's current meaning.
>>
>> 2). Allow the partition bounds to have fewer columns than the
>> partition definition, and have that mean the same as it would have
>> meant if you were partitioning by that many columns. So, for
>> example, if you were partitioning by (col1,col2), you'd be allowed
>> to define a partition like so:
>>
>> FROM (x) TO (y)
>>
>> and it would mean
>>
>> x <= col1 < y
>>
>> Or you'd be able to define a partition like
>>
>> FROM (x1,x2) TO (y)
>>
>> which would mean
>>
>> (col1 > x1) OR (col1 = x1 AND col2 >= x2) AND col1 < y
>>
>> 3). Don't allow any value after UNBOUNDED (i.e., only specify
>> UNBOUNDED once in a partition bound).
>
> I assume we don't need the ability of specifying ABOVE/BELOW in this design.
>

Yes that's right.

> In retrospect, that sounds like something that was implemented in the
> earlier versions of the patch, whereby there was no ability to specify
> UNBOUNDED on a per-column basis. So the syntax was:
>
> FROM { (x [, ...]) | UNBOUNDED } TO { (y [, ...]) | UNBOUNDED }
>

Yes, that's where I ended up too.

> But, it was pointed out to me [1] that that doesn't address the use case,
> for example, where part1 goes up to (10, 10) and part2 goes from (10, 10)
> up to (10, unbounded).
>
> The new design will limit the usage of unbounded range partitions at the
> tail ends.
>

True, but I don't think that's really a problem. When the first column
is a discrete type, an upper bound of (10, unbounded) can be rewritten
as (11) in the new design. When it's a continuous type, e.g. floating
point, it can no longer be represented, because (10.0, unbounded)
really means (col1 <= 10.0). But we've already decided not to support
anything other than inclusive lower bounds and exclusive upper bounds,
so allowing this upper bound goes against that design choice.

>> Of course, it's pretty late in the day to be proposing this kind of
>> redesign, but I fear that if we don't tackle it now, it will just be
>> harder to deal with in the future.
>>
>> Actually, a quick, simple hacky implementation might be to just fill
>> in any omitted values in a partition bound with negative infinity
>> internally, and when printing a bound, omit any values after an
>> infinite value. But really, I think we'd want to tidy up the
>> implementation, and I think a number of things would actually get much
>> simpler. For example, get_qual_for_range() could simply stop when it
>> reached the end of the list of values for the bound, and it wouldn't
>> need to worry about an unbounded value following a bounded one.
>>
>> Thoughts?
>
> I cooked up a patch for the "hacky" implementation for now, just as you
> described in the above paragraph. Will you be willing to give it a look?
> I will also think about the non-hacky way of implementing this.
>

OK, I'll take a look.

Meanwhile, I already had a go at the "non-hacky" implementation (WIP
patch attached). The more I worked on it, the simpler things got,
which I think is a good sign.

Part-way through, I realised that the PartitionRangeDatum Node type is
no longer needed, because each bound value is now necessarily finite,
so the lowerdatums and upperdatums lists in a PartitionBoundSpec can
now be made into lists of Const nodes, making them match the
listdatums field used for LIST partitioning, and then a whole lot of
related code gets simplified.

It needed a little bit more code in partition.c to track individual
bound sizes, but there were a number of other places that could be
simplified, so overall this represents a reduction in the code size
and complexity.

It's not complete (e.g., no doc updates yet), but it passes all the
tests, and so far seems to work as I would expect.

Regards,
Dean

Attachment Content-Type Size
refactor-unbounded-range-partitions.patch text/x-patch 67.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-07-05 14:57:24 Re: outfuncs.c utility statement support
Previous Message Mark Dilger 2017-07-05 13:53:35 Re: Request more documentation for incompatibility of parallelism and plpgsql exec_run_select