Re: Declarative partitioning - another take

From: Francisco Olarte <folarte(at)peoplecall(dot)com>
To: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Amit Langote <amitlangote09(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Declarative partitioning - another take
Date: 2016-11-01 18:05:00
Message-ID: CA+bJJbxjXLTEnBsB6+0ogtk-v5hYMqT2kmXaBzzJukctYmOCSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Nov 1, 2016 at 6:49 PM, Corey Huinker <corey(dot)huinker(at)gmail(dot)com> wrote:
>
> On Tue, Nov 1, 2016 at 12:57 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> For strings and numeric types that are not integers, there is in
>> theory a loss of power. If you want a partition that allows very
>> value starting with 'a' plus the string 'b' but not anything after
>> that, you are out of luck. START ('a') END ('b') INCLUSIVE would have
>> done exactly what you want, but now you need to store the first string
>> that you *don't* want to include in that partition, and what's that?
>> Dunno. Or similarly if you want to store everything from 1.0 up to
>> and including 2.0 but nothing higher, you can't, really.
> Exactly. This is especially true for date ranges. There's a lot of cognitive
> dissonance in defining the "2014" partition as < '2015-01-01', as was the
> case in Oracle waterfall-style partitioning. That was my reasoning for
> pushing for range-ish syntax as well as form.

OTOH I've seen a lot of people bitten by [2014-01-01,2014-12-31] on
TIMESTAMP intervals.

Everybody remembers december has 31 days, but when we have to do
MONTHLY partitions if you use closed intervals someone always miskeys
the number of days, or forgets wheter a particular year is leap or
not, and when doing it automatically I always have to code it as start
+ 1 month - 1day. In my experience having the non-significant part of
the dates ( days in monthly case, months too in yearly cases ) both 1
and equal in start and end makes it easier to check and identify, and
less error prone.

>> But who wants that? People who are doing prefix-based partitioning of
>> their text keys are going to want all of the 'a' things together, and
>> all of the 'b' things in another category. Same for ranges of
>> floating-point numbers, which are also probably an unlikely candidate
>> for a partitioning key anyway.
> /me raises hand. We have tables with a taxonomy in them where the even data
> splits don't fall on single letter boundaries, and often the single string
> values have more rows than entire letters. In those situations, being able
> to express ['XYZ','XYZ'] is important. ['XYZ,'XZ') would let 'XYZ1' bleed
> into the partition and ['XYZ','XYZ1') lets in other values, and so I go
> chasing down the non-discrete set rabbit hole.

You just do the classical ( I've had to do it ) closed end || minimum
char ( "XYZ","XYZ\0" in this case ). It is not that difficult as
strings have a global order, the next string to any one is always that
plus the \0, or whatever your minimum is.

The problem is with anything similar to a real number, but then there
I've always opted for half-open interval, as they can cover the line
without overlapping, unlike closed ones.

Anyway, as long as anyone makes sure HALF-OPEN intervals are allowed,
I'm fine ( I do not remember the name, but once had to work with a
system that only allowed closed or open and it was a real PITA.

Francisco Olarte.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-11-01 18:09:26 Re: Declarative partitioning - another take
Previous Message Robert Haas 2016-11-01 18:01:55 Re: Declarative partitioning - another take