Re: RangeType internal use

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: RangeType internal use
Date: 2015-02-10 00:54:32
Message-ID: 54D956C8.7010800@lab.ntt.co.jp
Lists: pgsql-hackers

On 10-02-2015 AM 02:37, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, Feb 9, 2015 at 10:36 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> It's going to be complicated and probably buggy, and I think it is heading
>>> in the wrong direction altogether. If you want to partition in some
>>> arbitrary complicated fashion that the system can't reason about very
>>> effectively, we *already have that*. IMO the entire point of building
>>> a new partitioning infrastructure is to build something simple, reliable,
>>> and a whole lot faster than what you can get from inheritance
>>> relationships. And "faster" is going to come mainly from making the
>>> partitioning rules as simple as possible, not as complex as possible.
>
>> Yeah, but people expect to be able to partition on ranges that are not
>> all of equal width. I think any proposal that we shouldn't support
>> that is the kiss of death for a feature like this - it will be so
>> restricted as to eliminate 75% of the use cases.
>
> Well, that's debatable IMO (especially your claim that variable-size
> partitions would be needed by a majority of users). But in any case,
> partitioning behavior that is emergent from a bunch of independent pieces
> of information scattered among N tables seems absolutely untenable from
> where I sit. Whatever we support, the behavior needs to be described by
> *one* chunk of information --- a sorted list of bin bounding values,
> perhaps.
>

I'm a bit confused here. I had the impression that the partitioning formula
you suggested would consist of two pieces of information: an origin
point and a bin width. Routing a tuple would then use exactly these two
values to compute a bin number, and hence a partition, in O(1) time,
assuming all partitions are exactly bin-width wide.
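
As an illustration of the two-value scheme described above, here is a minimal Python sketch. The function and parameter names (route_fixed_width, origin, bin_width) are hypothetical and not taken from any patch or from PostgreSQL itself; it also assumes integer keys and allows negative bin numbers for keys below the origin.

```python
def route_fixed_width(key, origin, bin_width):
    """Return the bin number for `key` in O(1) time.

    Assumption: every partition is exactly `bin_width` wide and bin 0
    starts at `origin`. Keys below `origin` map to negative bin numbers.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Floor division gives the index of the bin that contains the key.
    return (key - origin) // bin_width
```

With origin 0 and bin width 10, a key of 25 would land in bin 2. No per-partition metadata is consulted, which is where the O(1) cost comes from.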

Here you mention a sorted list of bin bounding values, which we could very
well build in the relation descriptor of a partitioned table from
whatever information we store in the catalog. That is, we can always
have *one* chunk of partitioning information as the *internal*
representation, irrespective of how generalized we make the on-disk
representation. I'd hope we can get O(log N), if not O(1), from that. In
fact, that's what I had in mind.
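
The O(log N) lookup over a sorted list of bounds could look roughly like the Python sketch below. It is only an illustration: route_sorted_bounds and the half-open interval convention [bounds[i], bounds[i+1]) are assumptions made here, not something specified in this thread.

```python
from bisect import bisect_right


def route_sorted_bounds(key, bounds):
    """Return the index of the partition containing `key`, or None.

    Assumption: `bounds` is a sorted list b0 < b1 < ... < bn, and
    partition i covers the half-open range [b_i, b_{i+1}). Partitions
    may have different widths. The lookup is a binary search, O(log N).
    """
    # Number of bounds <= key, minus one, is the candidate partition.
    idx = bisect_right(bounds, key) - 1
    if idx < 0 or idx >= len(bounds) - 1:
        return None  # key falls outside every partition
    return idx
```

For bounds [0, 10, 25, 100], a key of 10 would route to partition 1 and a key of 100 to no partition. The list plays the role of the single in-memory chunk of partitioning information, however the bounds happen to be stored on disk.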

Did I read it wrong?

Thanks,
Amit
