Re: On partitioning

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, 'Robert Haas' <robertmhaas(at)gmail(dot)com>
Cc: 'Andres Freund' <andres(at)2ndquadrant(dot)com>, 'Alvaro Herrera' <alvherre(at)2ndquadrant(dot)com>, 'Bruce Momjian' <bruce(at)momjian(dot)us>, 'Pg Hackers' <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On partitioning
Date: 2014-12-03 08:16:19
Message-ID: 547EC6D3.1030309@BlueTreble.com
Lists: pgsql-hackers

On 12/2/14, 9:43 PM, Amit Langote wrote:
>> What is an overflow partition and why do we want that?
>>
> That would be a default partition. That is, where the tuples that don't belong elsewhere (other defined partitions) go. VALUES clause of the definition for such a partition would look like:
>
> (a range partition) ... VALUES LESS THAN MAXVALUE
> (a list partition) ... VALUES DEFAULT
>
> There has been discussion about whether there shouldn't be such a place for tuples to go. That is, it should generate an error if a tuple can't go anywhere (or support auto-creating a new one like in interval partitioning?)

If we are going to do this, should the data just go into the parent? That's what would happen today.

FWIW, I think an overflow would be useful, but there should be a way to (dis|en)able it.
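
To make the (dis|en)able idea concrete, here is a minimal sketch in Python. It is not PostgreSQL code, and `Partition`, `route_tuple`, and `NoPartitionError` are hypothetical names invented for illustration. It assumes integer keys and range partitions defined by an exclusive upper bound, as in VALUES LESS THAN. With an overflow partition named, unmatched keys go there; without one, routing raises an error.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


class NoPartitionError(Exception):
    """Raised when no partition accepts the key and overflow is disabled."""


@dataclass(frozen=True)
class Partition:
    name: str
    upper: int  # exclusive upper bound, as in VALUES LESS THAN (upper)


def route_tuple(key: int, partitions: Sequence[Partition],
                overflow: Optional[str] = None) -> str:
    """Return the name of the partition that should receive `key`.

    `partitions` must be sorted by ascending upper bound. If no partition
    matches, the key goes to `overflow` when one is given; otherwise an
    error is raised.
    """
    for part in partitions:
        if key < part.upper:
            return part.name
    if overflow is not None:
        return overflow
    raise NoPartitionError(f"no partition accepts key {key}")
```

Passing the parent table's name as `overflow` would mimic the "data just goes into the parent" behavior; passing nothing gives the error-on-miss behavior.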

>> What are you going to do if the partitioning key has two columns of
>> different data types?
>>
> Sorry, this totally eluded me. Perhaps, the 'values' needs some more thought. They are one of the most crucial elements of the scheme.
>
> I wonder if your suggestion of pg_node_tree plays well here. This then could be a list of CONSTs or some such... And I am thinking it's a concern only for range partitions, no? (that is, a multicolumn partition key)
>
> I think partkind switches the interpretation of the field as appropriate. Am I missing something? By the way, I had mentioned we could have two values fields each for range and list partition kind.

The more SQL way would be records (composite types). That would make catalog inspection a LOT easier and presumably make it easier to change the partitioning key (I'm assuming ALTER TYPE cascades to stored data). Records are stored internally as tuples; not sure if that would be faster than a List of Consts or a pg_node_tree. Nodes would theoretically allow using things other than Consts, but I suspect that would be a bad idea.

Something else to consider... our user-space support for ranges is now rangetypes, so perhaps that's what we should use for range partitioning. The upside (which is also a double-edged sword) is that you could leave holes in your partitioning map. Note that in the multi-key case we could still have a record of rangetypes.
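
A rough Python sketch of what a range-based partition map with holes could look like, again not PostgreSQL code: `Range`, `PartitionMap`, and `find_partition` are hypothetical names, keys are assumed to be integers, and bounds are assumed half-open (lower inclusive, upper exclusive). Each partition carries one range per key column, standing in for the "record of rangetypes" in the multi-key case.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Range:
    lower: int  # inclusive lower bound
    upper: int  # exclusive upper bound

    def contains(self, value: int) -> bool:
        return self.lower <= value < self.upper


# Each entry pairs a partition name with one Range per key column.
PartitionMap = Sequence[Tuple[str, Tuple[Range, ...]]]


def find_partition(key: Tuple[int, ...], pmap: PartitionMap) -> Optional[str]:
    """Return the first partition whose ranges contain every key column.

    Returns None when the key falls into a hole in the map.
    """
    for name, ranges in pmap:
        if len(ranges) == len(key) and all(
            r.contains(v) for r, v in zip(ranges, key)
        ):
            return name
    return None
```

A key that lands between two defined ranges simply matches nothing, which is the hole case; whether that should be an error or go to an overflow partition is a separate decision.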
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
