Re: On partitioning

From: Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: On partitioning
Date: 2014-08-29 18:32:05
Message-ID: 5400C725.3000600@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/29/2014 05:56 PM, Alvaro Herrera wrote:
> Prompted by a comment in the UPDATE/LIMIT thread, I saw Marko Tiikkaja
> reference Tom's post
> http://www.postgresql.org/message-id/1598.1399826841@sss.pgh.pa.us
> which mentions the possibility of a different partitioning
> implementation than what we have so far. As it turns out, I've been
> thinking about partitioning recently, so I thought I would share what
> I'm thinking so that others can poke holes. My intention is to try to
> implement this as soon as possible.
>
>
> Declarative partitioning
> ========================
> ...
> Still To Be Designed
> --------------------
> * Dependency issues
> * Are indexes/constraints inherited from the parent rel?
I'd say mostly yes.

There could some extra "constraint exclusion type" magic for
conditional indexes, but the rest probably should come from "main table"

And there should be some kind of cross-partition indexes. At
"partitioning" capability, this can probably wait for version 2.

> * Multiple keys?
Why not. But probably just for hash partitioning.
> Subpartitioning?
Probably not. If you need speed for huge numbers of partitions, use
Gregs idea of keeping the partitions in a tree (or just having a
partition index).
> Hash partitioning?
At some point definitely.

Also one thing you left unmentioned is dropping (and perhaps also
truncating)
a partition. We still may want to do historic data management the same way
we do it now, by just getting rid of the whole partition or its data.

At some point we may also want to do redistributing data between
partitions,
maybe for case where we end up with 90% of the data in on partition due to
bad partitioning key or partitioning function choice. This is again
something
that is hard now and can therefore be left to a later version.

> Open Questions
> --------------
>
> * What's the syntax to refer to specific partitions within a partitioned
> table?
> We could do "TABLE <xyz> PARTITION <n>", but for example if in
> the future we add hash partitioning, we might need some non-integer
> addressing (OTOH assigning sequential numbers to hash partitions doesn't
> seem so bad). Discussing with users of other DBMSs partitioning feature,
> one useful phrase is "TABLE <xyz> PARTITION FOR <value>".
Or more generally

TABLE <xyz> PARTITION FOR/WHERE col1=val1, col2=val2, ...;

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2014-08-29 18:52:42 Re: [TODO] Track number of files ready to be archived in pg_stat_archiver
Previous Message Tomas Vondra 2014-08-29 18:23:27 Re: 9.5: Better memory accounting, towards memory-bounded HashAgg