Re: Dynamic Partitioning using Segment Visibility Maps

From: Gavin Sherry <swm(at)alcove(dot)com(dot)au>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dynamic Partitioning using Segment Visibility Maps
Date: 2008-01-09 06:28:59
Message-ID: 20080109062859.GI6934@europa.idg.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 02, 2008 at 05:56:14PM +0000, Simon Riggs wrote:
> This technique would be useful for any table with historical data keyed
> by date or timestamp. It would also be useful for data where a
> time-of-insert component is implicit, such as many major entity tables
> where the object ids are assigned by a sequence. e.g. an Orders table
> with an OrderId as PK. Once all orders placed in a period have been
> shipped/resolved/closed then the segments will be marked read-only.
>
> Its not really going to change the code path much for small tables, yet
> seems likely to work reasonably well for large tables of all shapes and
> sizes. If a segment is being updated, we leave it alone, and maybe never
> actually set the visibility map at all. So overall, this idea seems to
> cover the main use case well, yet with only minimal impact on the
> existing use cases we support.
>
>
> As before, I will maintain this proposal on the PG developer Wiki, once
> we get down to detailed design.
>
>
>
> Like it?

Simon,

A novel approach to the problem. For me, this proposal addresses some
of the other problems in postgres (visibility in the heap vs. index)
rather than the problems with partitioning itself.

It might seem otherwise, but for me partitioning is tool for managing
large volumes of data. It allows the user to encode what they know about
the nature of their data, and that's often about management. In this
way, your proposal actually makes partitioning too smart: the user wants
to tell the system how to organise the data, as opposed to the other way
around.

At Greenplum, we've been discussing this in depth. Interestingly, we
also felt that the storage layer could be much smarter with large tables
with predictable layouts and predictable data patterns. But the thing
is, people with large tables like partitioning, putting different
partitions on different storage; they like the ability to merge and
split partitions; they like truncating old partitions. In a word, they
seem to like the managability partitions give them -- as well as the
performance that comes with this.

To this end, we (well, Jeff Cohen) looked at the syntax and semantics of
partitining in leading databases (Oracle, Informix, DB2) and came up
with a highly expressive grammar which takes the best of each I think
(I'll post details on the grammar in a seperate thread). The idea is
that range (for example, a date range), list (a list of distinct values)
and hash partitioning be supported on multiple columns. Partitions can
be added, merged, truncated. Partitions can reside on different
tablespaces. The existing issues with the rewriter, COPY support and the
like go away by smartening up the backend.

To explore the grammar and semantics Jeff and I (to a small extent) have
implemented the parsing of such a grammar in the backend. At the moment,
this just results in the generation of a bunch of rules (i.e., it
produces the current behaviour, as if someone had written rules
themselves) and debugging output.

The fully fledged approach will see partitioning rules stored in
a new system catalog which can be interrogated by the planner or COPY to
determine which partition a given tuple belongs in or which partition(s)
a WHERE clause matches.

Yes, this is the traditional approach but I think it still has merit. We
shall continue working on this approach because it is what our customers
have asked for. We would also like to see it get into postgres too.

Thanks,

Gavin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sokolov Yura 2008-01-09 07:44:11 Re: BUG #3852: Could not create complex aggregate
Previous Message Tom Lane 2008-01-09 05:22:04 Some notes about the index-functions security vulnerability