Re: Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
Cc: <pgsql-performance(at)postgresql(dot)org>, "Mark Cotner" <mcotner(at)yahoo(dot)com>, "Christopher Browne" <cbbrowne(at)acm(dot)org>
Subject: Re: Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables
Date: 2004-09-14 23:32:11
Message-ID: NOEFLCFHBPDAFHEIPGBOEEHPCEAA.simon@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> Jim C. Nasby
> On Mon, Sep 13, 2004 at 11:07:35PM +0100, Simon Riggs wrote:
> > PostgreSQL's functionality is in many ways similar to Oracle
> Partitioning.
> >
> > Loading up your data in many similar tables, then creating a view like:
> >
> > CREATE VIEW BIGTABLE (idate, col1, col2, col3...) AS
> > SELECT 200409130800, col1, col2, col3... FROM table200409130800
> > UNION ALL
> > SELECT 200409131000, col1, col2, col3... FROM table200409131000
> > UNION ALL
> > SELECT 200409131200, col1, col2, col3... FROM table200409131200
> > ...etc...
> >
> > will allow the PostgreSQL optimizer to eliminate partitions
> from the query
> > when you run queries which include a predicate on the
> partitioning_col, e.g.
> >
> > select count(*) from bigtable where idate >= 200409131000
> >
> > will scan the last two partitions only...
> >
> > There are a few other ways of creating the view that return the
> same answer,
> > but only using constants in that way will allow the partitions to be
> > eliminated from the query, and so run for much longer.
>
> Is there by any chance a set of functions to manage adding and removing
> partitions? Certainly this can be done by hand, but having a set of
> tools would make life much easier. I just looked but didn't see anything
> on GBorg.

Well, its fairly straightforward to auto-generate the UNION ALL view, and
important as well, since it needs to be re-specified each time a new
partition is loaded or an old one is cleared down. The main point is that
the constant placed in front of each table must in some way relate to the
data, to make it useful in querying. If it is just a unique constant, chosen
at random, it won't do much for partition elimination. So, that tends to
make the creation of the UNION ALL view an application/data specific thing.

The "partitions" are just tables, so no need for other management tools.
Oracle treats the partitions as sub-tables, so you need a range of commands
to add, swap etc the partitions of the main table.

I guess a set of tools that emulates that functionality would be generically
a good thing, if you can see a way to do that.

Oracle partitions were restricted in only allowing a single load statement
into a single partition at any time, whereas multiple COPY statements can
access a single partition table on PostgreSQL.

BTW, multi-dimensional partitioning is also possible using the same general
scheme....

Best Regards, Simon Riggs

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mischa Sandberg 2004-09-14 23:32:48 Re: Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables
Previous Message Mischa Sandberg 2004-09-14 22:58:20 Re: Data Warehouse Reevaluation - MySQL vs Postgres --