Re: What needs to be done for real Partitioning?

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PFC <lists(at)boutiquenumerique(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: What needs to be done for real Partitioning?
Date: 2005-03-20 23:14:59
Message-ID: 87fyypgb9o.fsf@stark.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Josh Berkus <josh(at)agliodbs(dot)com> writes:

> Well, I would think that specifying an expression that defines a new partition
> at each change in value (like EXTRACT(day FROM timestamp) on a time-based
> partitioning) would cover 90% of implemenations and be a lot simpler to
> administer. The Oracle approach has the advantage of allowing "custom
> paritioning" at the expense of greater complexity.

Hm. This is where I might be less helpful. Once you're submersed in one way of
doing things it can be hard to think outside the box like this.

But I fear this scheme might be harder to actually take advantage of. If I do
a query like

WHERE timestamp BETWEEN '2005-01-01 11:00' AND '2005-01-01 12:00'

How do you determine which partitions that range will cover?

Also, it seems like it would be inconvenient to try to construct expressions
to handle things like "start a new partition ever 1 million values".

And worse, how would you handle changing schemes with this? Like, say we want
to switch from starting one partition per month to starting one partition per
week?

I think some actual use cases might be helpful for you. I can contribute an
interesting one, though I have to be intentionally vague even though I don't
work on that system any more.

We had a table with a layout like:

txnid serial,
groupid integer,
data...

Each day a cron job created 6 new groups (actually later that was changed to
some other number). It then added a new partition to handle the range of the
new day's groups. Later another cron job exchanged out the partition from a
week earlier and exported that table, transfered it to another machine and
loaded it there.

txnid was a unique identifier but we couldn't have a unique constraint because
that would have required a global index. That didn't cause any problems since
it was a sequence generated column anyways.

We did have a unique index on <groupid,txnid> which is a local index because
groupid was the partition key. In reality nothing in our system ever really
needed a txn without knowing which group it came from anyways, so it was easy
to change our queries to take advantage of this.

We had a lot of jobs, some *extremely* performance sensitive that depended on
being able to scan the entire list of txns for a given day or a given set of
groupids. The partitions meant it could do a full table scan which made these
extremely fast.

This was with Oracle 8i. All partition keys in 8i were ranges. In 9 Oracle
added the ability to make partition reference specific id values. Sort of like
how you're describing having a key expression. We might have considered using
that scheme with groupid but then it would have meant adding a bunch of new
partitions each day and having some queries that would involve scanning
multiple partitions.

--
Greg

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Greg Stark 2005-03-21 03:33:12 Re: What needs to be done for real Partitioning?
Previous Message Tom Lane 2005-03-20 23:05:25 Re: What needs to be done for real Partitioning?