Re: Dynamic Partitioning using Segment Visibility Maps

From: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dynamic Partitioning using Segment Visibility Maps
Date: 2008-01-07 19:08:00
Message-ID: 20080107190800.GF18581@crankycanuck.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 07, 2008 at 07:16:35PM +0100, Markus Schiltknecht wrote:
>
> Does anything speak against letting the DBA handle partitions as relations?

Yes: it doesn't solve the problem I have, which is that I don't want to have
to manage a whole bunch of tables. I want one table, and I want to be able
to say, "That section is closed".

> Sure, there's value in Simon's proposal. But it has pretty strict
> requirements. IMO, it's pretty hard to say, if it would have helped at
> all for your cases. Any of them still available to check?

No, but one of your worries doesn't bother me:

> Remember the requirements: no single tuple in the segment may be
> significantly out of the average bounds. Otherwise, the min/max gets
> pretty useless and the segment can never be excluded.

The segment can never be excluded in a search on that key, yes. But
consider the likely cases we're looking at:

WHERE some_date >= '1999-01-01' AND some_date < '2001-01-01';
WHERE sequence_field BETWEEN 3000 AND 300000;

&c. These are the two obvious cases: you're searching for data in a given
date range or for primary (sometimes artificial) identifiers in a range,
and the source data increases (almost) monotonically. You have to do this
now anyway, because there's _some_ basis on which you're partitioning your
data; but today, you do this with a lot of fooling around with views and
nasty triggers that push data into the "right" table, assuming someone
doesn't screw it up.

> need to maintain CLUSTERed ordering, aren't there better ways? For
> example, you could use binary searching on the relation directly, much
> like with indices, instead of sequentially scanning on the CLUSTERed
> relation. That would even give us some sort of "indices with visibility".

I think this is a nice idea too :)

> Well, Postgres doesn't automatically create indices, for a counter example.

Yes, and it has no data-use analyser tools that automatically suggest
indexes, either. That's the sort of thing people coming from other (err,
"Other" ;-) products complain about, in fact.

> definitely has more information available, than the computer. A DBA
> (hopefully) knows future plans and emergency strategies for the storage
> system, for example.

Perhaps my jaundice comes from too much time spent in operational trenches,
but while good DBAs have some ideas about that, large numbers of them are
harried and overwhelmed just by the piles of work they already have.
Nevertheless, while what you say is true, I'm not sure what it has to do
with the present case. I don't think the current proposal is to address
partitioning across table spaces. It's to do with the way certain segments
of a table are interpreted by the system. It's undoubtedly true that this
strategy is of questionable utility for many kinds of use of PostgreSQL.
But it seems to offer very significant advantages for one use-pattern that
is very common.

That said, I am not trying to argue it should be adopted without poking at
its weaknesses. I just think it unfair to ask the proposal to address
problems it's not really aimed at.

A

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message ilanco 2008-01-07 19:16:29 ERROR: translation failed from server encoding to wchar_t
Previous Message Joshua D. Drake 2008-01-07 18:48:58 Re: Bug: Unreferenced temp tables disables vacuum to update xid