Re: Dynamic Partitioning using Segment Visibility Maps

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dynamic Partitioning using Segment Visibility Maps
Date: 2008-01-07 18:16:35
Message-ID: 47826C83.3000306@bluegap.ch
Lists: pgsql-hackers

Hi,

Andrew Sullivan wrote:
> On Sat, Jan 05, 2008 at 08:02:41PM +0100, Markus Schiltknecht wrote:
>> Well, management of relations is easy enough, known to the DBA and most
>> importantly: it already exists. Having to set up something which is
>> *not* tied to a relation complicates things just because it's an
>> additional concept.
>
> But we're already dealing with some complicated concepts.

Possibly, yes, but that's hardly a reason to introduce even more
complicated concepts...

Does anything speak against letting the DBA handle partitions as relations?

> There isn't anything that will prevent current-style partitioning strategies
> from continuing to work in the face of Simon's proposal.

Agreed. Nor will Simon's proposal completely replace them.

> Without even trying, I can think of a dozen examples in the past 5 years
> where I could have used that sort of functionality. Because the cost of
> data retrieval was high enough, we had to decide that the question wasn't
> worth answering. Some of those answers might have been quite valuable
> indeed to the Internet community, to be frank; but because I had to pay the
> cost without getting much direct benefit, it just wasn't worth the effort.

Sure, there's value in Simon's proposal. But it has pretty strict
requirements. IMO, it's pretty hard to say whether it would have helped
at all in your cases. Are any of them still available to check?

Remember the requirements: no single tuple in the segment may lie
significantly outside the segment's typical range of values. Otherwise,
the min/max range becomes too wide to be useful and the segment can
never be excluded.
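
To illustrate the requirement, here is a minimal Python sketch of min/max
segment exclusion. It is not taken from Simon's proposal: the function
names, the list-of-lists "segments" and the sample values are all
assumptions made up for the illustration.

```python
# Hypothetical model: a relation is a list of segments, each segment a list
# of key values. Only the per-segment (min, max) pair is kept as metadata.

def segment_bounds(segments):
    """Return the (min, max) pair for every segment."""
    return [(min(seg), max(seg)) for seg in segments]

def segments_to_scan(bounds, lo, hi):
    """Return indexes of segments whose [min, max] overlaps [lo, hi].

    Every other segment can be excluded without reading it.
    """
    return [i for i, (smin, smax) in enumerate(bounds)
            if smax >= lo and smin <= hi]

# Well-ordered data: each segment covers a narrow range.
clean = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Same data, but one stray tuple (9) landed in the first segment.
dirty = [[1, 2, 9], [4, 5, 6], [7, 8, 9]]

clean_scan = segments_to_scan(segment_bounds(clean), 4, 5)
dirty_scan = segments_to_scan(segment_bounds(dirty), 4, 5)
print(clean_scan, dirty_scan)
```

With the clean layout only the middle segment has to be scanned for keys
4..5. The single stray tuple widens the first segment's bounds to (1, 9),
so that segment can no longer be excluded for this query.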

As I said before, combining it with CLUSTERing might help, yes. But if
you need to maintain CLUSTERed ordering, aren't there better ways? For
example, you could use binary searching on the relation directly, much
like with indices, instead of sequentially scanning the CLUSTERed
relation. That would even give us some sort of "indices with visibility".
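
A rough Python sketch of that idea, assuming the relation is kept fully
sorted by key and every tuple carries its own visibility flag. The
(key, visible) tuples and the function names are hypothetical and do not
correspond to any existing Postgres structure.

```python
# Hypothetical model: rows is a list of (key, visible) tuples, physically
# ordered by key, as a strictly maintained CLUSTERed relation would be.

def lower_bound(rows, key):
    """Binary search: index of the first row whose key is >= key."""
    lo, hi = 0, len(rows)
    while lo < hi:
        mid = (lo + hi) // 2
        if rows[mid][0] < key:
            lo = mid + 1
        else:
            hi = mid
    return lo

def range_lookup(rows, first, last):
    """Return visible keys in [first, last] without scanning from the start."""
    i = lower_bound(rows, first)
    result = []
    while i < len(rows) and rows[i][0] <= last:
        key, visible = rows[i]
        if visible:  # visibility is checked on the tuple itself
            result.append(key)
        i += 1
    return result

rows = [(10, True), (20, True), (30, False), (40, True), (50, True)]
found = range_lookup(rows, 20, 40)
print(found)
```

The lookup needs O(log n) probes to find the start of the range and then
reads only the matching tuples. Since the visibility information sits
next to the key, no second lookup is needed to decide whether a match is
visible.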

>> Agreed. I'd say that's why the DBA needs to be able to define the split
>> point between partitions: only he knows the meaning of the data.
>
> I think this is only partly true. A casual glance at the -general list will
> reveal all manner of false assumptions on the parts of administrators about
> how their data is structured. My experience is that, given that the
> computer has way more information about the data than I do, it is more
> likely to make the right choice. To the extent it doesn't do so, that's a
> problem in the planning (or whatever) algorithms, and it ought to be fixed
> there.

Well, as a counterexample, Postgres doesn't automatically create indices.

With regard to partitioning over multiple table spaces, I think the DBA
definitely has more information available than the computer. A DBA
(hopefully) knows future plans and emergency strategies for the storage
system, for example. Lacking such information, the database system will
have a pretty hard time making a good decision on how to partition
between table spaces, IMO.

Regards

Markus
