Re: [PERFORM] CLUSTER command

From: "Charles H(dot) Woloszynski" <chw(at)clearmetrix(dot)com>
To: Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl>
Cc: pgsql-performance(at)postgresql(dot)org, pgsql-general(at)postgresql(dot)org
Subject: Re: [PERFORM] CLUSTER command
Date: 2002-12-13 01:06:35
Message-ID: 3DF9329B.1020908@clearmetrix.com
Lists: pgsql-general, pgsql-interfaces, pgsql-performance

I think Oracle does something like this with its clustering.  You set a
%fill, and Oracle uses it to decide how full to make a segment on inserts
and when to add a new one.  There is also some control over the grouping
of data within a page.  I don't have an Oracle manual at hand, but I think
the clustering works on a specific index.
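
To make the %fill idea concrete, here is a rough sketch in Python.  It is
not how Oracle or PostgreSQL implement anything; the page capacity, the
fill percentage, and the function names load() and insert() are all made
up for illustration.  Pages are loaded only partly full, later inserts go
next to their neighbours while there is slack, and the whole heap is
re-packed once a page runs out of room.

```python
import bisect

PAGE_CAPACITY = 8   # hypothetical: tuples that fit in one page
FILL_PERCENT = 75   # hypothetical %fill applied when (re)packing pages


def load(keys, capacity=PAGE_CAPACITY, fill=FILL_PERCENT):
    """Pack keys in sorted order, filling each page only to `fill` percent."""
    per_page = max(1, capacity * fill // 100)
    ordered = sorted(keys)
    return [ordered[i:i + per_page] for i in range(0, len(ordered), per_page)]


def insert(pages, key, capacity=PAGE_CAPACITY, fill=FILL_PERCENT):
    """Place `key` on the page covering its range; re-pack everything if full.

    Returns the list of pages, which is rebuilt when a re-pack happens.
    """
    if not pages:
        return [[key]]
    # Target page: the last page whose first key is <= key (or page 0).
    firsts = [page[0] for page in pages]
    idx = max(0, bisect.bisect_right(firsts, key) - 1)
    if len(pages[idx]) < capacity:
        bisect.insort(pages[idx], key)   # use the reserved free space
        return pages
    # No free space left on that page: rearrange the whole heap,
    # which restores the slack on every page.
    everything = [k for page in pages for k in page] + [key]
    return load(everything, capacity, fill)


pages = load(range(0, 24, 2))   # 12 keys, 6 per page at 75% fill
for k in (3, 5, 7):             # the third insert forces a re-pack
    pages = insert(pages, k)
```

Re-packing everything is the crudest possible policy; splitting only the
full page would be cheaper, at the cost of physical order drifting away
from index order over time.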

I agree that adding auto-clustering would be a very good thing and that 
we can learn about functionality by studying what other applications 
have already done and if/how those strategies were successful.

Charlie


Alvaro Herrera wrote:

>On Thu, Dec 12, 2002 at 04:03:47PM -0800, Stephan Szabo wrote:
>
>>On Thu, 12 Dec 2002, johnnnnnn wrote:
>>
>>>I think the code changes would be complicated. Just at a 30-second
>>>consideration, this would need to touch:
>>>- all sql (selects, inserts, updates, deletes)
>>>- vacuuming
>>>- indexing
>>>- statistics gathering
>>>- existing clustering
>>
>>I think his idea was to treat it similarly to the way that the
>>system treats tables >2G with .N files.  The only thing is that
>>I believe the code that deals with that wouldn't be particularly
>>easy to change to do it, though I've only taken a cursory look at
>>what I think is the place that does that (storage/smgr/md.c). Some sort
>>of good partitioning system would be nice though.
>
>I don't think this is doable without a huge amount of work.  The storage
>manager doesn't know anything about what is in a page, let alone a
>tuple.  And it shouldn't, IMHO.  Upper levels don't know how pages are
>organized on disk; they don't know about .1 segments and so on, and they
>shouldn't.
>
>I think this kind of partition doesn't buy too much.  I would really
>like to have some kind of auto-clustering, but it should be implemented
>in some upper level; e.g., by leaving some empty space in pages for
>future tuples, and arranging the whole heap again when it runs out of
>free space somewhere.  Note that this is very far from the storage
>manager.
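
For anyone unfamiliar with the .N files mentioned above, the mapping is
pure arithmetic on the block number, which is why that layer has no idea
what is inside a page.  A rough sketch in Python follows; the constants
are the build-time defaults as I understand them (8 kB blocks, 1 GB
segments), and segment_for_block() and the relation file name "16384" are
made up for illustration, not taken from md.c.

```python
BLCKSZ = 8192                        # assumed default block size in bytes
RELSEG_SIZE = 0x40000000 // BLCKSZ   # assumed blocks per segment file (1 GB)


def segment_for_block(base_path, blkno):
    """Return (file path, byte offset in that file) for a block number."""
    segno, blk_in_seg = divmod(blkno, RELSEG_SIZE)
    # The first segment has no suffix; later ones are .1, .2, and so on.
    path = base_path if segno == 0 else f"{base_path}.{segno}"
    return path, blk_in_seg * BLCKSZ


# The first block past the 1 GB boundary lands at the start of the .1 file.
location = segment_for_block("16384", RELSEG_SIZE)
```

Nothing in that calculation looks at keys or tuples, so partitioning by
content would have to be decided somewhere above it.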

-- 


Charles H. Woloszynski

ClearMetrix, Inc.
115 Research Drive
Bethlehem, PA 18015

tel: 610-419-2210 x400
fax: 240-371-3256
web: www.clearmetrix.com





