Re: maintain_cluster_order_v5.patch

From: "phb07(at)apra(dot)asso(dot)fr" <phb07(at)apra(dot)asso(dot)fr>
To: "pgsql-performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: maintain_cluster_order_v5.patch
Date: 2009-10-21 17:55:18
Message-ID: 20091021175518.58A7C4B020E@smtp2-g21.free.fr
Lists: pgsql-performance

Hi Jeff,

>> Hi all,
>>
>> The current discussion about "Indexes on low cardinality columns" led
>> me to discover this
>> "grouped index tuples" patch (http://community.enterprisedb.com/git/)
>> and its associated
>> "maintain cluster order" patch
>> (http://community.enterprisedb.com/git/maintain_cluster_order_v5.patch)
>>
>> This last patch seems to cover the TODO item named "Automatically
>> maintain clustering on a table".
>
>The TODO item isn't clear about whether the order should be strictly
>maintained, or whether it should just make an effort to keep the table
>mostly clustered. The patch mentioned above makes an effort, but does
>not guarantee cluster order.
>
You are right, there are 2 different visions: a strictly maintained order or a best-effort ("possibly maintained") order.
The latter is already a good enhancement, as it greatly increases the time interval needed between 2 CLUSTER operations, in particular if the FILLFACTOR is properly set. In terms of performance, having 99% of rows in the "right" page is not really worse than having totally optimized storage.
The only benefit of a strictly maintained order is that there is no need for CLUSTER at all, which could be very valuable for very large databases that must be accessible 24 hours a day.
For our needs, the "possibly maintained order" is enough.
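
To make the FILLFACTOR argument concrete, here is a minimal Python sketch of a toy model. It is not PostgreSQL's heap code and not the logic of the patch. It assumes pages of fixed capacity, an initial load in key order that fills each page only up to a fill fraction (as CLUSTER would with a FILLFACTOR below 100), and a best-effort insert that puts a row on its target page when there is room and counts it as misplaced otherwise. The function name, its parameters and the uniform random keys are all hypothetical.

```python
import bisect
import random


def simulate(n_initial, n_inserts, page_capacity, fillfactor, seed=0):
    """Toy model: return the fraction of rows sitting on their 'right' page.

    fillfactor is a fraction in (0, 1] here, not a percentage.
    """
    rng = random.Random(seed)
    rows_per_page = max(1, int(page_capacity * fillfactor))

    # Initial load in key order, each page filled only up to rows_per_page.
    keys = sorted(rng.random() for _ in range(n_initial))
    pages = [keys[i:i + rows_per_page]
             for i in range(0, n_initial, rows_per_page)]
    lower_bounds = [page[0] for page in pages]  # first key of each page
    used = [len(page) for page in pages]        # rows currently on each page

    misplaced = 0
    for _ in range(n_inserts):
        key = rng.random()
        # Target page: the last page whose first key is <= the new key.
        target = max(0, bisect.bisect_right(lower_bounds, key) - 1)
        if used[target] < page_capacity:
            used[target] += 1   # room left: the row stays in cluster order
        else:
            misplaced += 1      # page full: the row has to go elsewhere

    total = n_initial + n_inserts
    return (total - misplaced) / total


if __name__ == "__main__":
    for ff in (1.0, 0.9, 0.8):
        print(ff, simulate(10000, 1000, 100, ff))
```

In this model, a fill fraction of 1.0 leaves no free space, so every later insert is misplaced, while a lower fill fraction absorbs inserts until individual pages fill up. The real figures depend on the key distribution and on updates and deletes, which the sketch ignores.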

>> As this patch is not so new (2007), I would like to know why it has
>> not yet been integrated in a standard version of PG (not well
>> finalized? not totally sure? not corresponding to the way the core
>> team would like to address this item?) and if there is a good chance
>> to see it committed in the near future.
>
>Search the archives on -hackers for discussion. I don't think either of
>these features were rejected, but some of the work and benchmarking have
>not been completed.
OK, I will have a look.
>
>If you can help (either benchmark work or C coding), try reviving the
>features by testing them and merging them with the current tree.
OK, that's the rule of the game in such a community.
I am not a good C writer, but I will see what I can do.

> I recommend reading the discussion first, to see if there are any major
>problems.

>
>Personally, I'd like to see the GIT feature finished as well. When I
>have time, I was planning to take a look into it.
>
>Regards,
> Jeff Davis
