Re: I: About "Our CLUSTER implementation is pessimal" patch

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: Leonardo Francalanci <m_lists(at)yahoo(dot)it>, Josh Kupershmidt <schmiddy(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: I: About "Our CLUSTER implementation is pessimal" patch
Date: 2010-10-08 00:10:52
Message-ID: 8746.1286496652@sss.pgh.pa.us
Lists: pgsql-hackers

Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com> writes:
> I re-ordered some description in the doc. Does it look better?
> Comments and suggestions welcome.

Applied with some significant editorialization. The biggest problem I
found was that the code for expression indexes didn't really work, and
would leak memory like there's no tomorrow even when it did work.
I fixed that, but I think the performance is still going to be pretty
undesirable. We have to re-evaluate the index expressions for each
tuple each time we do a comparison, which means it's going to be really,
really slow unless the index expressions are *very* cheap. But perhaps
the use-case for clustering on expression indexes is small enough that
this isn't worth worrying about.

I considered computing the index expressions just once as the data is
being fed in, and including their values in the tuples-to-be-sorted;
that would cut the number of times the values have to be computed by
a factor of about log N. But it'd also bloat the on-disk sort data,
which could possibly cost more in I/O than we save. So it's not real
clear what to do anyway.
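
To put rough numbers on that trade-off, here is a small Python sketch. It is
not the tuplesort code, just a toy model: the "index expression" is a
hypothetical lower(name), and every function and field name below is made up
for illustration. It counts how often the expression is evaluated when it is
recomputed inside each comparison versus computed once per row and carried
along with the row.

```python
import random
from functools import cmp_to_key


def make_rows(n, seed=0):
    """Build n toy rows; the 'name' field is what the expression reads."""
    rng = random.Random(seed)
    return [{"id": i, "name": f"Name{rng.randrange(10**6):06d}"} for i in range(n)]


def make_counting_expr():
    """Return a stand-in index expression and a counter of its evaluations."""
    calls = {"n": 0}

    def expr(row):
        calls["n"] += 1
        return row["name"].lower()  # hypothetical expression, like lower(name)

    return expr, calls


def sort_recomputing(rows, expr):
    """Evaluate the expression on both rows inside every comparison."""
    def cmp(a, b):
        ka, kb = expr(a), expr(b)
        return (ka > kb) - (ka < kb)

    return sorted(rows, key=cmp_to_key(cmp))


def sort_precomputed(rows, expr):
    """Evaluate the expression once per row and sort on the stored value."""
    # Each sort entry now carries the computed key as well as the row,
    # which is the extra sort-data volume mentioned above.
    decorated = [(expr(row), i, row) for i, row in enumerate(rows)]
    decorated.sort(key=lambda t: (t[0], t[1]))  # index keeps ties stable
    return [row for _, _, row in decorated]


if __name__ == "__main__":
    rows = make_rows(1000)
    expr_a, calls_a = make_counting_expr()
    expr_b, calls_b = make_counting_expr()
    sort_recomputing(rows, expr_a)
    sort_precomputed(rows, expr_b)
    print("evaluations when recomputing per comparison:", calls_a["n"])
    print("evaluations when precomputed once per row: ", calls_b["n"])
```

The precomputed variant evaluates the expression exactly once per row, while
the recomputing variant evaluates it twice per comparison. What the toy model
does not capture is the I/O cost of the wider sort entries once the sort
spills to disk, which is the part that makes the choice unclear.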

regards, tom lane
