
I: About "Our CLUSTER implementation is pessimal" patch

From:	Leonardo F <m_lists(at)yahoo(dot)it>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	I: About "Our CLUSTER implementation is pessimal" patch
Date:	2010-02-09 10:49:23
Message-ID:	82098.61775.qm@web29008.mail.ird.yahoo.com
Lists:	pgsql-hackers

Not even a comment? As I said, performance results on my system
were very good...

> I know you're all very busy getting 9.0 out, but I think the results of
> heap scanning + sort instead of index scanning for CLUSTER are
> very good... I would like to know whether I did something wrong or
> should improve something in the patch... I haven't tested it with index
> expressions yet (but the tests in "make check" work).
>
> Thanks
>
> Leonardo
>
>
> > Hi all,
> >
> > Attached is a patch to do a seq scan + sort instead of an index scan
> > on CLUSTER (when that's expected to be faster).
> >
> > As I've already said, the patch is based on:
> > http://archives.postgresql.org/pgsql-hackers/2008-08/msg01371.php
> >
> > Of course, the code isn't supposed to be ready to be merged: I
> > would like to write more comments and add some test cases to
> > cluster.sql (plus change all the things you are going to tell me I
> > have to change...)
> >
> > I would like your opinions on code correctness and the decisions
> > I took, especially:
> >
> > 1) function names (I guess "cost_index_scan_vs_seqscansort"
> > is awful...)
> >
> > 2) the fact that I put an EState variable in Tuplesortstate, so that
> > MakeSingleTupleTableSlot wouldn't have to be called for every
> > row in the expression index case
> >
> > 3) the expression index case is not "optimized": I preferred to
> > call FormIndexDatum once for the first key value in
> > copytup_rawheap and another time to get all the remaining values
> > in comparetup_rawheap. I liked the idea of re-using
> > FormIndexDatum in that case, instead of copying and pasting only
> > the relevant code: but FormIndexDatum returns all the values,
> > even when I might need only the first one
> >
> > 4) I refactored the code that deforms and rewrites a tuple into the
> > function "deform_and_rewrite_tuple", because now that part can be
> > called by the regular index scan or by the new seq-scan + sort (but I
> > could copy and paste those lines instead of refactoring them into a
> > new function)
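
Point 3 describes computing the first sort key when a tuple is copied in and the remaining keys only when a comparison needs them. A minimal, self-contained C sketch of that general pattern follows. It does not use any PostgreSQL API: `Row`, `SortTuple`, `form_keys`, `make_sort_tuple`, `compare_sort_tuples` and `sort_rows` are hypothetical names, and `form_keys` only stands in for a routine that, like FormIndexDatum as described above, always returns every key value.

```c
#include <assert.h>
#include <stdlib.h>

#define NKEYS 2

/* Hypothetical stand-in for a heap tuple. */
typedef struct { int a; int b; } Row;

/* Sort entry: pointer to the row plus the cached first key only. */
typedef struct { const Row *row; int key1; } SortTuple;

/* Counts how often a comparison had to evaluate the full key set. */
static long full_key_evaluations = 0;

/* Hypothetical "evaluate all index expressions" routine: always fills
 * every key, even when the caller needs only the first one. */
static void form_keys(const Row *row, int keys[NKEYS])
{
    keys[0] = row->a % 10;  /* illustrative expression 1 */
    keys[1] = -row->b;      /* illustrative expression 2 */
}

/* Copy step: evaluate the keys once and keep only the leading one. */
static SortTuple make_sort_tuple(const Row *row)
{
    int keys[NKEYS];
    SortTuple st;

    form_keys(row, keys);
    st.row = row;
    st.key1 = keys[0];
    return st;
}

/* Compare step: use the cached first key; recompute the remaining
 * keys only when the first keys tie. */
static int compare_sort_tuples(const void *pa, const void *pb)
{
    const SortTuple *a = pa;
    const SortTuple *b = pb;
    int ka[NKEYS], kb[NKEYS];
    int i;

    if (a->key1 != b->key1)
        return (a->key1 > b->key1) - (a->key1 < b->key1);

    full_key_evaluations++;
    form_keys(a->row, ka);
    form_keys(b->row, kb);
    for (i = 1; i < NKEYS; i++)
        if (ka[i] != kb[i])
            return (ka[i] > kb[i]) - (ka[i] < kb[i]);
    return 0;
}

/* Build the sort entries for n rows and sort them by (key1, key2). */
static void sort_rows(const Row *rows, SortTuple *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        out[i] = make_sort_tuple(&rows[i]);
    qsort(out, n, sizeof(SortTuple), compare_sort_tuples);
}
```

The trade-off is the one raised in point 3: reusing a single "all keys" routine keeps the code small, at the price of evaluating expressions whose results are thrown away in the copy step and re-evaluated on every tie.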
> >
> > Suggestions and comments are not just welcome, but needed!
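
The choice between an index scan and a seq scan + sort "when that's supposed to be faster" comes down to comparing two cost estimates. The following C sketch illustrates such a comparison with a deliberately toy model. It is not the cost model used by the patch or by the PostgreSQL planner: the constants, the formulas, and the names `index_scan_cost`, `seqscan_sort_cost` and `prefer_seqscan_sort` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cost constants, chosen for illustration only. */
static const double SEQ_PAGE_COST = 1.0;
static const double RANDOM_PAGE_COST = 4.0;
static const double CPU_OPERATOR_COST = 0.0025;

/* ceil(log2(n)) for n >= 1, computed without libm. */
static int ceil_log2(unsigned long n)
{
    int bits = 0;

    assert(n >= 1);
    n--;
    while (n > 0)
    {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* Toy index-scan cost: interpolate between "one random page fetch per
 * tuple" (no correlation) and "one sequential page fetch per page"
 * (perfect correlation between index order and heap order). */
static double index_scan_cost(unsigned long pages, unsigned long tuples,
                              double correlation)
{
    double max_io = (double) tuples * RANDOM_PAGE_COST;
    double min_io = (double) pages * SEQ_PAGE_COST;
    double csquared = correlation * correlation;

    assert(correlation >= -1.0 && correlation <= 1.0);
    return max_io + csquared * (min_io - max_io);
}

/* Toy seq-scan + sort cost: read every page once, then pay roughly
 * tuples * log2(tuples) comparisons. */
static double seqscan_sort_cost(unsigned long pages, unsigned long tuples)
{
    double io = (double) pages * SEQ_PAGE_COST;
    double cmp = 2.0 * CPU_OPERATOR_COST * (double) tuples * ceil_log2(tuples);

    return io + cmp;
}

/* True when the toy model estimates seq scan + sort as cheaper. */
static bool prefer_seqscan_sort(unsigned long pages, unsigned long tuples,
                                double correlation)
{
    return seqscan_sort_cost(pages, tuples)
           < index_scan_cost(pages, tuples, correlation);
}
```

Under these assumed constants, a 1000-page table with 100000 tuples and no correlation favors the seq scan + sort, while the same table with perfect correlation favors the index scan. A real implementation would also have to account for sorts that spill to disk, which this sketch ignores.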

Attachment	Content-Type	Size
sorted_cluster.patch	application/octet-stream	26.2 KB

In response to

Re: About "Our CLUSTER implementation is pessimal" patch at 2010-01-28 11:54:21 from Leonardo F

Responses

Re: I: About "Our CLUSTER implementation is pessimal" patch at 2010-02-09 16:16:17 from Josh Kupershmidt
