Re: I: About "Our CLUSTER implementation is pessimal" patch

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
Cc: Leonardo Francalanci <m_lists(at)yahoo(dot)it>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: I: About "Our CLUSTER implementation is pessimal" patch
Date: 2010-10-07 23:20:46
Message-ID: 8092.1286493646@sss.pgh.pa.us
Lists: pgsql-hackers

Josh Kupershmidt <schmiddy(at)gmail(dot)com> writes:
> So I think there are definitely cases where this patch helps, but it
> looks like a seq. scan is being chosen in some cases where it doesn't
> help.

I've been poking through this patch, and have found two different ways
in which it underestimates the cost of the seqscan case:

* it's not setting rel->width, resulting in an underestimate of the
amount of disk space needed for a sort; this would get worse for wider
tables.

* it's not allowing for the cost of recomputing index expression values
during comparisons. That doesn't matter of course if you're not testing
the index-expression case (which other infelicities suggest hasn't
exactly been stressed yet).

I suspect the first of these might have something to do with your
observation. AFAIR the width value isn't used in estimating indexscan
cost, so this omission would bias the comparison in favor of the
seqscan-plus-sort path as soon as the data volume exceeded
maintenance_work_mem.
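
To illustrate both points, here is a rough sketch of how a sort cost
estimate reacts to the width value and to extra per-comparison work. It
is only loosely modeled on the general shape of the planner's sort
costing and is not the patch's code: the function name sort_cost, the
per_compare_extra parameter, the fixed merge_order of 6, and the choice
to ignore per-tuple header overhead are all assumptions made for
illustration. The default cost constants match the stock values of
cpu_operator_cost, seq_page_cost and random_page_cost.

```python
import math

BLCKSZ = 8192  # PostgreSQL's default block size, in bytes


def sort_cost(tuples, width, work_mem_bytes,
              cpu_operator_cost=0.0025, seq_page_cost=1.0,
              random_page_cost=4.0, merge_order=6,
              per_compare_extra=0.0):
    """Simplified, hypothetical sort cost estimate.

    width is the estimated tuple width in bytes (the role rel->width
    plays); per_compare_extra stands in for the cost of recomputing an
    index expression on every comparison.
    """
    tuples = max(tuples, 2)  # avoid log2(1) == 0

    # CPU term: roughly N * log2(N) comparisons.
    compare_cost = 2.0 * cpu_operator_cost + per_compare_extra
    cost = compare_cost * tuples * math.log2(tuples)

    # Disk term: only charged when the data does not fit in memory.
    input_bytes = tuples * width
    if input_bytes > work_mem_bytes:
        npages = math.ceil(input_bytes / BLCKSZ)
        nruns = input_bytes / work_mem_bytes
        merge_passes = max(1, math.ceil(math.log(nruns) / math.log(merge_order)))
        # Each merge pass writes and reads every page once; I/O is
        # assumed to be mostly sequential.
        page_cost = 0.75 * seq_page_cost + 0.25 * random_page_cost
        cost += 2.0 * npages * merge_passes * page_cost

    return cost


if __name__ == "__main__":
    mem = 16 * 1024 * 1024  # stand-in for maintenance_work_mem
    print("width unset:", sort_cost(1_000_000, 0, mem))
    print("width 200:  ", sort_cost(1_000_000, 200, mem))
    print("width 200 + expression recomputation:",
          sort_cost(1_000_000, 200, mem, per_compare_extra=0.01))
```

With the width left at zero, the sort never appears to spill to disk, so
the disk term vanishes no matter how large the table is. That is the
direction of the bias described above. For inputs that fit in memory the
width makes no difference in this sketch.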

regards, tom lane
