From: Greg Stark <gsstark(at)mit(dot)edu>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: performance of IN (subquery)
Date: 2004-08-27 03:33:35
Message-ID: 87vff5ntkg.fsf@stark.xeocode.com
Lists: pgsql-general


"Arthur Ward" <award-postgresql(at)dominionsciences(dot)com> writes:

> Any hackers around who can say why this might be a bad idea, or is it one
> of those things that just needs a volunteer? (I'm not; at least not now.)

a) That would make plans change spontaneously. I hate being paged in the
middle of the night because some query has suddenly become slow when it was
performing fine before.

b) Not all sequential scans will actually complete the scan. There could be a
LIMIT imposed, or the sequential scan could be inside an EXISTS; in that case
the scan could be aborted at any point.

What I do think would be easy is to keep statistics on the expense of the
various components of the cost estimates. cpu_*_cost, random_page_cost, and
effective_cache_size ought to be values that can be solved for empirically
from the timing results.

But that still doesn't have to be done on every query. There's a trade-off
between the planning work done on every query and the benefit it buys.
Gathering and storing statistics on every sequential scan is way too much
work, slowing down every query for minimal gain.

--
greg
