Re: performance of IN (subquery)

From: Jon Lapham <lapham(at)jandr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Paul Tillotson <pntil(at)shentel(dot)net>, Kevin Murphy <murphy(at)genome(dot)chop(dot)edu>, pgsql-general(at)postgresql(dot)org
Subject: Re: performance of IN (subquery)
Date: 2004-08-27 12:28:36
Message-ID: 412F28F4.4050106@jandr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Tom Lane wrote:
> I've thought about this before. One simple trick would be to get rid
> of the current pg_class reltuples/relpages fields in favor of a
> tuples-per-page estimate, which could be multiplied by
> RelationGetNumberOfBlocks() during planning. In the absence of any
> ANALYZE data the tuples-per-page estimate might be pretty bogus, but
> it couldn't be off by more than an order of magnitude or so either way.
> And in any case we'd have a guaranteed up-to-date number of blocks.
>
> The objections that could be raised to this are (AFAICS) two:
> [snip]
> 2. Instability of plans. Right now, the planner will not change plans
> underneath you --- you have to issue an explicit VACUUM or ANALYZE
> to change the terms of discussion. That would stop being true if
> physical file size were always taken into account. Maybe this is a
> problem, or maybe it isn't ... as someone who likes to be able to
> debug planner behavior without actually creating umpteen-gig test
> tables, my world view may be a bit skewed ...
>
> It's certainly doable if we decide the pluses outweigh the minuses.
> Thoughts?

My first reaction is to wonder if this would give performance exactly
equal to running a true ANALYZE in every situation? If not, then you
would end up with an automated pseudo-ANALYZE (performance-wise).

In my opinion, it is almost a feature that non-ANALYZE-d tables give
such horrendous performance, it kicks you in the butt to do some
thinking about when to correctly deal with ANALYZEing.

So, in short, I think it is a huge win if we could have automatic
ANALYZE with true ANALYZE performance, but a huge loss if the automatic
ANALYZE performance is not exactly as good as a true ANALYZE.

--
-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
Jon Lapham <lapham(at)jandr(dot)org> Rio de Janeiro, Brasil
Personal: http://www.jandr.org/
***-*--*----*-------*------------*--------------------*---------------

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Murphy 2004-08-27 12:40:42 Re: performance of IN (subquery)
Previous Message Joel 2004-08-27 12:27:27 Re: UTF-8 and LIKE vs =