Skip site navigation (1) Skip section navigation (2)

Re: performance of IN (subquery)

From: Jon Lapham <lapham(at)jandr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Paul Tillotson <pntil(at)shentel(dot)net>,Kevin Murphy <murphy(at)genome(dot)chop(dot)edu>, pgsql-general(at)postgresql(dot)org
Subject: Re: performance of IN (subquery)
Date: 2004-08-27 12:28:36
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-general
Tom Lane wrote:
> I've thought about this before.  One simple trick would be to get rid
> of the current pg_class reltuples/relpages fields in favor of a
> tuples-per-page estimate, which could be multiplied by
> RelationGetNumberOfBlocks() during planning.  In the absence of any
> ANALYZE data the tuples-per-page estimate might be pretty bogus, but
> it couldn't be off by more than an order of magnitude or so either way.
> And in any case we'd have a guaranteed up-to-date number of blocks.
> The objections that could be raised to this are (AFAICS) two:
> [snip]
> 2. Instability of plans.  Right now, the planner will not change plans
>    underneath you --- you have to issue an explicit VACUUM or ANALYZE
>    to change the terms of discussion.  That would stop being true if
>    physical file size were always taken into account.  Maybe this is a
>    problem, or maybe it isn't ... as someone who likes to be able to
>    debug planner behavior without actually creating umpteen-gig test
>    tables, my world view may be a bit skewed ...
> It's certainly doable if we decide the pluses outweigh the minuses.
> Thoughts?

My first reaction is to wonder if this would give performance exactly 
equal to running a true ANALYZE in every situation?  If not, then you 
would end up with an automated pseudo-ANALYZE (performance-wise).

In my opinion, it is almost a feature that non-ANALYZE-d tables give 
such horrendous performance, it kicks you in the butt to do some 
thinking about when to correctly deal with ANALYZEing.

So, in short, I think it is a huge win if we could have automatic 
ANALYZE with true ANALYZE performance, but a huge loss if the automatic 
ANALYZE performance is not exactly as good as a true ANALYZE.

  Jon Lapham  <lapham(at)jandr(dot)org>                Rio de Janeiro, Brasil

In response to


pgsql-general by date

Next:From: Kevin MurphyDate: 2004-08-27 12:40:42
Subject: Re: performance of IN (subquery)
Previous:From: JoelDate: 2004-08-27 12:27:27
Subject: Re: UTF-8 and LIKE vs =

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group