Re: SQL command speed

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kate Collins <klcollins(at)wsicorp(dot)com>
Cc: postgresql news group <pgsql-sql(at)postgresql(dot)org>
Subject: Re: SQL command speed
Date: 2000-05-19 21:39:01
Message-ID: 9901.958772341@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Kate Collins <klcollins(at)wsicorp(dot)com> writes:
> You are correct in your assessment of the nature of the distribution
> of the elements in the item_a column. Some values return none or a
> few rows. While other values return many rows (100 or more). It is
> not an even distribution. I had never really considered what effect
> this would have on searching before.

Apparently you have at least one value with close to 2000 rows (IIRC,
the rows estimate is basically half the MCV's row count, except in
the special case where the planner can see that you are comparing
against the MCV itself).

If that's a real data value and not a dummy, then of course you can't
replace it by NULL. In that case I'm kind of stuck for a good answer
for 7.0; we can't do much to distinguish the few-distinct-values case
from the many-distinct-values-with-some-frequency-outliers case until
we have better statistics than 7.0 keeps. What you could do as a
stopgap is to force the system to use an indexscan against its
judgment, by doing this before issuing the problem query:

SET enable_seqscan = OFF;

Naturally I recommend keeping this ON most of the time, but sometimes
you just have to get out the blunt instruments ;-)

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Peter Eisentraut 2000-05-20 13:35:58 Re: Question about databases in alternate locations...
Previous Message Tom Lane 2000-05-19 19:48:46 Re: Creating Tables in Functions