From: | Thomas Lockhart <lockhart(at)alumni(dot)caltech(dot)edu> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>, Zeugswetter Andreas IZ5 <Andreas(dot)Zeugswetter(at)telecom(dot)at>, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Selectivity of "=" (Re: [HACKERS] Index not used on simple select) |
Date: | 1999-07-29 04:48:19 |
Message-ID: | 379FDD13.B0E7D22A@alumni.caltech.edu |
Lists: | pgsql-hackers |
Tom Lane wrote:
>
> Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us> writes:
> >> BTW, this argument proves rigorously that the selectivity of a search
> >> for any value other than the MFOV is not more than 0.5, so there is some
> >> basis for my intuition that eqsel should not return a value above 0.5.
> >> So, in the cases where eqsel does not know the exact value being
> >> searched for, I'd still be inclined to cap its result at 0.5.
>
> > I don't follow this. If the most frequent value occurs 95% of the time,
> > wouldn't the selectivity be 0.95?
>
> If you are searching for the most frequent value, then the selectivity
> estimate should indeed be 0.95. If you are searching for anything else,
> the selectivity estimate ought to be 0.05 or less. If you don't know
> what value you will be searching for, which number should you use?
>
> The unsupported assumption here is that if the table contains 95%
> occurrence of a particular value, then the odds are also 95% (or at
> least high) that that's the value you are searching for in any given
> query that has an "= something" WHERE qual.
>
> That assumption is pretty reasonable in some cases (such as your
> example earlier of "WHERE state = 'PA'" in a Pennsylvania-local
> database), but it falls down badly in others, such as where the
> most common value is NULL or an empty string or some other indication
> that there's no useful data. In that sort of situation it's actually
> pretty unlikely that the user will be searching for field =
> most-common-value ... but the system probably has no way to know that.
This is exactly what a partial index is supposed to do. And then the
system knows it...
- Thomas
--
Thomas Lockhart lockhart(at)alumni(dot)caltech(dot)edu
South Pasadena, California
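
The arithmetic in the quoted exchange can be sketched as follows. This is only an illustration of the bound being argued, not the actual eqsel code; the function names `eq_selectivity_bounds` and `capped_estimate` are hypothetical, and the input is assumed to be the fraction of rows holding the most frequent value (MFOV).

```python
def eq_selectivity_bounds(mfov_fraction):
    """Return (selectivity when searching for the MFOV,
    upper bound on selectivity when searching for any other value)."""
    if not 0.0 <= mfov_fraction <= 1.0:
        raise ValueError("mfov_fraction must lie in [0, 1]")
    # Any other value can fill at most the rows the MFOV leaves free,
    # and by definition cannot be more frequent than the MFOV itself.
    other_upper = min(mfov_fraction, 1.0 - mfov_fraction)
    return mfov_fraction, other_upper


def capped_estimate(mfov_fraction, cap=0.5):
    """Estimate for "= something" when the searched value is unknown,
    using the 0.5 cap suggested in the quoted text (an assumption here)."""
    return min(mfov_fraction, cap)


if __name__ == "__main__":
    mfov_sel, other_sel = eq_selectivity_bounds(0.95)
    print(mfov_sel, other_sel, capped_estimate(0.95))
```

With a 95% MFOV the two candidate estimates are 0.95 and at most about 0.05, which is the gap the thread is debating. The bound for non-MFOV values never exceeds 0.5, since it is the smaller of the MFOV fraction and its complement.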