Selectivity estimation for equality and range queries

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Selectivity estimation for equality and range queries
Date: 2007-12-28 10:55:12
Message-ID: 200712281155.12674.peter_e@gmx.net
Lists: pgsql-hackers

I have been observing a case where the row count estimate for LIKE 'foo' is
(much) higher than for LIKE 'foo%', the rest of the query being the same.
This is a special case of the equality estimate being higher than the estimate
for a range query that includes the value used in the equality, even though
the range necessarily matches at least as many rows.

I haven't been able to get a copy of the data from the client yet, but
considering the nature of the data and the description of the selectivity
estimation algorithms
(http://www.postgresql.org/docs/8.3/static/row-estimation-examples.html),
this behavior appears to be mathematically plausible. I have been wondering
whether eqsel should, in general, compare its result with the estimate for
(x >= 'foo' AND x <= 'foo') and use that as a ceiling.
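
To illustrate how the two estimates can diverge, here is a minimal Python
sketch. It is not the PostgreSQL code: it assumes the simplified formulas from
the row-estimation examples page (non-MCV equality spreads the remaining
fraction evenly over the remaining distinct values; a range is the
interpolated fraction of an equi-depth histogram), and it uses numbers instead
of strings. The function names, the statistics, and the narrow range
[500, 501] standing in for the 'foo%' prefix range are all made up for the
example.

```python
from bisect import bisect_right


def eq_selectivity(mcv_freqs, n_distinct, null_frac=0.0):
    """Equality estimate for a value that is not in the MCV list.

    The non-MCV, non-null fraction of the table is divided evenly among
    the distinct values that are not in the MCV list.
    """
    remaining_distinct = n_distinct - len(mcv_freqs)
    return (1.0 - sum(mcv_freqs) - null_frac) / remaining_distinct


def range_selectivity(bounds, lo, hi):
    """Estimated fraction of rows with lo <= x <= hi.

    `bounds` are the boundaries of an equi-depth histogram and must be
    strictly increasing; positions inside a bucket are interpolated linearly.
    """
    n_buckets = len(bounds) - 1

    def cdf(v):
        if v <= bounds[0]:
            return 0.0
        if v >= bounds[-1]:
            return 1.0
        i = bisect_right(bounds, v) - 1  # bucket that contains v
        frac = (v - bounds[i]) / (bounds[i + 1] - bounds[i])
        return (i + frac) / n_buckets

    return max(cdf(hi) - cdf(lo), 0.0)


# Hypothetical statistics: no MCVs, 50 estimated distinct values,
# and a 10-bucket histogram over 0..1000.
bounds = list(range(0, 1001, 100))
eq = eq_selectivity(mcv_freqs=[], n_distinct=50)
rng = range_selectivity(bounds, 500, 501)

# The idea from this message: use the range estimate as a ceiling.
capped = min(eq, rng)

print(f"equality: {eq:.4f}  range: {rng:.4f}  capped: {capped:.4f}")
```

With these made-up statistics the equality estimate is twenty times the
estimate for a range that contains the value, because the two numbers come
from unrelated statistics (n_distinct versus the histogram). Note that a fully
degenerate range such as [500, 500] interpolates to zero in this sketch, so a
real ceiling would need some care about how the bounding range is chosen.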

Has anyone else observed something similar?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
