pgsql: Improve estimation of IN/NOT IN by assuming array elements are d

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-committers(at)postgresql(dot)org
Subject: pgsql: Improve estimation of IN/NOT IN by assuming array elements are d
Date: 2012-03-08 04:00:52
Message-ID: E1S5UX6-0002Uw-0I@gemulon.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

Improve estimation of IN/NOT IN by assuming array elements are distinct.

In constructs such as "x IN (1,2,3,4)" and "x <> ALL(ARRAY[1,2,3,4])",
we formerly always used a general-purpose assumption that the probability
of success is independent for each comparison of "x" to an array element.
But in real-world usage of these constructs, that's a pretty poor
assumption; it's much saner to assume that the array elements are distinct
and so the match probabilities are disjoint. Apply that assumption if the
operator appears to behave as equality (for ANY) or inequality (for ALL).
But fall back to the normal independent-probabilities calculation if this
yields an impossible result, ie probability > 1 or < 0. We could protect
ourselves against bad estimates even more by explicitly checking for equal
array elements, but that is expensive and doesn't seem worthwhile: doing
it would amount to optimizing for poorly-written queries at the expense
of well-written ones.

Daniele Varrazzo and Tom Lane, after a suggestion by Ants Aasma

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/66a7e6bae98592d1d98d9ef589753f0e953c5828

Modified Files
--------------
src/backend/utils/adt/selfuncs.c | 74 ++++++++++++++++++++++++++++++++++++--
1 files changed, 71 insertions(+), 3 deletions(-)

Browse pgsql-committers by date

  From Date Subject
Next Message Heikki Linnakangas 2012-03-08 09:14:26 pgsql: Silence warning about unused variable, when building without ass
Previous Message Tom Lane 2012-03-08 00:26:07 pgsql: Fix indentation of \d footers for non-ASCII cases.