Re: huge disparities in =/IN/BETWEEN performance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: George Pavlov <gpavlov(at)mynewplace(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: huge disparities in =/IN/BETWEEN performance
Date: 2007-02-09 03:50:05
Message-ID: 14273.1170993005@sss.pgh.pa.us
Lists: pgsql-sql

Alvaro Herrera <alvherre(at)commandprompt(dot)com> writes:
> I think the principle here is that the system is not gonna waste cycles
> on dumb queries. Supposedly, morphing "foo BETWEEN 10 and 10" into
> "foo=10" is not a trivial transformation, and it'd impose a planning
> cost on all non-dumb BETWEEN queries.

There's a datatype abstraction issue involved: what does it take to
prove that "x >= 10 AND x <= 10" is equivalent to "x = 10"? This
requires a nontrivial amount of knowledge about the operators involved.
We could probably do it for operators appearing in a btree operator
class, but as Alvaro says, it'd be cycles wasted for non-dumb queries.
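
To make the abstraction issue concrete, here is a minimal Python sketch (an illustration only, not PostgreSQL code). The class CaseInsensitiveKey and the helper between() are hypothetical names made up for this example. The type's ordering operators ignore case while its equality operator does not, so "x >= k AND x <= k" can be true for a value where "x = k" is false. Rewriting BETWEEN into equality is only safe when the three operators are known to belong to one consistent ordering, which is roughly what membership in a btree operator class would tell the planner.

```python
class CaseInsensitiveKey:
    """Hypothetical type: ordering ignores case, equality is case-sensitive."""

    def __init__(self, text):
        self.text = text

    def __le__(self, other):
        return self.text.lower() <= other.text.lower()

    def __ge__(self, other):
        return self.text.lower() >= other.text.lower()

    def __eq__(self, other):
        return self.text == other.text

    def __hash__(self):
        return hash(self.text)


def between(x, lo, hi):
    """Mimics 'x BETWEEN lo AND hi', i.e. 'x >= lo AND x <= hi'."""
    return x >= lo and x <= hi


x = CaseInsensitiveKey("ABC")
k = CaseInsensitiveKey("abc")

# The range test passes because the ordering operators ignore case ...
between_result = between(x, k, k)
# ... but the equality operator disagrees, so the rewrite would change results.
equals_result = (x == k)

# For plain integers the operators are mutually consistent and both forms agree.
int_agrees = all(between(v, 10, 10) == (v == 10) for v in range(0, 21))
```

With operators like these, the optimizer cannot assume the two forms return the same rows without checking how the operators relate to each other.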

As for the IN case, I think we do simplify "x IN (one-expression)" to
"x = one-expression", but "x IN (sub-select)" is a whole 'nother matter,
especially when you're comparing it to a case where one-expression is
a constant and so the planner can get good statistics about how many
rows are likely to match.
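
As a rough picture of the IN simplification described above, here is a toy Python rewriter. It is a sketch under stated assumptions, not the planner's actual code: the node classes Eq, InList, and InSubselect and the function simplify() are all hypothetical. It collapses a one-element IN list into an equality test and leaves multi-element lists and sub-selects untouched.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Eq:
    """Hypothetical node for 'column = value'."""
    column: str
    value: object


@dataclass(frozen=True)
class InList:
    """Hypothetical node for 'column IN (v1, v2, ...)'."""
    column: str
    values: Tuple[object, ...]


@dataclass(frozen=True)
class InSubselect:
    """Hypothetical node for 'column IN (sub-select)'."""
    column: str
    query: str


def simplify(expr):
    """Rewrite a one-element IN list as equality; return anything else unchanged."""
    if isinstance(expr, InList) and len(expr.values) == 1:
        return Eq(expr.column, expr.values[0])
    return expr


single = simplify(InList("x", (10,)))
multi = simplify(InList("x", (10, 20)))
sub = simplify(InSubselect("x", "SELECT id FROM t"))
```

Only the first case ends up as a comparison against a known constant, which is the situation where column statistics can be consulted directly for a row estimate.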

regards, tom lane
