Re: Transforming IN (...) to ORs, volatility

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Transforming IN (...) to ORs, volatility
Date: 2011-04-05 08:24:23
Message-ID: 4D9AD1B7.40700@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 02.04.2011 20:48, Tom Lane wrote:
> Heikki Linnakangas<heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> We sometimes transform IN-clauses to a list of ORs:
>> postgres=# explain SELECT * FROM foo WHERE a IN (b, c);
>> QUERY PLAN
>> ------------------------------------------------------
>> Seq Scan on foo (cost=0.00..39.10 rows=19 width=12)
>> Filter: ((a = b) OR (a = c))
>> (2 rows)
>
>> But what if you replace "a" with a volatile function? It doesn't seem
>> legal to do that transformation in that case, but we do it:
>
> This is the fault of transformAExprIn(). But please let's *not* fix
> this by adding volatility to the set of heuristics used there. Looking
> at this again, it seems to me that most of the problem with this code
> is that we're trying to make optimization decisions in the parser.

Agreed. The history of this is that before 8.2 all IN clauses were
transformed to OR clauses straight in the grammar. 8.2 added the code to
represent IN clause as a ScalarArrayOpExpr, but it was changed in 8.2.10
to use the OR-form again for Vars
(http://archives.postgresql.org/pgsql-hackers/2008-10/msg01269.php)

> I think what we ought to do is have the parser emit a full-fledged
> InExpr node type (with semantics rather like CaseExpr) and then teach
> the planner to optimize that to something else when it seems
> safe/prudent to do so. One nontrivial advantage of that is that
> rules/views containing IN constructs would start to reverse-parse
> in the same fashion, instead of introducing weird substitute
> expressions.

Here's my first cut at that. The lefthand expression is now evaluated
only once, and stored in econtext->caseValue. Parse analysis turns the
righthand expressions into a list of comparison expressions like
"CaseTestExpr = value1". It's perhaps time that we rename CaseTestExpr
into something more generic, now that it's used not only in CASE
expressions, but also in IN and in UPDATE targets, but I didn't do that
in this patch.

eval_const_expressions checks the lefthand expression for volatile
functions. If there aren't any, it transform the InExprs to a list of ORs.

This isn't finished, because it doesn't yet do the transformation to
ScalarArrayOpExpr. The OR form is much slower to plan, which is why the
ScalarArrayOpExpr transformation was introduced in 8.2. I'll continue
hacking on that, but please let me know if you have a better idea on how
to handle that. One alternative is to teach the machinery that matches
restrictinfos to usable indexes to handle InExpr like it does
ScalarArrayOpExprs, but I don't know that code very well.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
fix-in-clause-2.patch text/x-diff 65.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dimitri Fontaine 2011-04-05 08:31:10 Re: Re: synchronous_commit and synchronous_replication Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.
Previous Message Fujii Masao 2011-04-05 08:21:54 Re: Re: synchronous_commit and synchronous_replication Re: [COMMITTERS] pgsql: Efficient transaction-controlled synchronous replication.