Re: IN vs EXISTS equivalence

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: IN vs EXISTS equivalence
Date: 2008-08-11 19:15:08
Message-ID: 48A0496C.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Our Internet connectivity failed as this was being sent. It looks
like at least the list didn't get it, so here goes another try.
Apologies for any duplication.

-Kevin


>>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I chewed on that for awhile. We can certainly optimize EXISTS
that's
> appearing in the ON-condition of a regular JOIN, because that's not
> really semantically different from a WHERE-condition.

Good to hear. I thought that might be doable. :-)

> But I don't think
> it's going to be reasonable to improve EXISTS in outer-JOIN ON
> conditions. There are a couple of problems. Consider

The discussion did make the difficulties clear.

> So this is not something I'm going to tackle; at least not this
> devel cycle.

Fair enough.

> One small step we can take in this direction, though, is to improve
the
> planner's internal handling of the qual conditions for IN and
EXISTS.
> Right now the process is just to throw the sub-select into the main
> range table and put the IN join conditions into the same place in
WHERE
> that the IN-clause was to start with. The trouble with this is that
the
> distribute_quals_to_rels processing has no idea that there's
anything
> special about the IN join conditions. We got away with that for the
> limited case of IN clauses at the top level of WHERE, but it's
become
> clear to me over the weekend that this has no hope of working for
NOT
> EXISTS --- since that's effectively an outer join, it absolutely has
to
> have the same kinds of qual-scheduling constraints as ordinary outer
> joins do. So we need a data structure that distribute_quals_to_rels
can
> work with. What I think needs to happen is that the initial pass
that
> pulls up optimizable IN/EXISTS sub-selects should not merge the
> SubLink's replacement qual clauses seamlessly, but put them
underneath a
> new node type, say "FlattenedSubLink", that retains knowledge of the
> join it's representing. The FlattenedSubLink would survive only as
far
> as distribute_quals_to_rels, which would distribute out the
contained
> qual conditions instead of the FlattenedSubLink itself --- but only
> after marking them properly for the outer-join restrictions. This
> representation would make it feasible to work with IN/EXISTS that
are
> inside JOIN ON conditions, which the present representation using a
> single in_info_list really can't do. The semantic issues are still
> there but at least the representation isn't getting in the way ...

Just curious, is that something for this cycle, or a TODO item?

Thanks for looking at this. The one part I'm not sure about is where
the CASE/EXISTS in the SELECT value list fits into this discussion.
It seems conceptually similar to the OUTER JOIN, but sort of a special
case, so I'm not sure what you had in mind there.

-Kevin

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jaime Casanova 2008-08-11 19:54:29 Re: Column level privileges was:(Re: Extending grant insert on tables to sequences)
Previous Message Peter Eisentraut 2008-08-11 18:40:13 Re: proposal: UTF8 to_ascii function