Thinking about IN/EXISTS optimization

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Thinking about IN/EXISTS optimization
Date: 2002-10-22 23:18:11
Message-ID: 26280.1035328691@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've been thinking about how to convert "x IN (subselect)" and EXISTS
constructs into join-like processing, and I've run into a small problem
in getting the planner to do it nicely. The issue is that I need to
take the subselect and push it into the jointree --- essentially, make
it look like a subselect-in-FROM --- so that the join planner can deal
with it. Basically, I need to rearrange

SELECT ... FROM ... WHERE ... AND x IN (SELECT y FROM ...)

into

SELECT ... FROM ..., (SELECT y FROM ...) ss
WHERE ... AND x =* ss.y

where =* represents some specially-marked RestrictInfo node. (NOT IN is the
same except that the RestrictInfo node will be marked differently.)

The difficulty is that there's no good place to do this in
subquery_planner(). We should push the subselect into FROM before we
run the pull_up_subqueries() and preprocess_jointree() operations;
if we don't pull up the subselect into the main query then we won't have
accomplished very much. But the WHERE clause isn't simplified into a
form that makes it easy to spot top-level IN() expressions until after
that. We can't simply switch the order of the subselect and
WHERE-clause processing, because pulling up subqueries typically adds
conditions to the WHERE clause.

I haven't been able to think of a solution to this that doesn't involve
wasting a lot of cycles by repeating some of these processing steps,
or missing some optimization possibilities. (For example, if we pull up
a subquery that came from a view, it might contain an IN where-clause,
which ideally we'd want to be able to optimize. It almost seems like
we need to be able to loop around the whole operation; but most of the
time this will just waste cycles.)

Anyone see a nice way to do this?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Neil Conway 2002-10-22 23:27:20 Re: Memory leaks
Previous Message Peter Eisentraut 2002-10-22 22:20:53 Re: pg_dump and large files - is this a problem?