Quick Links

Re: optimizing queries using IN and EXISTS

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Nick Hofstede <Nick(dot)Hofstede(at)inventivegroup(dot)com>
Cc:	"pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: optimizing queries using IN and EXISTS
Date:	2012-07-18 22:36:14
Message-ID:	3691.1342650974@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Nick Hofstede <Nick(dot)Hofstede(at)inventivegroup(dot)com> writes:
> I'm surprised at the difference in speed/execution plan between two logically equivalent queries, one using IN, the other using EXISTS. (At least I think they are logically equivalent)

> SELECT *
> FROM foo
> WHERE 'text6' IN (SELECT value
> FROM bar
> JOIN foo AS foo2
> ON bar.foo_ref = foo2.id
> WHERE foo2.id = foo.id)

Hm. convert_ANY_sublink_to_join() rejects subqueries that contain any
Vars of the parent query level, so the reference to foo.id prevents this
from being converted to a semijoin. However, it seems like that's
overly restrictive. I'm not sure that we could remove the test
altogether, but at least outer vars used in WHERE seem safe.

In the meantime, you can recast like this:

SELECT *
FROM foo
WHERE ('text6', id) IN (SELECT value, foo2.id
FROM bar
JOIN foo AS foo2
ON bar.foo_ref = foo2.id)

and still get a semijoin plan from an IN-style query.

regards, tom lane

In response to

optimizing queries using IN and EXISTS at 2012-07-18 16:10:30 from Nick Hofstede

Responses

Re: optimizing queries using IN and EXISTS at 2012-07-19 11:53:43 from Nick Hofstede

Browse pgsql-performance by date

	From	Date	Subject
Next Message	David Kerr	2012-07-19 01:43:28	Re: Process 11812 still waiting for ExclusiveLock on extension of relation
Previous Message	Simon Riggs	2012-07-18 21:45:08	Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)