Re: When does PostgreSQL collapse subqueries to join?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Sven R(dot) Kunze" <srkunze(at)tbz-pariv(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: When does PostgreSQL collapse subqueries to join?
Date: 2014-12-10 15:34:49
Message-ID: 19803.1418225689@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Sven R. Kunze" <srkunze(at)tbz-pariv(dot)de> writes:
> ################ Perfect Plan ###############
> We assume all our queries to be equivalent and therefore want PostgreSQL
> to re-plan the others to this one.

> explain analyze verbose select * from a where a.id in (select a.id from
> a inner join text_b b1 on (a.id=b1.a_id) inner join text_b b2 on
> (a.id=b2.a_id) where b1.x='x1' and b1.y='y1' and b2.x='x2' and b2.y='y2'
> order by a.date desc limit 20);

> [ ... other variant cases ... ]

> ################### Slow Subqueries ##########################
> Directly querying from the subqueries performs even worse.

> explain analyze verbose select * from a where a.id in (select
> text_b.a_id from text_b where text_b.x='x1' and text_b.y='y1') and a.id
> in (select text_b.a_id from text_b where text_b.x='x2' and
> text_b.y='y2') order by a.date desc limit 20;

> What needs to be done in order to feed PostgreSQL with the last query
> and achieve the performance of the first one?

Postgres will *never* turn the last query into the first one, because
they are not in fact equivalent. Putting the ORDER BY/LIMIT inside the
subquery has entirely different effects than putting it outside. There's
no guarantee at all that the first query returns only 20 rows, nor that
the returned rows are in any particular order.

I'm a bit suspicious of the other aspect of your manual transformation
here too: in general semijoins (IN joins) don't commute with inner joins.
It's possible that it's okay here given the specific forms of the join
clauses, but the planner won't assume that.

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2014-12-10 16:52:13 Re: intel s3500 -- hot stuff
Previous Message Sven R. Kunze 2014-12-10 11:31:14 When does PostgreSQL collapse subqueries to join?