Re: Patch to support SEMI and ANTI join removal

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch to support SEMI and ANTI join removal
Date: 2014-08-26 13:40:57
Message-ID: 53FC8E69.9060301@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 08/26/2014 03:28 PM, David Rowley wrote:
> Any ideas or feedback on this would be welcome

Before someone spends time reviewing this patch, are you sure this is
worth the effort? It seems like very narrow use case to me. I understand
removing LEFT and INNER joins, but the case for SEMI and ANTI joins
seems a lot thinner. Unnecessary LEFT and INNER joins can easily creep
into a query when views are used, for example, but I can't imagine that
happening for a SEMI or ANTI join. Maybe I'm lacking imagination. If
someone has run into a query in the wild that would benefit from this,
please raise your hand.

If I understood correctly, you're planning to work on INNER join removal
too. How much of the code in this patch is also required for INNER join
removal, and how much is specific to SEMI and ANTI joins?

Just so everyone is on the same page on what kind of queries this helps
with, here are some examples from the added regression tests:

> -- Test join removals for semi and anti joins
> CREATE TEMP TABLE b (id INT NOT NULL PRIMARY KEY, val INT);
> CREATE TEMP TABLE a (id INT NOT NULL PRIMARY KEY, b_id INT REFERENCES b(id));
> -- should remove semi join to b
> EXPLAIN (COSTS OFF)
> SELECT id FROM a WHERE b_id IN(SELECT id FROM b);
> QUERY PLAN
> ------------------------------
> Seq Scan on a
> Filter: (b_id IS NOT NULL)
> (2 rows)
>
> -- should remove semi join to b
> EXPLAIN (COSTS OFF)
> SELECT id FROM a WHERE EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
> QUERY PLAN
> ------------------------------
> Seq Scan on a
> Filter: (b_id IS NOT NULL)
> (2 rows)
>
> -- should remove anti join to b
> EXPLAIN (COSTS OFF)
> SELECT id FROM a WHERE NOT EXISTS(SELECT 1 FROM b WHERE a.b_id = id);
> QUERY PLAN
> --------------------------
> Seq Scan on a
> Filter: (b_id IS NULL)
> (2 rows)

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-08-26 13:54:59 Re: replicating DROP commands across servers
Previous Message Andres Freund 2014-08-26 13:28:19 Re: a7ae1dc has broken the windows builds