Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: "Sven R(dot) Kunze" <srkunze(at)mail(dot)de>
Cc: lfischer <l(dot)fischer(at)ed(dot)ac(dot)uk>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Igor Neyman <ineyman(at)perceptron(dot)com>, Madu Sudanan <b(dot)n(dot)madusudanan(at)gmail(dot)com>
Subject: Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause
Date: 2016-09-29 18:03:16
Message-ID: CAMkU=1xkOSS6V=K4w=pcuUmV5Yz81=va6BpZeMyknGS4Sx9woA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Sep 22, 2016 at 11:35 PM, Sven R. Kunze <srkunze(at)mail(dot)de> wrote:

> Thanks a lot Madusudanan, Igor, Lutz and Jeff for your suggestions.
>
> What I can confirm is that the UNION ideas runs extremely fast (don't have
> access to the db right now to test the subquery idea, but will check next
> week as I travel right now). Thanks again! :)
>
>
> I was wondering: would it be possible for PostgreSQL to rewrite the query
> to generate the UNION (or subquery plan if it's also fast) on it's own?
>

I don't know what the subquery plan is, I don't see references to that in
the email chain.

I don't believe that current versions of PostgreSQL are capable of
rewriting the plan in the style of a union. It is not just a matter of
tweaking the cost estimates, it simply never considers such a plan in the
first place given the text of your query.

Perhaps some future version of PostgreSQL could do so, but my gut feeling
is that that is not very likely. It would take a lot of work, would risk
breaking or slowing down other things, and is probably too much of a niche
issue to attract a lot of interest.

Why not just use the union? Are you using a framework which generates the
query automatically and you have no control over it? Or do you just think
it is ugly or fragile for some other reason?

Perhaps moving the union from the outside to the inside would be more
suitable? That way teh select list is only specified once, and if you AND
more clauses into the WHERE condition they also only need to be specified
once.

SELECT * FROM big_table
WHERE
id in (SELECT big_table_id FROM table_a WHERE "table_a"."item_id" IN
(<handful of items>) union
SELECT big_table_id FROM table_a WHERE "table_b"."item_id" IN
(<handful of items>)
);

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Pavel Stehule 2016-09-29 18:12:58 Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause
Previous Message Tom Lane 2016-09-29 13:09:16 Re: Failing Multi-Job Restores, Missing Indexes on Restore