Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: "Sven R(dot) Kunze" <srkunze(at)mail(dot)de>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause
Date: 2016-09-29 20:35:33
Message-ID: CAMkU=1yH_o8fSFTj=2zOOv=cZsq2Narhk6jD5DAu3+NgFbuJHg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Sep 29, 2016 at 11:12 AM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
wrote:

>
>
> 2016-09-29 14:20 GMT+02:00 Sven R. Kunze <srkunze(at)mail(dot)de>:
>
>> On 23.09.2016 11:00, Pavel Stehule wrote:
>>
>> 2016-09-23 8:35 GMT+02:00 Sven R. Kunze <srkunze(at)mail(dot)de>:
>>
>>> I was wondering: would it be possible for PostgreSQL to rewrite the
>>> query to generate the UNION (or subquery plan if it's also fast) on it's
>>> own?
>>>
>>
>> It depends on real data. On your specific data the UNION variant is
>> pretty fast, on different set, the UNION can be pretty slow. It is related
>> to difficult OR predicate estimation.
>>
>>
>> I figure that the UNION is fast if the sub-results are small (which they
>> are in our case). On the contrary, when they are huge, the OUTER JOIN
>> variant might be preferable.
>>
>>
>> Is there something I can do to help here?
>>
>> Or do you think it's naturally application-dependent and thus should be
>> solved with application logic just as we did?
>>
>
> In ideal world then plan should be independent on used form. The most
> difficult is safe estimation of OR predicates. With correct estimation the
> transformation to UNION form should not be necessary I am think.
>

I don't think it is an estimation issue. If it were, the planner would
always choose the same inefficient plan (providing the join collapse
limits, etc. don't come into play, which I don't think they do here) for
all the different ways of writing the query.

Since that is not happening, the planner must not be able to prove that the
different queries are semantically identical to each other, which means
that it can't pick the other plan no matter how good the estimates look.

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Sven R. Kunze 2016-09-30 11:22:44 Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause
Previous Message Jeff Janes 2016-09-29 20:26:09 Re: Multiple-Table-Spanning Joins with ORs in WHERE Clause