Re: TPC-H Q20 from 1 hour to 19 hours!

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TPC-H Q20 from 1 hour to 19 hours!
Date: 2017-05-25 11:06:18
Message-ID: 2396b54c-04a8-8c5a-87a1-53d79d10a732@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 5/25/17 6:03 AM, Robert Haas wrote:
> On Thu, Apr 6, 2017 at 4:37 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> Which brings me to the slightly suspicious bit. On 9.5, there's no
>> difference between GROUP and GROUP+LIKE cases - the estimates are exactly
>> the same in both cases. This is true too, but only without the foreign key
>> between "partsupp" and "part", i.e. the two non-grouped relations in the
>> join. And what's more, the difference (1737 vs. 16) is pretty much exactly
>> 100x, which is the estimate for the LIKE condition.
>
> I don't follow this. How does the foreign key between partsupp and
> part change the selectivity of LIKE?
>
>> So it kinda seems 9.5 does not apply this condition for semi-joins, while
>>> =9.6 does that.
>

Well, get_foreign_key_join_selectivity() does handle restrictions when
calculating joinrel size estimate in calc_joinrel_size_estimate(), so
assuming there's some thinko it might easily cause this.

I haven't found any such thinko, but I don't dare to claim I fully
understand what the current version of get_foreign_key_join_selectivity
does :-/

> If 9.5 and prior are ignoring some of the quals, that's bad, but I
> don't think I understand from your analysis why
> 7fa93eec4e0c9c3e801e3c51aa4bae3a38aaa218 regressed anything.
>

It's been quite a bit of time since I looked into this, but I think my
main point was that it's hard to say it's a regression when both the old
and new estimates are so utterly wrong.

I mean, 9.5 estimates 160, 9.6 estimates 18. We might fix the post-9.6
estimate to return the same value as 9.5, and it might fix this
particular query with this particular scale. But in the end it's just
noise considering that the actual value is 120k (so 3 or 4 orders of
magnitude off).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2017-05-25 11:10:00 Re: Alter subscription..SET - NOTICE message is coming for table which is already removed
Previous Message tushar 2017-05-25 10:08:00 Alter subscription..SET - NOTICE message is coming for table which is already removed