Re: TPC-H Q20 from 1 hour to 19 hours!

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TPC-H Q20 from 1 hour to 19 hours!
Date: 2017-03-29 19:00:01
Message-ID: CA+TgmoZJCKAp70A8kFiq5ZfZAMiA99g0wN60sbqBVR+LdMz+4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 6, 2017 at 1:22 AM, Rafia Sabih
<rafia(dot)sabih(at)enterprisedb(dot)com> wrote:
> This is to bring to notice a peculiar instance I found recently while
> running TPC-H benchmark queries. Q20 of the benchmark took 19 hours to
> complete ...

That's bad.

> It is clear that selectivity estimations are really bad in this case
> particularly at node,
> -> Merge Join (cost=52959586.72..60024468.82 rows=85 width=16)
> (actual time=1525322.753..2419045.641 rows=1696742 loops=1)
> Merge Cond: ((lineitem.l_partkey =
> partsupp.ps_partkey) AND (lineitem.l_suppkey = partsupp.ps_suppkey))
> Join Filter:
> ((partsupp.ps_availqty)::numeric > ((0.5 * sum(lineitem.l_quantity))))
> Rows Removed by Join Filter: 3771

So, the selectivity estimation here is bad both before and after Tom's
commit, but it's significantly worse after (actual value 1696742, old
estimate 3771, new estimate 85).

> Still this puzzled me as during earlier runs of this benchmark I never
> encountered such prolonged running times. On further investigation I
> found that on reverting the commit
> 7fa93eec4e0c9c3e801e3c51aa4bae3a38aaa218
> Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Date: Sat Dec 17 15:28:54 2016 -0500
> Fix FK-based join selectivity estimation for semi/antijoins.

I don't think the problem originates at the Merge Join, though,
because the commit says that at is fixing semi and anti-join estimates
- this is a plain inner join, so in theory it shouldn't change.
However, it's a bit hard for me to piece through these plans, the
formatting kind of got messed up - things are wrapped. Could you
possibly attach the plans as attachments?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-03-29 19:04:50 Re: Schedule and Release Management Team for PG10
Previous Message Alvaro Herrera 2017-03-29 18:59:03 Re: [PATCH] Reduce src/test/recovery verbosity