Re: Query became very slow after 9.6 -> 10 upgrade

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dmitry Shalashov <skaurus(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query became very slow after 9.6 -> 10 upgrade
Date: 2017-11-22 15:19:23
Message-ID: 13426.1511363963@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

Dmitry Shalashov <skaurus(at)gmail(dot)com> writes:
> BUT if I'll add to 3rd query one additional condition, which is basically
> 2nd query, it will ran same 12 minutes:
> SELECT adroom_id, domain_id, shows, clicks FROM adroom_stat WHERE day
> between date_trunc('day', current_timestamp - interval '1 week') and
> date_trunc('day', current_timestamp) AND domain_id IN (select distinct
> unnest(domain_ids) FROM (select title, id, groups->0->>'provider' provider,
> domain_ids from adroom where groups->0->>'provider' ~ '^target_mail_ru' and
> not is_paused and current_timestamp between start_ts and stop_ts) t1)

> Plan of last query:
> Nested Loop (cost=88.63..25617.31 rows=491 width=16) (actual
> time=3.512..733248.271 rows=1442797 loops=1)
> -> HashAggregate (cost=88.06..88.07 rows=1 width=4) (actual
> time=3.380..13.561 rows=3043 loops=1)
> Group Key: (unnest(adroom.domain_ids))
> -> HashAggregate (cost=88.03..88.04 rows=1 width=4) (actual
> time=2.199..2.607 rows=3043 loops=1)
> Group Key: unnest(adroom.domain_ids)
> -> ProjectSet (cost=0.28..87.78 rows=100 width=4) (actual
> time=0.701..1.339 rows=3173 loops=1)

Hm, seems like the problem is that that lower HashAggregate is estimated
as having only one row out, which is way off and doesn't sound like a
particularly bright default estimate anyway. (And then we're doing an
additional HashAggregate on top of that, which is useless --- implies
that something isn't realizing that the output of the SELECT DISTINCT
is already distinct.)

I'm suspicious that this is breakage from the work that was done on
targetlist SRFs in v10, but that's just a guess at this point.

Trying simple test queries involving WHERE x IN (SELECT DISTINCT
unnest(foo) FROM ...), I do not see a behavior like this, so there is some
not-very-obvious contributing factor in your situation. Can you put
together a self-contained test case that produces a bogus one-row
estimate? Extra points if it produces duplicate HashAgg steps.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Artur ZajÄ…c 2017-11-22 16:52:30 RE: Bad estimates
Previous Message Laurenz Albe 2017-11-22 15:09:54 Re: Bad estimates

Browse pgsql-hackers by date

  From Date Subject
Next Message Arthur Zakirov 2017-11-22 15:23:36 Re: [HACKERS] Bug in to_timestamp().
Previous Message Tomas Vondra 2017-11-22 15:07:26 Re: Query became very slow after 9.6 -> 10 upgrade