Re: Query became very slow after 9.6 -> 10 upgrade

From: Dmitry Shalashov <skaurus(at)gmail(dot)com>
To: Alex Ignatov <a(dot)ignatov(at)postgrespro(dot)ru>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query became very slow after 9.6 -> 10 upgrade
Date: 2017-11-22 14:51:22
Message-ID: CAKPeCUH9S7-Vz9SY4mjVxQ_4pHRvosVV+BZZXCkU3UUAykjOnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

I believe that with SSD disks random_page_cost should be very cheap, but
here you go (I decided to settle on EXPLAIN without ANALYZE this time, is
this is good enough?):

Sort (cost=18410.26..18410.27 rows=1 width=63)
Sort Key: (sum(st.shows)) DESC
CTE a
-> Index Scan using adroom_active_idx on adroom (cost=0.28..301.85
rows=1 width=233)
Index Cond: ((CURRENT_TIMESTAMP >= start_ts) AND
(CURRENT_TIMESTAMP <= stop_ts))
Filter: (((groups -> 0) ->> 'provider'::text) ~
'^target_mail_ru'::text)
CTE b
-> HashAggregate (cost=1.28..1.29 rows=1 width=40)
Group Key: a.provider, a.id, unnest(a.domain_ids)
-> ProjectSet (cost=0.00..0.53 rows=100 width=40)
-> CTE Scan on a (cost=0.00..0.02 rows=1 width=68)
-> GroupAggregate (cost=18107.09..18107.11 rows=1 width=63)
Group Key: b.provider, d.domain
-> Sort (cost=18107.09..18107.09 rows=1 width=55)
Sort Key: b.provider, d.domain
-> Nested Loop (cost=1.00..18107.08 rows=1 width=55)
Join Filter: ((b.id = st.adroom_id) AND (b.domain_id =
st.domain_id))
-> Nested Loop (cost=0.42..8.46 rows=1 width=59)
-> CTE Scan on b (cost=0.00..0.02 rows=1
width=40)
-> Index Scan using domains_pkey on domains d
(cost=0.42..8.44 rows=1 width=19)
Index Cond: (id = b.domain_id)
-> Index Scan using
adroom_stat_day_adroom_id_domain_id_url_id_is_wlabp_idx on adroom_stat st
(cost=0.58..180
91.26 rows=491 width=16)
Index Cond: ((day >= date_trunc('day'::text,
(CURRENT_TIMESTAMP - '7 days'::interval))) AND (day <=
date_trunc('day'::text, CURRENT_TIMESTAMP)) AND (domain_id = d.id))

Dmitry Shalashov, relap.io & surfingbird.ru

2017-11-22 17:44 GMT+03:00 Alex Ignatov <a(dot)ignatov(at)postgrespro(dot)ru>:

> Here is my select right after initdb:
>
>
>
> postgres=# select name,setting from pg_settings where name like '%_cost';
>
> name | setting
>
> ----------------------+---------
>
> cpu_index_tuple_cost | 0.005
>
> cpu_operator_cost | 0.0025
>
> cpu_tuple_cost | 0.01
>
> parallel_setup_cost | 1000
>
> parallel_tuple_cost | 0.1
>
> random_page_cost | 4
>
> seq_page_cost | 1
>
>
>
>
>
> Can you generate plan with random_page_cost = 4?
>
>
>
>
>
> --
>
> Alex Ignatov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>
> *From:* Dmitry Shalashov [mailto:skaurus(at)gmail(dot)com]
> *Sent:* Wednesday, November 22, 2017 5:29 PM
> *To:* Alex Ignatov <a(dot)ignatov(at)postgrespro(dot)ru>
> *Cc:* pgsql-performance(at)postgresql(dot)org
> *Subject:* Re: Query became very slow after 9.6 -> 10 upgrade
>
>
>
> Sure, here it goes:
>
>
>
> name | setting
>
> ----------------------+---------
>
> cpu_index_tuple_cost | 0.005
>
> cpu_operator_cost | 0.0025
>
> cpu_tuple_cost | 0.01
>
> parallel_setup_cost | 1000
>
> parallel_tuple_cost | 0.1
>
> random_page_cost | 1
>
> seq_page_cost | 1
>
>
>
>
> Dmitry Shalashov, relap.io & surfingbird.ru
>
>
>
> 2017-11-22 17:24 GMT+03:00 Alex Ignatov <a(dot)ignatov(at)postgrespro(dot)ru>:
>
> Hello!
>
> What about :
>
> select name,setting from pg_settings where name like '%_cost';
>
>
>
> --
>
> Alex Ignatov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>
>
>
>
> *From:* Dmitry Shalashov [mailto:skaurus(at)gmail(dot)com]
> *Sent:* Wednesday, November 22, 2017 5:14 PM
> *To:* pgsql-performance(at)postgresql(dot)org
> *Subject:* Query became very slow after 9.6 -> 10 upgrade
>
>
>
> Hi!
>
>
>
> I've seen few letters like this on mailing list and for some reason
> thought that probably it won't happen to us, but here I am lol.
>
>
>
> It's "nestloop hits again" situation.
>
>
>
> I'll try to provide plan from 9.6 later, but right now I have only plan
> from 10.1.
>
>
>
> Query: https://pastebin.com/9b953tT7
>
> It was running under 3 seconds (it's our default timeout) and now it runs
> for 12 minutes.
>
>
>
> \d adroom: https://pastebin.com/vBrPGtxT (3800 rows)
>
> \d adroom_stat: https://pastebin.com/CkBArCC9 (47mln rows, 1.5mln satisfy
> condition on day column)
>
> \d domains: https://pastebin.com/65hk7YCm (73000 rows)
>
>
>
> All three tables are analyzed.
>
>
>
> EXPLAIN ANALYZE: https://pastebin.com/PenHEgf0
>
> EXPLAIN ANALYZE with nestloop off: https://pastebin.com/zX35CPCV (0.8s)
>
>
>
> Regarding server parameters - it's a mighty beast with 2x E5-2630 v3,
> 192Gb of RAM and two very, very fast NVME server class SSD's in RAID1.
>
>
>
> What can I do with it?
>
>
>
>
>
> Also maybe this will be useful:
>
>
>
> 1st query, runs under 1ms
>
> select title, id, groups->0->>'provider' provider, domain_ids from adroom
> where groups->0->>'provider' ~ '^target_mail_ru' and not is_paused and
> current_timestamp between start_ts and stop_ts
>
>
>
> 2nd query that uses 1st one, runs under 3 ms
>
> select distinct unnest(domain_ids) FROM (select title, id,
> groups->0->>'provider' provider, domain_ids from adroom where
> groups->0->>'provider' ~ '^target_mail_ru' and not is_paused and
> current_timestamp between start_ts and stop_ts) t1
>
>
>
> 3rd query which returns 1.5mln rows, runs in about 0.6s
>
> SELECT adroom_id, domain_id, shows, clicks FROM adroom_stat WHERE day
> between date_trunc('day', current_timestamp - interval '1 week') and
> date_trunc('day', current_timestamp)
>
>
>
> BUT if I'll add to 3rd query one additional condition, which is basically
> 2nd query, it will ran same 12 minutes:
>
> SELECT adroom_id, domain_id, shows, clicks FROM adroom_stat WHERE day
> between date_trunc('day', current_timestamp - interval '1 week') and
> date_trunc('day', current_timestamp) AND domain_id IN (select distinct
> unnest(domain_ids) FROM (select title, id, groups->0->>'provider' provider,
> domain_ids from adroom where groups->0->>'provider' ~ '^target_mail_ru' and
> not is_paused and current_timestamp between start_ts and stop_ts) t1)
>
>
>
> Plan of last query:
>
> Nested Loop (cost=88.63..25617.31 rows=491 width=16) (actual
> time=3.512..733248.271 rows=1442797 loops=1)
>
> -> HashAggregate (cost=88.06..88.07 rows=1 width=4) (actual
> time=3.380..13.561 rows=3043 loops=1)
>
> Group Key: (unnest(adroom.domain_ids))
>
> -> HashAggregate (cost=88.03..88.04 rows=1 width=4) (actual
> time=2.199..2.607 rows=3043 loops=1)
>
> Group Key: unnest(adroom.domain_ids)
>
> -> ProjectSet (cost=0.28..87.78 rows=100 width=4) (actual
> time=0.701..1.339 rows=3173 loops=1)
>
> -> Index Scan using adroom_active_idx on adroom
> (cost=0.28..87.27 rows=1 width=167) (actual time=0.688..1.040 rows=4
> loops=1)
>
> Index Cond: ((CURRENT_TIMESTAMP >= start_ts)
> AND (CURRENT_TIMESTAMP <= stop_ts))
>
> Filter: (((groups -> 0) ->> 'provider'::text) ~
> '^target_mail_ru'::text)
>
> Rows Removed by Filter: 41
>
> -> Index Scan using adroom_stat_day_adroom_id_domain_id_url_id_is_wlabp_idx
> on adroom_stat (cost=0.58..25524.33 rows=491 width=16) (actual
> time=104.847..240.846 rows=474 loops=3043)
>
> Index Cond: ((day >= date_trunc('day'::text, (CURRENT_TIMESTAMP -
> '7 days'::interval))) AND (day <= date_trunc('day'::text,
> CURRENT_TIMESTAMP)) AND (domain_id = (unnest(adroom.domain_ids))))
>
> Planning time: 1.580 ms
>
> Execution time: 733331.740 ms
>
>
>
> Dmitry Shalashov, relap.io & surfingbird.ru
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2017-11-22 15:07:26 Re: Query became very slow after 9.6 -> 10 upgrade
Previous Message Alex Ignatov 2017-11-22 14:44:18 RE: Query became very slow after 9.6 -> 10 upgrade

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2017-11-22 14:57:13 Re: Bad estimates (DEFAULT_UNK_SEL)
Previous Message Don Seiler 2017-11-22 14:49:04 Re: Bad estimates