From: | "Andrew Zakharov" <Andrew898(at)mail(dot)ru> |
---|---|
To: | <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Wrong estimations and NL Anti join poor performance |
Date: | 2019-11-18 17:35:29 |
Message-ID: | 044a01d59e36$920e7020$b62b5060$@mail.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hello colleagues -
The problem description:
We're moving from 9.6 to 11.5. There is a SQL code that never ends in 11.5
but works fine in 9.6. The main cause is the optimizer considers of using NL
Anti join instead of Merge in 9.6. And the root cause - wrong estimation
while self-joining.
System environment:
CentOS Linux 3.10.0-1062.4.1.el7.x86_64 x86_64
MemTotal: 16266644 kB
Intel(R) Xeon(R) CPU E7-8867 v3 @ 2.50GHz
HDD - unknown
PostgreSQL:
PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623
(Red Hat 4.8.5-36), 64-bit
shared_buffers = 1GB
huge_pages = on
temp_buffers = 1GB
max_prepared_transactions = 128
max_connections = 500
work_mem = 256MB
maintenance_work_mem = 1024MB
autovacuum_work_mem = 512MB
max_worker_processes = 100
max_parallel_workers_per_gather = 0 # changing this value to any others
takes no effect for issue resolving
max_parallel_workers = 8
checkpoint_timeout = 30min
max_wal_size = 32GB
min_wal_size = 8GB
checkpoint_completion_target = 0.9
enable_nestloop = on # off value fixes the issue but this is
wrong way
random_page_cost = 4.0
effective_cache_size = 4GB
default_statistics_target = 2000
Main script:
-- preparation
-- this table is reverted tree with tree keys position_uuid,
parent_position_uuid
create temporary table tmp_nsi_klp on commit drop as
select
k1.gid,
k1.smnn_gid,
k1.position_uuid,
p.parent_position_uuid,
k1.order_number,
k1.date_start,
k1.date_end,
k1.is_active,
coalesce(p.is_fake_series, false) as is_fake_series
from
nsi_klp k1
left join (select gid, unnest(parent_position_uuid) as
parent_position_uuid, coalesce(array_length(parent_position_uuid, 1),0) > 1
as is_fake_series from nsi_klp where version_esklp = '2.0') p using (gid)
where
k1.version_esklp = '2.0'
;
create unique index tmp_nsi_klp_ui on tmp_nsi_klp(gid,
parent_position_uuid);
analyze tmp_nsi_klp;
-- working set (!!This SQL never ends in 11.5 now)
create temporary table tmp_klp_replace on commit drop as
select distinct on (klp_gid)
*
from (
select
k2.gid as klp_gid,
k2.smnn_gid as klp_smnn_gid,
k2.position_uuid as klp_position_uuid,
k2.order_number as klp_order_number,
k2.is_active as klp_is_active,
k1.gid as klp_child_gid,
k1.smnn_gid as klp_child_smnn_gid,
k1.position_uuid as klp_child_position_uuid,
k1.order_number as klp_child_order_number,
k1.is_active as klp_child_is_active
from
tmp_nsi_klp k1
join tmp_nsi_klp k2 on (k2.position_uuid =
k1.parent_position_uuid)
union all
select
k1.gid as klp_gid,
k1.smnn_gid as klp_smnn_gid,
k1.position_uuid as klp_position_uuid,
k1.order_number as klp_order_number,
k1.is_active as klp_is_active,
null as klp_child_gid,
null as klp_child_smnn_gid,
null as klp_child_position_uuid,
null as klp_child_order_number,
null as klp_child_is_active
from
tmp_nsi_klp k1
left join tmp_nsi_klp k2 on (k1.position_uuid =
k2.parent_position_uuid)
left join (select position_uuid from tmp_nsi_klp where not
is_fake_series group by position_uuid having count(1) > 1) klp_series on
(klp_series.position_uuid = k1.position_uuid)
where
-- not exists(select 1 from tmp_nsi_klp k2 where k1.position_uuid =
k2.parent_position_uuid)
k2.gid is null -- none referenced
and klp_series.position_uuid is null -- klp series with the same
position_uuid
) a
order by
klp_gid,
klp_order_number desc
;
Characteristics of source table - tmp_nsi_klp:
create table tmp_nsi_klp (
gid uuid NULL, -- not null by the fact
smnn_gid uuid NULL, -- not null by the fact
position_uuid uuid NULL, -- not null by the fact
parent_position_uuid uuid NULL,
order_number int8 NULL,
date_start timestamp NULL, -- not null by the fact
date_end timestamp NULL,
is_active bool NULL, -- not null by the fact
is_fake_series bool NULL -- not null by the fact
);
Rows: 237279
Cols stats:
https://docs.google.com/spreadsheets/d/1Ocbult13kZ64vK9nHt-_BV3EENK_ZSHFTAmR
ZLISUIE/edit?usp=sharing
Execution plans for problematic query - working set "create temporary table
tmp_klp_replace on commit drop as":
On 11.5 (option nestloop enabled):
https://explain.depesz.com/s/pIzd
Exec time: never finished
On 9.6 (PostgreSQL 9.6.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-11), 64-bit):
https://explain.depesz.com/s/sO0G
Exec time: ~1 sec
On 11.5 (option nestloop disabled):
https://explain.depesz.com/s/eYzk
Exec time: ~1,5 sec
Construction "not exists(select 1 from tmp_nsi_klp k2 where k1.position_uuid
= k2.parent_position_uuid)" works perfectly but there are lots of similar
constructions in a code made for checking inclusion of data. Thus no chances
to change existing code to another using not exists construction. Are there
any options to bring initial statement to life and keep the server option
nestloop enable?
Give me a clue, pls.
Thanks in advance.
Andrew.
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2019-11-18 18:47:58 | Re: [HACKERS] proposal: schema variables |
Previous Message | Tomas Vondra | 2019-11-18 15:10:33 | Re: Out of memory error on automatic vacuum |