From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Nagaraj Raj <nagaraj(dot)sf(at)yahoo(dot)com> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Query performance issue |
Date: | 2021-01-22 02:35:14 |
Message-ID: | 20210122023514.GD27167@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Fri, Jan 22, 2021 at 01:53:26AM +0000, Nagaraj Raj wrote:
> Tables ddl are attached in dbfiddle -- Postgres 11 | db<>fiddle
> Postgres 11 | db<>fiddle
> Server configuration is: Version: 10.11RAM - 320GBvCPU - 32 "maintenance_work_mem" 256MB"work_mem" 1GB"shared_buffers" 64GB
> Aggregate (cost=31.54..31.55 rows=1 width=8) (actual time=0.010..0.012 rows=1 loops=1)
> -> Nested Loop (cost=0.00..31.54 rows=1 width=8) (actual time=0.007..0.008 rows=0 loops=1)
> Join Filter: (a.household_entity_proxy_id = c.household_entity_proxy_id)
> -> Nested Loop (cost=0.00..21.36 rows=1 width=16) (actual time=0.006..0.007 rows=0 loops=1)
> Join Filter: (a.individual_entity_proxy_id = b.individual_entity_proxy_id)
> -> Seq Scan on prospect a (cost=0.00..10.82 rows=1 width=16) (actual time=0.006..0.006 rows=0 loops=1)
> Filter: (((last_contacted_anychannel_dttm IS NULL) OR (last_contacted_anychannel_dttm < '2020-11-23 00:00:00'::timestamp without time zone)) AND (shared_paddr_with_customer_ind = 'N'::bpchar) AND (profane_wrd_ind = 'N'::bpchar) AND (tmo_ofnsv_name_ind = 'N'::bpchar) AND (has_individual_address = 'Y'::bpchar) AND (has_last_name = 'Y'::bpchar) AND (has_first_name = 'Y'::bpchar))
> -> Seq Scan on individual_demographic b (cost=0.00..10.53 rows=1 width=8) (never executed)
> Filter: ((tax_bnkrpt_dcsd_ind = 'N'::bpchar) AND (govt_prison_ind = 'N'::bpchar) AND ((cstmr_prspct_ind)::text = 'Prospect'::text))
> -> Seq Scan on household_demographic c (cost=0.00..10.14 rows=3 width=8) (never executed)
> Filter: (((hspnc_lang_prfrnc_cval)::text = ANY ('{B,E,X}'::text[])) OR (hspnc_lang_prfrnc_cval IS NULL))
> Planning Time: 1.384 ms
> Execution Time: 0.206 ms
> 13 rows
It's doing nested loops with estimated rowcount=1, which indicates a bad
underestimate, and suggests that the conditions are redundant or correlated.
Maybe you can handle this with MV stats on the correlated columns:
CREATE STATISTICS prospect_stats (dependencies) ON
shared_paddr_with_customer_ind, profane_wrd_ind, tmo_ofnsv_name_ind, has_individual_address, has_last_name, has_first_name
FROM prospect;
CREATE STATISTICS individual_demographic_stats (dependencies) ON
tax_bnkrpt_dcsd_ind, govt_prison_ind, cstmr_prspct_ind
FROM individual_demographic_stats
ANALYZE prospect, individual_demographic_stats ;
Since it's expensive to compute stats on large number of columns, I'd then
check *which* are correlated and then only compute MV stats on those. This
will show col1=>col2: X where X approaches 1, the conditions are highly
correlated:
SELECT * FROM pg_statistic_ext; -- pg_statistic_ext_data since v12
Also, as a diagnostic tool to get "explain analyze" to finish, you can
SET enable_nestloop=off;
--
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2021-01-23 09:50:29 | Re: proposal: schema variables |
Previous Message | Nagaraj Raj | 2021-01-22 01:53:26 | Query performance issue |