Fwd: Help with view performance problem

From: Chris Hoover <revoohc(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Fwd: Help with view performance problem
Date: 2005-07-28 14:38:06
Message-ID: 1d219a6f0507280738791f0ea3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-performance

Does anyone have any suggestions on this? I did not get any response
from the admin list.

Thanks,

Chris

---------- Forwarded message ----------
From: Chris Hoover <revoohc(at)gmail(dot)com>
Date: Jul 27, 2005 12:29 PM
Subject: Re: Help with view performance problem
To: pgsql-admin(at)postgresql(dot)org

I did some more testing, and ran the explain analyze on the problem.
In my session I did a set enable_hashjoin = false and then ran the
analyze. This caused it to use the indexes as I have been expecting
it to do.

Now, how can I get it to use the indexes w/o manipulating the
environment? What make postgresql want to sequentially scan and use a
hash join?

thanks,

Chris

explain analyze with set_hashjoin=false;
prob_db=#explain analyze select * from clm_com;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan clm_com (cost=1057975.45..1169021.26 rows=126910
width=366) (actual time=142307.99..225997.22 rows=1268649 loops=1)
-> Unique (cost=1057975.45..1169021.26 rows=126910 width=366)
(actual time=142307.96..206082.30 rows=1268649 loops=1)
-> Sort (cost=1057975.45..1061148.19 rows=1269095
width=366) (actual time=142307.95..156019.01 rows=1268649 loops=1)
Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
clmcom1.pat_zip, clmcom1.payto_addr_1,
clmcom1.payto_addr_2, clmcom1.payto_city, clmcom1.payto_cntry,
clmcom1.payto_f_name, clmcom1.payto_m_name, clmcom1.payto_state,
clmcom1.payto_zip, clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
-> Merge Join (cost=0.00..565541.46 rows=1269095
width=366) (actual time=464.89..130638.06 rows=1268649 loops=1)
Merge Cond: ("outer".inv_nbr = "inner".inv_nbr)
Join Filter: ("outer".inv_qfr = "inner".inv_qfr)
-> Index Scan using clmcom1_inv_nbr_iview_idx on
clmcom1 (cost=0.00..380534.32 rows=1269095 width=270) (actual
time=0.27..82159.37 rows=1268649 loops=1)
-> Index Scan using clmcom2_inv_nbr_iview_idx on
clmcom2 (cost=0.00..159636.25 rows=1271198 width=96) (actual
time=464.56..21774.02 rows=1494019 loops=1)
Total runtime: 227369.39 msec
(10 rows)

On 7/27/05, Chris Hoover <revoohc(at)gmail(dot)com> wrote:
> I am having a problem with a view on one of my db's. This view is
> trying to sequentially can the 2 tables it is accessing. However,
> when I explain the view on most of my other db's (all have the same
> schema's), it is using the indexes. Can anyone please help me
> understand why postgres is choosing to sequenially scan both tables?
>
> Both tables in the view have a primary key defined on inv_nbr,
> inv_qfr. Vacuum and analyze have been run on the tables in question
> to try and make sure stats are up to date.
>
> Thanks,
>
> Chris
> PG - 7.3.4
> RH 2.1
>
>
> Here is the view definition:
> SELECT DISTINCT clmcom1.inv_nbr AS inventory_number,
> clmcom1.inv_qfr AS inventory_qualifier,
> clmcom1.pat_addr_1 AS patient_address_1,
> clmcom1.pat_addr_2 AS patient_address_2,
> clmcom1.pat_city AS patient_city,
> clmcom1.pat_cntry AS patient_country,
> clmcom1.pat_dob AS patient_date_of_birth,
> clmcom1.pat_gender_cd AS patient_gender_code,
> clmcom1.pat_info_pregnancy_ind AS pregnancy_ind,
> clmcom1.pat_state AS patient_state,
> clmcom1.pat_suffix AS patient_suffix,
> clmcom1.pat_zip AS patient_zip_code,
> clmcom1.payto_addr_1 AS payto_address_1,
> clmcom1.payto_addr_2 AS payto_address_2,
> clmcom1.payto_city,
> clmcom1.payto_cntry AS payto_country,
> clmcom1.payto_f_name AS payto_first_name,
> clmcom1.payto_m_name AS payto_middle_name,
> clmcom1.payto_state,
> clmcom1.payto_zip AS payto_zip_code,
> clmcom1.clm_tot_clm_chgs AS total_claim_charge,
> clmcom1.bill_l_name_org AS
> billing_last_name_or_org,
> clmcom1.clm_delay_rsn_cd AS
> claim_delay_reason_code,
> clmcom1.clm_submit_rsn_cd AS
> claim_submit_reason_code,
> clmcom1.payto_l_name_org AS
> payto_last_name_or_org,
> clmcom1.payto_prim_id AS payto_primary_id,
> clmcom1.bill_prim_id AS billing_prov_primary_id,
> clmcom1.clm_tot_ncov_chgs AS total_ncov_charge,
> clmcom2.contract_amt AS contract_amount,
> clmcom2.svc_fac_or_lab_name,
> clmcom2.svc_fac_addr_1 AS svc_fac_address_1,
> clmcom2.svc_fac_addr_2 AS svc_fac_address_2,
> clmcom2.svc_fac_city,
> clmcom2.svc_fac_zip AS svc_fac_zip_code
> FROM (clmcom1 LEFT JOIN clmcom2 ON (((clmcom1.inv_nbr =
> clmcom2.inv_nbr) AND
>
> (clmcom1.inv_qfr = clmcom2.inv_qfr))))
> ORDER BY clmcom1.inv_nbr,
> clmcom1.inv_qfr,
> clmcom1.pat_addr_1,
> clmcom1.pat_addr_2,
> clmcom1.pat_city,
> clmcom1.pat_cntry,
> clmcom1.pat_dob,
> clmcom1.pat_gender_cd,
> clmcom1.pat_info_pregnancy_ind,
> clmcom1.pat_state,
> clmcom1.pat_suffix,
> clmcom1.pat_zip,
> clmcom1.payto_addr_1,
> clmcom1.payto_addr_2,
> clmcom1.payto_city,
> clmcom1.payto_cntry,
> clmcom1.payto_f_name,
> clmcom1.payto_m_name,
> clmcom1.payto_state,
> clmcom1.payto_zip,
> clmcom1.clm_tot_clm_chgs,
> clmcom1.bill_l_name_org,
> clmcom1.clm_delay_rsn_cd,
> clmcom1.clm_submit_rsn_cd,
> clmcom1.payto_l_name_org,
> clmcom1.payto_prim_id,
> clmcom1.bill_prim_id,
> clmcom1.clm_tot_ncov_chgs,
> clmcom2.contract_amt,
> clmcom2.svc_fac_or_lab_name,
> clmcom2.svc_fac_addr_1,
> clmcom2.svc_fac_addr_2,
> clmcom2.svc_fac_city,
> clmcom2.svc_fac_zip;
>
>
> Here is the explain analyze from the problem db:
> prob_db=# explain analyze select * from clm_com;
>
>
>
> QUERY PLAN
>
>
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Subquery Scan clm_com (cost=1039824.35..1150697.61 rows=126712
> width=367) (actual time=311792.78..405819.03 rows=1266114 loops=1)
> -> Unique (cost=1039824.35..1150697.61 rows=126712 width=367)
> (actual time=311792.74..386313.14 rows=1266114 loops=1)
> -> Sort (cost=1039824.35..1042992.16 rows=1267123
> width=367) (actual time=311792.74..338189.48 rows=1266114 loops=1)
> Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
> clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
> clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
> clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
> clmcom1.pat_zip, clmcom1.payto_addr_1,
> clmcom1.payto_addr_2, clmcom1.payto_city, clmcom1.payto_cntry,
> clmcom1.payto_f_name, clmcom1.payto_m_name, clmcom1.payto_state,
> clmcom1.payto_zip, clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
> clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
> clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
> clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
> clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
> clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
> -> Hash Join (cost=132972.78..548171.70 rows=1267123
> width=367) (actual time=16999.32..179359.43 rows=1266114 loops=1)
> Hash Cond: ("outer".inv_nbr = "inner".inv_nbr)
> Join Filter: ("outer".inv_qfr = "inner".inv_qfr)
> -> Seq Scan on clmcom1 (cost=0.00..267017.23
> rows=1267123 width=271) (actual time=0.11..84711.83 rows=1266114
> loops=1)
> -> Hash (cost=111200.82..111200.82 rows=1269582
> width=96) (actual time=16987.45..16987.45 rows=0 loops=1)
> -> Seq Scan on clmcom2
> (cost=0.00..111200.82 rows=1269582 width=96) (actual
> time=0.07..12164.81 rows=1266108 loops=1)
> Total runtime: 407317.47 msec
> (11 rows)
> ~
>
>
>
> Here is the explain analyze from a good db (on the same postgres cluster);
> good_db=# explain analyze select * from clm_com;
>
>
>
>
>
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Subquery Scan clm_com (cost=78780.59..89498.29 rows=12249 width=359)
> (actual time=73045.36..79974.37 rows=122494 loops=1)
> -> Unique (cost=78780.59..89498.29 rows=12249 width=359) (actual
> time=73045.28..78031.99 rows=122494 loops=1)
> -> Sort (cost=78780.59..79086.81 rows=122488 width=359)
> (actual time=73045.28..73362.94 rows=122494 loops=1)
> Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
> clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
> clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
> clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
> clmcom1.pat_zip, clmcom1.payto_addr_1, clmcom1.payto_addr_2,
> clmcom1.payto_city, clmcom1.payto_cntry, clmcom1.payto_f_name,
> clmcom1.payto_m_name, clmcom1.payto_state, clmcom1.payto_zip,
> clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
> clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
> clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
> clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
> clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
> clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
> -> Merge Join (cost=0.00..56945.12 rows=122488
> width=359) (actual time=54.76..71635.65 rows=122494 loops=1)
> Merge Cond: (("outer".inv_nbr = "inner".inv_nbr)
> AND ("outer".inv_qfr = "inner".inv_qfr))
> -> Index Scan using clmcom1_pkey on clmcom1
> (cost=0.00..38645.61 rows=122488 width=267) (actual
> time=25.60..49142.16 rows=122494 loops=1)
> -> Index Scan using clmcom2_pkey on clmcom2
> (cost=0.00..16004.08 rows=122488 width=92) (actual
> time=29.09..19418.94 rows=122494 loops=1)
> Total runtime: 80162.26 msec
> (9 rows)
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Dan Harris 2005-07-28 16:14:32 Re: Fwd: Help with view performance problem
Previous Message Hans Peter Wiedau 2005-07-28 09:50:50 Re: Add a new disk to data directory

Browse pgsql-performance by date

  From Date Subject
Next Message Dan Harris 2005-07-28 16:14:32 Re: Fwd: Help with view performance problem
Previous Message Merlin Moncure 2005-07-28 13:02:18 Re: Finding bottleneck