Re: Performance issues with custom functions

From: "Edward Di Geronimo Jr(dot)" <edigeronimo(at)xtracards(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance issues with custom functions
Date: 2005-10-27 14:35:41
Message-ID: 4360E5BD.2040508@xtracards.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Tom Lane wrote:

>This is fairly hard to read ... it would help a lot if you had shown the
>view definitions that the query relies on, so that we could match up the
>plan elements with the query a bit better.
>
>
I wasn't sure how helpful it would be. Here they are:

create view development.network as
select * from odbc_select('amsterdam', 'bob.dbo.network') as (
network_id varchar ,
status_cd varchar ,
name varchar ,
network_action varchar ,
physical_type_cd varchar ,
service_type_cd varchar ,
parent_network_id varchar ,
commission_network_id varchar ,
rep_id varchar ,
tax_id varchar ,
url varchar ,
entry_method_cd varchar ,
entry_individual_type_cd varchar ,
entry_individual_id varchar ,
service varchar (30),
cost_routine varchar (150),
commission_rate numeric(5, 5) ,
directory_number varchar (11),
search_url varchar (200),
member_rate numeric(15, 2) ,
free_months numeric(18, 0) ,
eligibility_hound varchar (60)
)

create view development.network_state as
select * from odbc_select('amsterdam', 'bob.dbo.network_state') as (
network_id varchar,
state_cd varchar,
product varchar (100) ,
status_cd varchar,
entry_method_cd varchar,
entry_individual_type_cd varchar,
entry_individual_id varchar,
logo_id int ,
from_date timestamp ,
thru_date timestamp
)

create view development.xlat_tbl as
select * from odbc_select('amsterdam', 'xlat_tbl') as (
field_name varchar ,
field_value varchar ,
status_cd varchar ,
descr varchar ,
descrshort varchar ,
entry_method_cd varchar ,
entry_individual_type_cd varchar ,
entry_individual_id varchar
)

>However, I'm thinking the problem is with this IN clause:
>
>
>
>> where pl.network_id in (select ns.network_id
>> from development.network_state ns
>> where ns.from_date < current_time
>> and (ns.thru_date > current_time or
>>ns.thru_date is null)
>> and (ns.state_cd = pl.state_cd or ns.state_cd='')
>> )
>>
>>
>
>Because the sub-SELECT references pl.state_cd (an outer variable
>reference), there's no chance of optimizing this into a join-style IN.
>So the sub-SELECT has to be re-executed for each row of the outer query.
>
>BTW, it's not apparent to me that your "flattened" query gives the same
>answers as the original. What if a pl row can join to more than one
>row of the ns output?
>
>
Well, I guess you are right. As far as the database can tell, the
queries aren't the same. In practice, they are. network_state is
essentially tracking our contract dates with different discount
healthcare networks. from_date and thru_date track the timeframe we use
that network, with thru_date being null for the current networks. Some
networks cover all states, in which case state_cd is an empty string.
Otherwise, there will be a row per state covered. I can't think of any
way to enforce data integrity on this other than maybe via triggers. Is
there any way to make things more clear to the database (both in general
and on the postgres end of this) ? At the moment, the SQL Server table
has the primary key defined as (network_id, state_cd, product), which is
ok for now, but I'm realizing going forward could be an issue if we ever
stopped using a network in a state and then went back to it.

I guess the next question is, is there any way I can give postgres hints
about what constraints exist on the data in these views?

Ed

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-10-27 14:38:02 Re: Perfomance of views
Previous Message PFC 2005-10-27 14:33:51 Re: browsing table with 2 million records