Skip site navigation (1) Skip section navigation (2)

Poor performance using CTE

From: David Greco <David_Greco(at)harte-hanks(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Poor performance using CTE
Date: 2012-11-14 15:23:15
Message-ID: 187F6C10D2931A4386EE8E58E13857F61291C600@BY2PRD0811MB415.namprd08.prod.outlook.com (view raw or flat)
Thread:
Lists: pgsql-performance
Have a query using a CTE that is performing very poorly. The equivalent query against the same data in an Oracle database runs in under 1 second, in Postgres  it takes 2000 seconds.

The smp_pkg.get_invoice_charges queries fedexinvoices for some data and normalizes it into a SETOF some record type. It is declared to be STABLE. Fedexinvoices consists of about 1.3M rows of medium width. Fedexinvoices.id is the primary key on that table, and trim(fedexinvoices.trackno) is indexed via the function trim.

The plan for the equivalent query in Oracle is much smaller and simpler. No sequential (or full table) scans on fedexinvoices.



WITH charges as (
                SELECT fi2.id, smp_pkg.get_invoice_charges(fi2.id) charge_info from fedexinvoices fi2
)
select fedexinvoices.* from
fedexinvoices
inner join charges on charges.id = fedexinvoices.id AND (charges.charge_info).charge_name IN ('ADDRESS CORRECTION CHARGE','ADDRESS CORRECTION')
where
trim(fedexinvoices.trackno)='799159791643'
;

Explain Analyze output, I abbreviated some of the column lists for brevity:

Nested Loop  (cost=457380.38..487940.77 rows=1 width=1024) (actual time=1978019.858..1978019.858 rows=0 loops=1)
  Output: fedexinvoices.id, .........
  Join Filter: (fedexinvoices.id = charges.id)
  Buffers: shared hit=20387611, temp written=94071
  CTE charges
    ->  Seq Scan on hits.fedexinvoices fi2  (cost=0.00..457380.38 rows=1350513 width=8) (actual time=0.613..1964632.763 rows=9007863 loops=1)
          Output: fi2.id, smp_pkg.get_invoice_charges(fi2.id, NULL::character varying)
          Buffers: shared hit=20387606
  ->  Index Scan using fedexinvoices_trim_track_idx on hits.fedexinvoices  (cost=0.00..5.46 rows=1 width=1024) (actual time=0.024..0.026 rows=1 loops=1)
        Output: fedexinvoices.id, .........
        Index Cond: (btrim((fedexinvoices.trackno)::text) = '799159791643'::text)
        Buffers: shared hit=5
  ->  CTE Scan on charges  (cost=0.00..30386.54 rows=13471 width=8) (actual time=1978019.827..1978019.827 rows=0 loops=1)
        Output: charges.id, charges.charge_info
        Filter: (((charges.charge_info).charge_name)::text = ANY ('{"ADDRESS CORRECTION CHARGE","ADDRESS CORRECTION"}'::text[]))
        Buffers: shared hit=20387606, temp written=94071
Total runtime: 1978214.743 ms


Responses

pgsql-performance by date

Next:From: Andrew DunstanDate: 2012-11-14 15:50:58
Subject: Re: Poor performance using CTE
Previous:From: Jon NelsonDate: 2012-11-14 14:41:45
Subject: Re: postgres 8.4, COPY, and high concurrency

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group