Horribly slow query/ sequential scan

From: "Gregory S(dot) Williamson" <gsw(at)globexplorer(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Horribly slow query/ sequential scan
Date: 2007-01-09 11:55:56
Message-ID: 71E37EF6B7DCC1499CEA0316A256832802B3EB8A@loki.wc.globexplorer.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

This is a query migrated from postgres. In postgres it runs about 10,000 times *slower* than on informix on somewhat newer hardware. The problem is entirely due to the planner. This PostgreSQL 8.1.4 on linux, 2 gigs of ram.

The table:
Table "reporting.bill_rpt_work"
Column | Type | Modifiers
---------------+-----------------------+-----------
report_id | integer |
client_id | character varying(10) |
contract_id | integer | not null
rate | numeric | not null
appid | character varying(10) | not null
userid | text | not null
collection_id | integer | not null
client_name | character varying(60) |
use_sius | integer | not null
is_subscribed | integer | not null
hits | numeric | not null
sius | numeric | not null
total_amnt | numeric | not null
royalty_total | numeric |
Indexes:
"billrptw_ndx" UNIQUE, btree (report_id, client_id, contract_id, rate, appid, userid, collection_id)
"billrpt_cntrct_ndx" btree (report_id, contract_id, client_id)
"billrpt_collid_ndx" btree (report_id, collection_id, client_id, contract_id)
Foreign-key constraints:
"$1" FOREIGN KEY (report_id) REFERENCES billing_reports(report_id)
"$2" FOREIGN KEY (client_id) REFERENCES "work".clients(client_id)

The query:
explain analyze select
w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
sum(w.sius) * w.rate AS BYIUS
from bill_rpt_work w, billing_reports b
where w.report_id in
(select b.report_id from billing_reports where b.report_s_date = '2006-09-30')
and (w.client_id = '227400001' or w.client_id = '2274000010')
group by 1,2,3
order by 1,2,3;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
------------------------------
GroupAggregate (cost=298061335.44..298259321.62 rows=26 width=58) (actual time=372213.673..372213.689 rows=2 loops=1)
-> Sort (cost=298061335.44..298083333.83 rows=8799354 width=58) (actual time=372213.489..372213.503 rows=37 loops=1)
Sort Key: w.appid, w.rate, w.is_subscribed
-> Nested Loop (cost=0.00..296121313.45 rows=8799354 width=58) (actual time=286628.486..372213.053 rows=37 loops=1)
Join Filter: (subplan)
-> Seq Scan on bill_rpt_work w (cost=0.00..85703.20 rows=11238 width=62) (actual time=1.239..1736.746 rows=61020 loops=1)
Filter: (((client_id)::text = '227400001'::text) OR ((client_id)::text = '2274000010'::text))
-> Seq Scan on billing_reports b (cost=0.00..29.66 rows=1566 width=8) (actual time=0.001..0.879 rows=1566 loops=61020)
SubPlan
-> Result (cost=0.00..29.66 rows=1566 width=0) (actual time=0.000..0.002 rows=1 loops=95557320)
One-Time Filter: ($1 = '2006-09-30'::date)
-> Seq Scan on billing_reports (cost=0.00..29.66 rows=1566 width=0) (actual time=0.001..0.863 rows=1565 loops=61020)
Total runtime: 372214.085 ms

Informix uses report id/client id as an index, thus eliminating a huge number of rows. The table has 2280545 rows currently; slightly fewer when the above analyze was run. Informix has about 5 times as much data.

select count(*) from bill_rpt_work where report_id in (select report_id from billing_reports where report_s_date = '2006-09-30') and (client_id = '227400001' or client_id = '2274000010');
count
-------
37
(1 row)

So scanning everything seems particularly senseless.

I had some success adding client id and report id to the initial select list, but that causes all sorts of problems in calling procedures that expect different data grouping.

Any suggestion would be welcome because this is a horrible show stopper.

Thanks,

Greg Williamson
DBA
GlobeXplorer LLC

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim C. Nasby 2007-01-09 12:33:15 Re: High update activity, PostgreSQL vs BigDBMS
Previous Message Merlin Moncure 2007-01-09 00:35:31 Re: table partioning performance