PG8.2.1 choosing slow seqscan over idx scan

From: "Jeremy Haile" <jhaile(at)fastmail(dot)fm>
To: pgsql-performance(at)postgresql(dot)org
Subject: PG8.2.1 choosing slow seqscan over idx scan
Date: 2007-01-16 21:23:00
Message-ID: 1168982580.8275.1169588211@webmail.messagingengine.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Running PostgreSQL 8.2.1 on Win32. The query planner is choosing a seq
scan over index scan even though index scan is faster (as shown by
disabling seqscan). Table is recently analyzed and row count estimates
seem to be in the ballpark.

Another tidbit - I haven't done a "vacuum full" ever, although I "vacuum
analyze" regularly (and autovacuum). I recently noticed that the PG
data drive is 40% fragmented (NTFS). Could that be making the seqscan
slower than it should be? Regardless of the fragmentations affect on
performance, is the query planner making a good decision here?

SOME CONFIGURATION PARAMS
effective_cache_size=1000MB
random_page_cost=3
default_statistics_target=50
shared_buffers=400MB
temp_buffers=10MB
work_mem=10MB
checkpoint_segments=12

QUERY
select merchant_dim_id,
dcms_dim_id,
sum(success) as num_success,
sum(failed) as num_failed,
count(*) as total_transactions,
(sum(success) * 1.0 / count(*)) as success_rate
from transaction_facts
where transaction_date >= '2007-1-16'
and transaction_date < '2007-1-16 15:20'
group by merchant_dim_id, dcms_dim_id;

EXPLAIN ANALYZE (enable_seqscan=true)
HashAggregate (cost=339573.01..340089.89 rows=15904 width=16) (actual
time=140606.593..140650.573 rows=10549 loops=1)
-> Seq Scan on transaction_facts (cost=0.00..333928.25 rows=322558
width=16) (actual time=19917.957..140036.910 rows=347434 loops=1)
Filter: ((transaction_date >= '2007-01-16 00:00:00'::timestamp
without time zone) AND (transaction_date < '2007-01-16
15:20:00'::timestamp without time zone))
Total runtime: 140654.813 ms

EXPLAIN ANALYZE (enable_seqscan=false)
HashAggregate (cost=379141.53..379658.41 rows=15904 width=16) (actual
time=3720.838..3803.748 rows=10549 loops=1)
-> Bitmap Heap Scan on transaction_facts (cost=84481.80..373496.76
rows=322558 width=16) (actual time=244.568..3133.741 rows=347434
loops=1)
Recheck Cond: ((transaction_date >= '2007-01-16
00:00:00'::timestamp without time zone) AND (transaction_date <
'2007-01-16 15:20:00'::timestamp without time zone))
-> Bitmap Index Scan on transaction_facts_transaction_date_idx
(cost=0.00..84401.16 rows=322558 width=0) (actual
time=241.994..241.994 rows=347434 loops=1)
Index Cond: ((transaction_date >= '2007-01-16
00:00:00'::timestamp without time zone) AND
(transaction_date < '2007-01-16 15:20:00'::timestamp
without time zone))
Total runtime: 3810.795 ms

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2007-01-16 21:39:07 Re: PG8.2.1 choosing slow seqscan over idx scan
Previous Message Richard Huxton 2007-01-16 19:24:32 Re: Table Size