Re: Forcing index scan on query produces 16x faster

From: "Eger, Patrick" <peger(at)automotive(dot)com>
To: "Christian Brink" <cbrink(at)r-stream(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Forcing index scan on query produces 16x faster
Date: 2010-03-18 01:01:16
Message-ID: 1CFD7891521AAB4E8201FB7A78C9D36F05453FD7@mail-001.corp.automotive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I'm running 8.4.2 and have noticed a similar heavy preference for
sequential scans and hash joins over index scans and nested loops. Our
database is can basically fit in cache 100% so this may not be
applicable to your situation, but the following params seemed to help
us:

seq_page_cost = 1.0
random_page_cost = 1.01
cpu_tuple_cost = 0.0001
cpu_index_tuple_cost = 0.00005
cpu_operator_cost = 0.000025
effective_cache_size = 1000MB
shared_buffers = 1000MB

Might I suggest the Postgres developers reconsider these defaults for
9.0 release, or perhaps provide a few sets of tuning params for
different workloads in the default install/docs? The cpu_*_cost in
particular seem to be way off afaict. I may be dead wrong though, fwiw
=)

-----Original Message-----
From: pgsql-performance-owner(at)postgresql(dot)org
[mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Christian
Brink
Sent: Wednesday, March 17, 2010 2:26 PM
To: pgsql-performance(at)postgresql(dot)org
Subject: [PERFORM] Forcing index scan on query produces 16x faster

I am running into a problem with a particular query. The execution plan
cost shows that the Seq Scan is a better bet (cost=54020.49..54020.55)
over the forced index 'enable_seqscan = false'
(cost=1589703.87..1589703.93). But when I run the query both ways I get
a vastly different result (below). It appears not to want to bracket the

salesitems off of the 'id' foreign_key unless I force it.

Is there a way to rewrite or hint the planner to get me the better plan
without resorting to 'enable_seqscan' manipulation (or am I missing
something)?

postream=> select version();
version
------------------------------------------------------------------------
-------------------------------------------------
PostgreSQL 8.0.3 on i386-redhat-linux-gnu, compiled by GCC
i386-redhat-linux-gcc (GCC) 4.0.0 20050505 (Red Hat 4.0.0-4)

postream=> SET enable_seqscan = false;
SET
postream=> EXPLAIN ANALYZE
postream-> SELECT si.group1_id as name, sum(si.qty) as count,
sum(si.amt) as amt
postream-> FROM salesitems si, sales s, sysstrings
postream-> WHERE si.id = s.id
postream-> AND si.group1_id != ''
postream-> AND si.group1_id IS NOT NULL
postream-> AND NOT si.void
postream-> AND NOT s.void
postream-> AND NOT s.suspended
postream-> AND s.tranzdate >= (cast('2010-02-15' as date) +
cast(sysstrings.data as time))
postream-> AND s.tranzdate < ((cast('2010-02-15' as date) + 1) +
cast(sysstrings.data as time))
postream-> AND sysstrings.id='net/Console/Employee/Day End Time'
postream-> GROUP BY name;

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------------------------------------
HashAggregate (cost=1589703.87..1589703.93 rows=13 width=35) (actual
time=33.414..33.442 rows=12 loops=1)
-> Nested Loop (cost=0.01..1588978.22 rows=96753 width=35) (actual

time=0.284..22.115 rows=894 loops=1)
-> Nested Loop (cost=0.01..2394.31 rows=22530 width=4)
(actual time=0.207..4.671 rows=225 loops=1)
-> Index Scan using sysstrings_pkey on sysstrings
(cost=0.00..5.78 rows=1 width=175) (actual time=0.073..0.078 rows=1
loops=1)
Index Cond: (id = 'net/Console/Employee/Day End
Time'::text)
-> Index Scan using sales_tranzdate_index on sales s
(cost=0.01..1825.27 rows=22530 width=12) (actual time=0.072..3.464
rows=225 loops=1)
Index Cond: ((s.tranzdate >= ('2010-02-15'::date +

("outer".data)::time without time zone)) AND (s.tranzdate <
('2010-02-16'::date + ("outer".data)::time without time zone)))
Filter: ((NOT void) AND (NOT suspended))
-> Index Scan using salesitems_pkey on salesitems si
(cost=0.00..70.05 rows=30 width=39) (actual time=0.026..0.052 rows=4
loops=225)
Index Cond: (si.id = "outer".id)
Filter: ((group1_id <> ''::text) AND (group1_id IS NOT
NULL) AND (NOT void))
Total runtime: 33.734 ms
(12 rows)

postream=> SET enable_seqscan = true;
SET
postream=> EXPLAIN ANALYZE
postream-> SELECT si.group1_id as name, sum(si.qty) as count,
sum(si.amt) as amt
postream-> FROM salesitems si, sales s, sysstrings
postream-> WHERE si.id = s.id
postream-> AND si.group1_id != ''
postream-> AND si.group1_id IS NOT NULL
postream-> AND NOT si.void
postream-> AND NOT s.void
postream-> AND NOT s.suspended
postream-> AND s.tranzdate >= (cast('2010-02-15' as date) +
cast(sysstrings.data as time))
postream-> AND s.tranzdate < ((cast('2010-02-15' as date) + 1) +
cast(sysstrings.data as time))
postream-> AND sysstrings.id='net/Console/Employee/Day End Time'
postream-> GROUP BY name;

QUERY PLAN
------------------------------------------------------------------------
------------------------------------------------------------------------
----------------------------------------------------------
HashAggregate (cost=54020.49..54020.55 rows=13 width=35) (actual
time=5564.929..5564.957 rows=12 loops=1)
-> Hash Join (cost=2539.63..53294.84 rows=96753 width=35) (actual
time=5502.324..5556.262 rows=894 loops=1)
Hash Cond: ("outer".id = "inner".id)
-> Seq Scan on salesitems si (cost=0.00..30576.60
rows=885215 width=39) (actual time=0.089..3099.453 rows=901249 loops=1)
Filter: ((group1_id <> ''::text) AND (group1_id IS NOT
NULL) AND (NOT void))
-> Hash (cost=2394.31..2394.31 rows=22530 width=4) (actual
time=3.329..3.329 rows=0 loops=1)
-> Nested Loop (cost=0.01..2394.31 rows=22530 width=4)

(actual time=0.217..2.749 rows=225 loops=1)
-> Index Scan using sysstrings_pkey on
sysstrings (cost=0.00..5.78 rows=1 width=175) (actual time=0.077..0.085

rows=1 loops=1)
Index Cond: (id = 'net/Console/Employee/Day
End Time'::text)
-> Index Scan using sales_tranzdate_index on
sales s (cost=0.01..1825.27 rows=22530 width=12) (actual
time=0.074..1.945 rows=225 loops=1)
Index Cond: ((s.tranzdate >=
('2010-02-15'::date + ("outer".data)::time without time zone)) AND
(s.tranzdate < ('2010-02-16'::date + ("outer".data)::time without time
zone)))
Filter: ((NOT void) AND (NOT suspended))
Total runtime: 5565.262 ms
(13 rows)

--
Christian Brink

--
Sent via pgsql-performance mailing list
(pgsql-performance(at)postgresql(dot)org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Dimitri Fontaine 2010-03-18 10:24:25 Re: shared_buffers advice
Previous Message Ben Chobot 2010-03-17 22:53:43 Re: Testing FusionIO