Re: Worse perfomance on 8.2.0 than on 7.4.14

From: Rolf Østvik <rolfostvik(at)yahoo(dot)no>
To: pgsql-performance(at)postgresql(dot)org
Cc: Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org>
Subject: Re: Worse perfomance on 8.2.0 than on 7.4.14
Date: 2007-01-01 15:04:10
Message-ID: 20070101150410.51809.qmail@web26309.mail.ukl.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

(I am sorry if my explain analyze outputs gets
garbled)

--- Dennis Bjorklund <db(at)zigo(dot)dhs(dot)org> skrev:

> Rolf Østvik skrev:
>
> > I have a simple query which uses 32ms on 7.4.14
> and 1015ms on 8.2.0.
> > I guess 7.4.14 creates a better execution plan
> than 8.2.0 for this query but
>
>
> Try to turn off planner options in 8.2 to make it
> generate the same plan
> as 7.4. Then run EXPLAIN ANALYZE on that query that
> generate the same
> plan as in 7.4 and we can compare the costs and
> maybe understand what go
> wrong.
>
> /Dennis
>

I have adjusted some settings in the postgresql.conf
file:
7.4.14
shared_buffers=64000 #512MB
sort_mem=32000 #32KB
effective_cache_size=128000 #1GB
8.2.0
shared_buffers=512MB
temp_buffers=32MB
work_mem=8MB
effective_cache_size=1GB
random_page_cost=2.0

And also disabled some planner options in 8.2
enable_bitscanmap = off
enable_hashjoin = off

NB: enable_seqscan = on (default value)

First i have some queries to give you a feel of size
of
datasets and plans and times.

Q-A: (Simple query A)
select sr.id from step_result_subset as sr
where sr.step_parent = 0;
Q-B: (Simple query B)
select ur.id from uut_result_subset as ur
where ur.start_date_time > '2006-12-11';
Q-C: (Simple query C)
select ur.id from uut_result_subset as ur
where ur.start_date_time > '2006-12-11';

7.4.14
Explain analyze of Q-A on 7.1.14:
Index Scan using step_result_subset_parent_key on
step_result_subset sr (cost=0.00..42793.67
rows=166069 width=4) (actual time=0.091..1201.073
rows=176449 loops=1)
Index Cond: (step_parent = 0)
Total runtime: 1263.592 ms
(3 rows)

Explain analyze of Q-B on 7.1.14:
Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur (cost=0.00..63.28 rows=18
width=4) (actual time=0.081..0.190 rows=68 loops=1)
Index Cond: (start_date_time > '2006-12-11
00:00:00'::timestamp without time zone)
Total runtime: 0.242 ms
(3 rows)

Explain analyze of Q-C on 7.1.14:
Seq Scan on uut_result_subset ur (cost=0.00..3161.94
rows=28640 width=4) (actual time=0.059..108.159
rows=29144 loops=1)
Filter: (start_date_time > '2006-09-11
00:00:00'::timestamp without time zone)
Total runtime: 117.560 ms
(3 rows)

8.2.0
Explain analyze of Q-A on 8.2.0:
Index Scan using step_result_subset_parent_key on
step_result_subset sr (cost=0.00..26759.41
rows=143163 width=4) (actual time=0.099..924.039
rows=176449 loops=1)
Index Cond: (step_parent = 0)
Total runtime: 998.757 ms
(3 rows)

Explain analyze of Q-A on 8.2.0:
Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur (cost=0.00..196.15 rows=118
width=4) (actual time=0.025..0.081 rows=68 loops=1)
Index Cond: (start_date_time > '2006-12-11
00:00:00'::timestamp without time zone)
Total runtime: 0.159 ms
(3 rows)

Explain analyze of Q-C on 8.2.0:
Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur (cost=0.00..2382.39 rows=31340
width=4) (actual time=0.035..35.367 rows=29144
loops=1)
Index Cond: (start_date_time > '2006-09-11
00:00:00'::timestamp without time zone)
Total runtime: 47.168 ms
(3 rows)

Here is the complex query/view.
create view view_subset as
select
ur.id as ur_id,
sr.id as sr_id
from
uut_result_subset as ur
inner join step_result_subset as sr
on ur.id=sr.uut_result
where
ur.start_date_time > '2006-12-11'
and sr.step_parent=0
;

Query with start_date_time > '2006-12-11' on 7.4.14
Query PLAN
---------------
Nested Loop (cost=0.00..1400.86 rows=17 width=8)
(actual time=0.066..12.754 rows=68 loops=1)
-> Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur (cost=0.00..63.28 rows=18
width=4) (actual time=0.019..0.136 rows=68 loops=1)
Index Cond: (start_date_time > '2006-12-11
00:00:00'::timestamp without time zone)
-> Index Scan using step_result_uut_result_idx on
step_result_subset sr (cost=0.00..74.28 rows=2
width=8) (actual time=0.085..0.182 rows=1 loops=68)
Index Cond: ("outer".id = sr.uut_result)
Filter: (step_parent = 0)
Total runtime: 12.849 ms

Query with start_date_time > '2006-12-11' on 8.2.0
Query PLAN
---------------
Nested Loop (cost=0.00..35860.83 rows=96 width=8)
(actual time=11.891..2339.878 rows=68 loops=1)
-> Index Scan using step_result_subset_parent_key
on step_result_subset sr (cost=0.00..26759.41
rows=143163 width=8) (actual time=0.083..1017.500
rows=176449 loops=1)
Index Cond: (step_parent = 0)
-> Index Scan using uut_result_subset_pkey on
uut_result_subset ur (cost=0.00..0.05 rows=1 width=4)
(actual time=0.006..0.006 rows=0 loops=176449)
Index Cond: (ur.id = sr.uut_result)
Filter: (start_date_time > '2006-12-11
00:00:00'::timestamp without time zone)
Total runtime: 2339.974 ms

I also wanted to try it with a bigger dataset so i
set the restriction of start_date_time to
"start_date_time> '2006-09-11'"
I also then set "enable_hashjoin = on" to get same
plans on 7.4.14 and 8.2.0.

Query with start_date_time > '2006-09-11' on 7.4.14
Query PLAN
---------------
Hash Join (cost=3233.54..47126.96 rows=26940
width=8) (actual time=126.437..1489.584 rows=29139
loops=1)
Hash Cond: ("outer".uut_result = "inner".id)
-> Index Scan using step_result_subset_parent_key
on step_result_subset sr (cost=0.00..42793.67
rows=166069 width=8) (actual time=0.078..1137.123
rows=176449 loops=1)
Index Cond: (step_parent = 0)
-> Hash (cost=3161.94..3161.94 rows=28640
width=4) (actual time=126.068..126.068 rows=0 loops=1)
-> Seq Scan on uut_result_subset ur
(cost=0.00..3161.94 rows=28640 width=4) (actual
time=0.063..107.041 rows=29144 loops=1)
Filter: (start_date_time > '2006-09-11
00:00:00'::timestamp without time zone)
Total runtime: 1504.600 ms
(8 rows)

Query with start_date_time > '2006-09-11' on 8.2.0
Query PLAN
---------------
Hash Join (cost=2460.74..32695.45 rows=25413
width=8) (actual time=61.453..1198.048 rows=29139
loops=1)
Hash Cond: (sr.uut_result = ur.id)
-> Index Scan using step_result_subset_parent_key
on step_result_subset sr (cost=0.00..26759.41
rows=143163 width=8) (actual time=0.089..937.124
rows=176449 loops=1)
Index Cond: (step_parent = 0)
-> Hash (cost=2382.39..2382.39 rows=31340
width=4) (actual time=55.975..55.975 rows=29144
loops=1)
-> Index Scan using
uut_result_subset_start_date_time_idx on
uut_result_subset ur (cost=0.00..2382.39 rows=31340
width=4) (actual time=0.051..35.635 rows=29144
loops=1)
Index Cond: (start_date_time >
'2006-09-11 00:00:00'::timestamp without time zone)
Total runtime: 1212.910 ms
(8 rows)

Other comments.
I am _beginning_ to get a feeling of adjusting
parameters and how my dataset behaves. 8.2.0 does (as
expected) work much better on bigger datasets than
7.4.14.
I was still hoping that i could get better response
times since i think the Q-C query
(ur.start_date_time > '2006-09-11') should be the
biggest restrictor of the datasets i want to look at.
>From what i can understand that is what happens with
the query plan for 7.4.14 when restriction is
"ur.start_date_time > '2006-12-11'".

Best regards
Rolf Østvik

__________________________________________________
Bruker du Yahoo!?
Lei av spam? Yahoo! Mail har den beste spambeskyttelsen
http://no.mail.yahoo.com

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rolf Østvik 2007-01-01 15:14:14 Re: Worse perfomance on 8.2.0 than on 7.4.14
Previous Message Tom Lane 2006-12-31 17:11:28 Re: Worse perfomance on 8.2.0 than on 7.4.14