Re: Performance weirdness with/without vacuum analyze

From: Shridhar Daithankar <shridhar_daithankar(at)myrealbox(dot)com>
To: Harry Broomhall <harry(dot)broomhall(at)uk(dot)easynet(dot)net>, Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance weirdness with/without vacuum analyze
Date: 2003-10-21 12:00:08
Message-ID: 3F951FC8.2090906@myrealbox.com
Lists: pgsql-performance

Harry Broomhall wrote:
> #effective_cache_size = 1000 # typically 8KB each
> #random_page_cost = 4 # units are one sequential page fetch cost

You must tune the first one at least. See
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html for guidance on
tuning these parameters.
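For example, on a machine with 512MB of RAM where Postgres can expect roughly
350MB of OS cache, the settings might look like the fragment below. The values
are illustrative, not a recommendation; in 7.x effective_cache_size is counted
in 8KB pages:

```
# postgresql.conf -- example values only, size them for your own machine
effective_cache_size = 44800   # ~350MB of OS cache / 8KB pages
random_page_cost = 2           # lower than the default 4 on cached/fast disks
```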

>>2) The EXPLAIN ANALYZE of each query instead of just the EXPLAIN
>
>
> First the case with no vacuum analyze:
>
>                                                                        QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>  Merge Join  (cost=99.32..171.32 rows=1000 width=259) (actual time=18579.92..48277.69 rows=335671 loops=1)
>    Merge Cond: ("outer".cdr_id = "inner".cdr_id)
>    ->  Index Scan using import_cdrs_cdr_id_key on import_cdrs  (cost=0.00..52.00 rows=1000 width=164) (actual time=0.42..11479.51 rows=335671 loops=1)
>    ->  Sort  (cost=99.32..101.82 rows=1000 width=95) (actual time=18578.71..21155.65 rows=335671 loops=1)
>          Sort Key: un.cdr_id
>          ->  Hash Join  (cost=6.99..49.49 rows=1000 width=95) (actual time=4.70..10011.35 rows=335671 loops=1)
>                Hash Cond: ("outer".interim_cli = "inner".interim_num)
>                Join Filter: (("outer".starttime >= "inner".starttime) AND ("outer".starttime <= "inner".endtime))
>                ->  Seq Scan on import_cdrs un  (cost=0.00..20.00 rows=1000 width=49) (actual time=0.02..4265.63 rows=335671 loops=1)
>                ->  Hash  (cost=6.39..6.39 rows=239 width=46) (actual time=4.57..4.57 rows=0 loops=1)
>                      ->  Seq Scan on num_xlate  (cost=0.00..6.39 rows=239 width=46) (actual time=0.12..2.77 rows=239 loops=1)
>  Total runtime: 80408.42 msec
> (12 rows)

You were lucky to get the better plan here, because the planner's row
estimates are way off: it expects 1000 rows where the query actually produces
335671.
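A quick way to check this yourself is to refresh the statistics and compare
the planner's stored estimate with the real row count (a sketch using the
table name from your plans):

```
-- Refresh statistics so the planner stops guessing the default 1000 rows
ANALYZE import_cdrs;

-- Planner's stored estimate vs. the actual count
SELECT reltuples FROM pg_class WHERE relname = 'import_cdrs';
SELECT count(*) FROM import_cdrs;
```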
>
> And now the case *with* the vacuum analyze:
>
>                                                                QUERY PLAN
> -----------------------------------------------------------------------------------------------------------------------------------------
>  Hash Join  (cost=15335.91..49619.57 rows=335671 width=202) (actual time=12383.44..49297.58 rows=335671 loops=1)
>    Hash Cond: ("outer".cdr_id = "inner".cdr_id)
>    ->  Seq Scan on import_cdrs  (cost=0.00..8496.71 rows=335671 width=126) (actual time=0.15..9504.24 rows=335671 loops=1)
>    ->  Hash  (cost=10398.73..10398.73 rows=335671 width=76) (actual time=12371.13..12371.13 rows=0 loops=1)
>          ->  Hash Join  (cost=6.99..10398.73 rows=335671 width=76) (actual time=4.91..9412.55 rows=335671 loops=1)
>                Hash Cond: ("outer".interim_cli = "inner".interim_num)
>                Join Filter: (("outer".starttime >= "inner".starttime) AND ("outer".starttime <= "inner".endtime))
>                ->  Seq Scan on import_cdrs un  (cost=0.00..8496.71 rows=335671 width=30) (actual time=0.09..3813.54 rows=335671 loops=1)
>                ->  Hash  (cost=6.39..6.39 rows=239 width=46) (actual time=4.71..4.71 rows=0 loops=1)
>                      ->  Seq Scan on num_xlate  (cost=0.00..6.39 rows=239 width=46) (actual time=0.22..2.90 rows=239 loops=1)
>  Total runtime: 432543.73 msec
> (11 rows)
>
>

What happens if you turn off hash joins? Also bump sort memory to something
reasonable, say 16MB, and see what difference it makes to performance.
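Both can be tried per-session without touching postgresql.conf. A sketch,
using the 7.x parameter names (sort_mem is measured in KB; substitute your
actual query for the placeholder):

```
SET enable_hashjoin = off;   -- steer the planner away from hash joins
SET sort_mem = 16384;        -- 16MB of sort/hash memory for this session

EXPLAIN ANALYZE
SELECT ...;                  -- re-run the problem query here
```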

Shridhar
