Re: 8.3RC2 vs 8.2.6 testing results

From: "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com>
To: Vlad <marchenko(at)gmail(dot)com>
Cc: PG-General <pgsql-general(at)postgresql(dot)org>
Subject: Re: 8.3RC2 vs 8.2.6 testing results
Date: 2008-01-28 22:26:42
Message-ID: 162867790801281426l24d59ea0ob5a9c404471d562f@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello

8.3 plan is not optimal.

> -> Hash Join (cost=2379.09..108954.69 rows=550548 width=52)
> (actual time=76.188..8177.510 rows=2593557 loops=1)

please, try to increase statistics

default_statistics_target (in postgresql.conf) to 100 and repeat
import and your test.

Regards
Pavel Stehule

On 28/01/2008, Vlad <marchenko(at)gmail(dot)com> wrote:
> Hello,
>
> I wanted to share performance-related test results for Postgresql
> 8.3RC2 and 8.2.6. In both cases we used a freshly imported database
> followed by analyze verbose command. Same server was used for testing
> (2.6.23.14-107.fc8 x86_64) for each versions; both postgreses were
> compiled with "-O3 -mtune=core2 -march=core2 -m64" flags. In both
> cases default configuration was used with increased shared buffers to
> 1Gb (total server ram is 32Gb), increased work and maintenance mem,
> enabled autovacuum, increased default_statistics_target to 100,
> increased effective_cache_size to 20Gb, disabled fsync and increased
> checkpoint_segments. Total size (on disk) of the tables involved in
> the query was around 300Mb.
>
> 1. Freshly imported DB size on disk was about 3% smaller for 8.3
> 2. We ran several tests and found 8.3 generally 10% slower than 8.2.6.
> We took special measures to make sure that no third factors involved
> (no other apps running, all data was cached from disks, etc). Below
> is one of the queries that we used for testing (I anonymized table
> names) along with query plan for both 8.3 and 8.2. The query execution
> plans are the same for both versions, but what we found quite
> interesting is that if we add all the times from each line of 8.2's
> query plan, it roughly adds-up to the total execution time. For 8.3's
> plan each line shows a shorter time, yet resulting in longer total
> runtime. Also, summing 8.3's plan lines doesn't come close to the
> total execution time:
>
> SELECT _."a_id", SUM(_."counter")
> FROM ts.t_c AS _
> LEFT OUTER JOIN ts.t_a AS __1 ON _."a_id"=__1."id"
> LEFT OUTER JOIN ts.t_i AS __3 ON _."i_id"=__3."id"
> LEFT OUTER JOIN ts.t_b AS __2 ON __3."b_id"=__2."id"
> WHERE (_."date_day" >= '2008-01-01 00:00:00' AND _."date_day" <=
> '2008-01-27 23:59:59')
> AND __1."status" IS TRUE
> AND __2."status" IS TRUE
> GROUP BY _."a_id"
>
>
> 8.2.6 QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> HashAggregate (cost=134877.04..134899.04 rows=200 width=52) (actual
> time=21517.837..21517.890 rows=47 loops=1)
> -> Hash Join (cost=2450.09..111489.75 rows=550289 width=52)
> (actual time=76.083..7691.579 rows=2593557 loops=1)
> Hash Cond: (_.i_id = __3.id)
> -> Hash Join (cost=19.20..95377.74 rows=934651 width=56)
> (actual time=0.119..4933.928 rows=2596942 loops=1)
> Hash Cond: (_.a_id = __1.id)
> -> Append (cost=0.00..76276.09 rows=2596252 width=56)
> (actual time=0.014..2988.950 rows=2596942 loops=1)
> -> Seq Scan on t_c _ (cost=0.00..21.10 rows=4
> width=56) (actual time=0.001..0.001 rows=0 loops=1)
> Filter: ((date_day >= '2008-01-01'::date)
> AND (date_day <= '2008-01-27'::date))
> -> Seq Scan on t_c_2008_01 _
> (cost=0.00..76254.99 rows=2596248 width=56) (actual
> time=0.011..1979.606 rows=2596942 loops=1)
> Filter: ((date_day >= '2008-01-01'::date)
> AND (date_day <= '2008-01-27'::date))
> -> Hash (cost=18.30..18.30 rows=72 width=4) (actual
> time=0.094..0.094 rows=72 loops=1)
> -> Seq Scan on t_a __1 (cost=0.00..18.30
> rows=72 width=4) (actual time=0.004..0.067 rows=72 loops=1)
> Filter: (status IS TRUE)
> -> Hash (cost=1950.44..1950.44 rows=38436 width=4) (actual
> time=75.931..75.931 rows=59934 loops=1)
> -> Hash Join (cost=57.45..1950.44 rows=38436 width=4)
> (actual time=0.829..54.760 rows=59934 loops=1)
> Hash Cond: (__3.b_id = __2.id)
> -> Seq Scan on t_i __3 (cost=0.00..1263.82
> rows=65282 width=8) (actual time=0.005..16.852 rows=65282 loops=1)
> -> Hash (cost=50.90..50.90 rows=524 width=4)
> (actual time=0.499..0.499 rows=524 loops=1)
> -> Seq Scan on t_b __2 (cost=0.00..50.90
> rows=524 width=4) (actual time=0.004..0.312 rows=524 loops=1)
> Filter: (status IS TRUE)
> Total runtime: 21518.097 ms
>
>
>
> 8.3RC2: QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> HashAggregate (cost=132352.98..132572.98 rows=200 width=52) (actual
> time=24354.972..24355.019 rows=47 loops=1)
> -> Hash Join (cost=2379.09..108954.69 rows=550548 width=52)
> (actual time=76.188..8177.510 rows=2593557 loops=1)
> Hash Cond: (_.i_id = __3.id)
> -> Hash Join (cost=16.20..92904.25 rows=935090 width=56)
> (actual time=0.140..5304.968 rows=2596942 loops=1)
> Hash Cond: (_.a_id = __1.id)
> -> Append (cost=0.00..73796.62 rows=2597473 width=56)
> (actual time=0.043..3272.024 rows=2596942 loops=1)
> -> Seq Scan on t_c _ (cost=0.00..21.55 rows=4
> width=56) (actual time=0.001..0.001 rows=0 loops=1)
> Filter: ((date_day >= '2008-01-01'::date)
> AND (date_day <= '2008-01-27'::date))
> -> Seq Scan on t_c_2008_01 _
> (cost=0.00..73775.07 rows=2597469 width=56) (actual
> time=0.040..2245.209 rows=2596942 loops=1)
> Filter: ((date_day >= '2008-01-01'::date)
> AND (date_day <= '2008-01-27'::date))
> -> Hash (cost=15.30..15.30 rows=72 width=4) (actual
> time=0.091..0.091 rows=72 loops=1)
> -> Seq Scan on t_a __1 (cost=0.00..15.30
> rows=72 width=4) (actual time=0.003..0.061 rows=72 loops=1)
> Filter: (status IS TRUE)
> -> Hash (cost=1882.44..1882.44 rows=38436 width=4) (actual
> time=76.027..76.027 rows=59934 loops=1)
> -> Hash Join (cost=55.45..1882.44 rows=38436 width=4)
> (actual time=0.835..54.576 rows=59934 loops=1)
> Hash Cond: (__3.b_id = __2.id)
> -> Seq Scan on t_i __3 (cost=0.00..1197.82
> rows=65282 width=8) (actual time=0.004..16.096 rows=65282 loops=1)
> -> Hash (cost=48.90..48.90 rows=524 width=4)
> (actual time=0.513..0.513 rows=524 loops=1)
> -> Seq Scan on t_b __2 (cost=0.00..48.90
> rows=524 width=4) (actual time=0.003..0.307 rows=524 loops=1)
> Filter: (status IS TRUE)
> Total runtime: 24355.179 ms
>
>
> Any ideas on what-we-were-doing-wrong are welcomed
>
>
> --
> Vlad
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Pavel Stehule 2008-01-28 22:27:48 Re: 8.3RC2 vs 8.2.6 testing results
Previous Message John Gateley 2008-01-28 22:26:04 Re: Table has duplicate keys, what did I do