Re: PostgreSQL TPC-H test result?

From: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)Sun(dot)COM>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: Amber <guxiaobo1982(at)hotmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL TPC-H test result?
Date: 2008-09-12 03:30:39
Message-ID: 48C9E25F.4070203@sun.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

Moving this thread to Performance alias as it might make more sense for
folks searching on this topic:

Greg Smith wrote:
> On Tue, 9 Sep 2008, Amber wrote:
>
>> I read something from
>> http://monetdb.cwi.nl/projects/monetdb/SQL/Benchmark/TPCH/index.html
>> saying that PostgreSQL can't give the correct result of the some
>> TPC-H queries
>
> Jignesh Shah at Sun ran into that same problem. It's mentioned
> briefly in his presentation at
> http://blogs.sun.com/jkshah/entry/postgresql_east_2008_talk_postgresql
> on pages 26 and 27. 5 of the 22 reference TCP-H queries (4, 5, 6, 10,
> 14) returned zero rows immediately for his tests. Looks like the
> MonetDB crew is saying it does that on queries 4,5,6,10,12,14,15 and
> that 20 takes too long to run to generate a result. Maybe 12/15/20
> were fixed by changes in 8.3, or perhaps there were subtle errors
> there that Jignesh didn't catch--it's not like he did a formal
> submission run, was just kicking the tires. I suspect the difference
> on 20 was that his hardware and tuning was much better, so it probably
> did execute fast enough.

I redid a quick test with the same workload on one of my systems with SF
10 which is about 10GB
(I hope it comes out properly displayed)

Jignesh From Monet (8.3T/8.2.9)

Q Time PG8.3.3 Time PG8.2.9 Ratio

1 429.01 510 0.84

2 3.65 54 0.07

3 33.49 798 0.04

4 6.53 Empty 35 (E) 0.19

5 8.45 Empty 5.5(E) 1.54

6 32.84 Empty 172 (E) 0.19

7 477.95 439 1.09

8 58.55 251 0.23

9 781.96 2240 0.35

10 9.03 Empty 6.1(E) 1.48

11 3.57 Empty 25 0.14

12 56.11 Empty 179 (E) 0.31

13 61.01 140 0.44

14 30.69 Empty 169 (E) 0.18

15 32.81 Empty 168 (E) 0.2

16 23.98 115 0.21

17 Did not finish Did not finish

18 58.93 882 0.07

19 71.55 218 0.33

20 Did not finish Did not finish

21 550.51 477 1.15

22 6.21 Did not finish

All time is in seconds (sub seconds where availabe)
Ratio > 1 means 8.3.3 is slower and <1 means 8.3.3 is faster

My take on the results:

* I had to tweak the statement of Q1 in order to execute it.
(TPC-H kit does not directly support POSTGRESQL statements)

* Timings with 8.3.3 and bit of tuning gives much better time overall
This was expected (Some queries finish in 7% of the time than what
MonetDB reported. From the queries that worked only Q7 & Q21 seem to
have regressed)

* However Empty rows results is occuring consistently
(Infact Q11 also returned empty for me while it worked in their test)
Queries: 4,5,6,10,11,12,14,15
(ACTION ITEM: I will start separate threads for each of those queries in
HACKERS alias to figure out the problem since it looks like Functional
problem to me and should be interesting to hackers alias)

* Two queries 17,20 looks like will not finish (I let Q17 to run for 18
hrs and
yet it had not completed. As for Q20 I killed it as it was approaching
an hour.)
(ACTION ITEM: Not sure whether debugging for these queries will go in
hackers or
perform alias but I will start a thread on them too.)

* Looks like bit of tuning is required for Q1, Q7, Q9, Q21 to improve their
overall time. Specially understanding if PostgreSQL is missing a more
efficient
plan for them.
(ACTION ITEM: I will start separate threads on performance alias to
dig into
those queries)

I hope to start separate threads for each queries so we can track them
easier. I hope to provide explain analyze outputs for each one of them
and lets see if there are any problems.

Feedback welcome on what you want to see for each threads.

Regards,
Jignesh

--
Jignesh Shah http://blogs.sun.com/jkshah
Sun Microsystems,Inc http://sun.com/postgresql

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Scott Marlowe 2008-09-12 03:32:35 Re: declare column update expression
Previous Message Chris Velevitch 2008-09-12 03:10:41 Re: declare column update expression

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2008-09-12 03:48:06 Re: [GENERAL] PostgreSQL TPC-H test result?
Previous Message Scott Marlowe 2008-09-12 02:30:38 Re: Effects of setting linux block device readahead size