Postgres Benchmark looking for maintainer

From: PFC <lists(at)peufeu(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Postgres Benchmark looking for maintainer
Date: 2008-04-28 21:38:18
Message-ID: op.uacbp4l9cigqcu@apollo13.peufeu.com
Lists: pgsql-performance

Hello,

Executive summary :

Last year I wrote a database benchmark which simulates a forum.
It works on Postgres and MySQL.
It could be useful.
I have no time to touch this, so it is rotting on my hard drive.

Who wants to adopt it? I will put it on pgfoundry.
I can spend a few hours commenting the source and writing some
documentation, then pass the package to someone who is interested and
has more time available.

Details :

The benchmark is a forum type load (actually it came from me arguing with
the phpBB team, lol) but, unlike all forums I know, "correctly" optimized.
A bunch of forums are created, and there is a website (in PHP), very
basic, which allows you to browse the forums, view topics, and insert
posts. It displays the usual forum info like last post, number of topics
or posts in forum, number of posts in topic, etc.

Then there is a benchmarking client, written in Python. It spawns a number
of "users" who perform real-life actions, like viewing pages and adding
posts, and there are a few simulated moderators who will, once in a while,
destroy topics and even forums.
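
To give an idea of the shape of such a client, here is a minimal sketch. It
is not the benchmark's actual code: the action names, the weights and the
use of threads are all assumptions made for illustration, and the actions
are only counted, not executed against a server.

```python
import random
import threading
from collections import Counter

# Hypothetical action mixes; the real benchmark's ratios are not given here.
USER_ACTIONS = [("view_forum", 50), ("view_topic", 40), ("add_post", 10)]
MODERATOR_ACTIONS = [("view_topic", 90), ("delete_topic", 9), ("delete_forum", 1)]


def pick_action(actions, rng):
    """Pick one action name at random, respecting the weights."""
    names = [name for name, _ in actions]
    weights = [weight for _, weight in actions]
    return rng.choices(names, weights=weights, k=1)[0]


def simulate(actions, steps, seed, counts, lock):
    """One simulated user or moderator: choose `steps` actions and tally them."""
    rng = random.Random(seed)
    local = Counter()
    for _ in range(steps):
        # A real client would issue the request or query here.
        local[pick_action(actions, rng)] += 1
    with lock:
        counts.update(local)


def run(n_users=8, n_moderators=2, steps=100):
    """Spawn users and moderators as threads and return the action tally."""
    counts = Counter()
    lock = threading.Lock()
    threads = []
    for i in range(n_users + n_moderators):
        actions = USER_ACTIONS if i < n_users else MODERATOR_ACTIONS
        t = threading.Thread(target=simulate, args=(actions, steps, i, counts, lock))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return counts
```

Calling run(n_users=8, n_moderators=2, steps=100) returns a Counter with
1000 actions in total, mostly page views, with the occasional deletion
coming from the moderators.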

This client can hit the PHP website via HTTP.

However postgres is so fast that you would need several PHP servers to
kill it. So, I added a multi-backend capability to the client : it can hit
the database directly, performing the queries the PHP script would have
performed.
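
One way to picture the multi-backend idea is a pair of interchangeable
classes, as in the sketch below. Everything in it is hypothetical: the
class names, the topic.php URL, the posts table and its columns are
invented for illustration, and an in-memory SQLite database stands in for
Postgres so that the example runs anywhere.

```python
import sqlite3
import urllib.parse
import urllib.request


class HttpBackend:
    """Goes through the PHP website. The URL layout is an assumption."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def topic_url(self, topic_id):
        query = urllib.parse.urlencode({"topic_id": topic_id})
        return f"{self.base_url}/topic.php?{query}"

    def view_topic(self, topic_id):
        with urllib.request.urlopen(self.topic_url(topic_id)) as resp:
            return resp.read()


class DirectBackend:
    """Skips PHP and runs the query itself over a DB-API connection."""

    def __init__(self, conn, placeholder="%s"):
        self.conn = conn
        # Hypothetical schema; the placeholder style depends on the driver.
        self.sql = (
            "SELECT post_id, body FROM posts "
            f"WHERE topic_id = {placeholder} ORDER BY post_id"
        )

    def view_topic(self, topic_id):
        cur = self.conn.cursor()
        cur.execute(self.sql, (topic_id,))
        return cur.fetchall()


def demo():
    """Exercise DirectBackend against a throwaway SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (post_id INTEGER PRIMARY KEY, topic_id INTEGER, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO posts (topic_id, body) VALUES (?, ?)",
        [(1, "first"), (1, "second"), (2, "other")],
    )
    return DirectBackend(conn, placeholder="?").view_topic(1)
```

Since both classes expose the same view_topic() method, the simulated
users do not need to know which backend they are talking to; only the
object handed to them changes.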

However, postgres is still so fast that, with this client, you won't be
able to benchmark anything more powerful than a Core 2; beyond that, the
client would need to be rewritten in a compiled language like Java. Also,
retrieving the posts' text easily saturated the 100 Mbps connection between
server and client, so you would need Gigabit ethernet.

So, the load is very realistic (it would mimic a real forum pretty well) ;
but in order to benchmark it you must simulate humongous traffic levels.

The only difference is that my benchmark does a lot more writing (post
insertions) than a normal forum ; I wanted the database to grow big in a
few hours.

It also works on MySQL so you can get a good laugh. Actually I was able to
extract some good performance out of MySQL, after lots of headaches,
except that I was never able to make it use more than 1 core.

Contrary to the usual benchmarks, the code is optimized for MySQL and for
Postgres, and the stored procedures also. Thus, what is compared is not a
least-common-denominator implementation that happens to work on both
databases, but two implementations specifically targeted and optimized at
each database.

The benchmark is also pretty simple (unlike the TPC) but it is useful:
first it is CPU-bound, then IO-bound; clustering the tables does a lot
for performance (you can test auto-cluster); checkpoints are very visible;
etc. So it can provide useful information that is easier to understand
than that of a very complex benchmark.

Originally the purpose of the benchmark was to test postgres' full text
search; the code is still there.

Regards,
Pierre
