Re: Performance

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance
Date: 2011-04-27 21:55:36
Message-ID: 4DB890D8.5000604@2ndquadrant.com
Lists: pgsql-performance

Tomas Vondra wrote:
> Hmmm, just wondering - what would be needed to build such 'workload
> library'? Building it from scratch is not feasible IMHO, but I guess
> people could provide their own scripts (as simple as 'set up a bunch
> of tables, fill it with data, run some queries') and there's a pile of
> such examples in the pgsql-performance list.
>

The easiest place to start is by re-using the work already done by the
TPC for benchmarking commercial databases. There are ports of the TPC
workloads to PostgreSQL available in the DBT-2, DBT-3, and DBT-5 tests;
see http://wiki.postgresql.org/wiki/Category:Benchmarking for initial
information on those (the page on TPC-H is quite relevant too). I'd
like to see all three of those DBT tests running regularly, as well as
two tests that can be simulated with pgbench or sysbench: an in-cache
read-only test and a write-as-fast-as-possible test.
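
As a rough illustration of those two pgbench tests, here is a minimal
sketch that only builds the command lines and does not run anything.
The helper names, the database name "bench", and the client counts are
hypothetical. It also assumes that the built-in select-only script (-S)
is a fair stand-in for the read-only test and that the default
TPC-B-like script is a fair stand-in for the write-heavy one; the
original message does not specify either choice.

```python
# Sketch only: builds pgbench argument lists, does not execute them.
# Helper names and default values are illustrative assumptions.

def pgbench_init(dbname, scale):
    """Command to create and populate the pgbench tables at a given scale."""
    return ["pgbench", "-i", "-s", str(scale), dbname]

def pgbench_read_only(dbname, clients, seconds):
    """Select-only run (-S); meant for a data set small enough to stay cached."""
    return ["pgbench", "-S",
            "-c", str(clients),   # number of concurrent client sessions
            "-j", str(clients),   # worker threads, one per client here
            "-T", str(seconds),   # run for a fixed duration in seconds
            dbname]

def pgbench_write_heavy(dbname, clients, seconds):
    """Default TPC-B-like run, which updates and inserts on every transaction."""
    return ["pgbench",
            "-c", str(clients),
            "-j", str(clients),
            "-T", str(seconds),
            dbname]

if __name__ == "__main__":
    for cmd in (pgbench_init("bench", 10),
                pgbench_read_only("bench", 4, 60),
                pgbench_write_heavy("bench", 4, 60)):
        print(" ".join(cmd))
```

The lists can be handed to subprocess.run() as-is once a test database
exists; the -j and -T options assume a reasonably recent pgbench.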

The main problem with re-using posts from this list for workload testing
is getting an appropriately sized data set for them that stays
relevant. The nature of this sort of benchmark always includes some
notion of the size of the database, and you get different results based
on how large things are relative to RAM and the database parameters.
That said, some sort of systematic collection of "hard queries" would
also be a very useful project for someone to take on.
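
To make the sizing point concrete, here is a small sketch that
classifies a pgbench data set by where it is likely to live. The
function name and the 15 MB-per-scale-unit figure are assumptions for
illustration only (pgbench documents 100,000 pgbench_accounts rows per
scale unit, but the on-disk size varies), so measure the real database
size before relying on this.

```python
# Sketch only: rough sizing of a pgbench database relative to memory.
# ASSUMED_MB_PER_SCALE is an approximation, not a documented constant.

ROWS_PER_SCALE = 100_000      # pgbench_accounts rows per scale unit
ASSUMED_MB_PER_SCALE = 15     # assumed on-disk size per scale unit

def classify_scale(scale, shared_buffers_mb, ram_mb,
                   mb_per_scale=ASSUMED_MB_PER_SCALE):
    """Return a label for where a data set of this scale is expected to fit."""
    size_mb = scale * mb_per_scale
    if size_mb <= shared_buffers_mb:
        return "fits in shared_buffers"
    if size_mb <= ram_mb:
        return "fits in RAM"
    return "larger than RAM"

if __name__ == "__main__":
    # Hypothetical machine: 512 MB shared_buffers, 8 GB RAM.
    for scale in (10, 100, 1000):
        rows = scale * ROWS_PER_SCALE
        print(scale, rows, classify_scale(scale, 512, 8192))
```

The same query set can land in any of the three regimes depending on
scale, which is why a fixed-size example from the list tends to go
stale as hardware grows.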

People show up regularly who want to play with the optimizer in some
way. It's still possible to do that by targeting specific queries you
want to accelerate, where it's obvious (or, more likely, hard but still
straightforward) how to do better.  But I don't think any of these
proposed exercises in adjusting the caching model or default optimizer
parameters in the database is going anywhere without some sort of
benchmarking framework for evaluating the results.  And the TPC tests
are a reasonable place to start. They're a good mixed set of queries,
and improving results on those does turn into a real commercial benefit
to PostgreSQL in the future too.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
