Re: pgbench--new transaction type

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pgbench--new transaction type
Date: 2011-06-19 22:30:39
Message-ID: 4DFE788F.5020704@2ndQuadrant.com
Lists: pgsql-hackers

I applied Jeff's patch but changed this to address concerns about the
program getting stuck running for too long in the function:

#define plpgsql_loops 512

This would be better named "plpgsql_batch_size" or something similar
instead; the current name suggests it's how many loops to run, which is
confusing.

My main performance concern here was whether this change would really
matter so much once a larger number of clients were involved. Some of the
other things you can do to optimize single-client performance aren't as
useful with lots of them. Here's how the improvements in this mode
worked for me on a server with 4 Hyper-Threaded cores (i870);
shared_buffers=256MB, scale=100:

1 client:
-S: 11533
-S -M prepared: 19498
-P: 49547

12 clients, 4 workers:
-S: 56052
-S -M prepared: 82043
-P: 159443

96 clients, 8 workers:
-S: 49940
-S -M prepared: 76099
-P: 137942

I think this is a really nice new workload to demonstrate. One of the
things we tell people is that code works much faster when moved
server-side, but how much faster isn't always easy to show. Having this
mode available lets them see how dramatic that can be quite easily. I
know I'd like to be able to run performance tests for clients of new
hardware using PostgreSQL and tell them something like this: "With
simple clients executing a statement at a time, this server reaches 56K
SELECTs/second. But using server-side functions to execute them in
larger batches it can do 159K".

The value of having an alternate source of benchmark load generation,
with a very different profile for how it exercises the server, is good
too.

Things to fix in the patch before it would be a commit candidate:

-Adjust the loop size/name, per above
-Reformat some of the longer lines to try and respect the implied right
margin in the code formatting
-Don't include the "plgsql function created." line unless in debugging mode.
-Add the docs. Focus on how this measures how fast the database can
execute SELECT statements using server-side code. An explanation that
the "transaction" block size is 512 is important to share. It also
needs a warning that time-based runs ("-T") may have to wait for a block
to finish and go beyond its normally expected end time.
-The word "via" in the "transaction type" output description is probably
not the best choice. Changing to "SELECT only using PL/pgSQL" would
translate better, and follow the standard case use for the name of that
language.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
