From: | Greg Smith <greg(at)2ndQuadrant(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: pgbench--new transaction type |
Date: | 2011-06-19 22:30:39 |
Message-ID: | 4DFE788F.5020704@2ndQuadrant.com |
Lists: | pgsql-hackers |
I applied Jeff's patch but changed this to address concerns about the
program getting stuck running for too long in the function:
#define plpgsql_loops 512
This would be better named "plpgsql_batch_size" or something similar;
the current name suggests it's how many loops to run, which is
confusing.
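A minimal sketch of what the rename might look like, and of how the batch size ties one reported "transaction" to a number of SELECTs. The helper selects_for_batches() is hypothetical and not part of Jeff's patch; it also assumes every batch runs exactly plpgsql_batch_size SELECTs, which this message does not confirm.

```c
#include <assert.h>

/* Suggested new name for plpgsql_loops; the value 512 is from this message. */
#define plpgsql_batch_size 512

/*
 * Hypothetical helper: number of SELECTs represented by a count of
 * completed batch "transactions", assuming each batch runs exactly
 * plpgsql_batch_size SELECTs.
 */
long
selects_for_batches(long batches)
{
    assert(batches >= 0);
    return batches * plpgsql_batch_size;
}
```

Under that assumption, three completed batches would stand for 1536 SELECTs.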
My main performance concern here was whether this change really mattered
so much once a larger number of clients were involved. Some of the
other things you can do to optimize single-client performance aren't as
useful with lots of them. Here's how the improvements in this mode
worked for me, in SELECTs/second, on a server with 4 Hyper-Threaded cores
(i870); shared_buffers=256MB, scale=100:
1 client:
-S: 11533
-S -M prepared: 19498
-P: 49547
12 clients, 4 workers:
-S: 56052
-S -M prepared: 82043
-P: 159443
96 clients, 8 workers:
-S: 49940
-S -M prepared: 76099
-P: 137942
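To make the relative gains easier to read, here is a small sketch that computes the speedup of each mode over plain -S from the figures above. The struct and function names are made up for illustration; only the numbers come from this message.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the results quoted above, in SELECTs/second. */
typedef struct
{
    const char *label;
    double      select_only;    /* -S */
    double      prepared;       /* -S -M prepared */
    double      plpgsql;        /* -P */
} BenchRow;

static const BenchRow rows[] = {
    {"1 client", 11533, 19498, 49547},
    {"12 clients, 4 workers", 56052, 82043, 159443},
    {"96 clients, 8 workers", 49940, 76099, 137942},
};

/* Ratio of a candidate rate to a baseline rate. */
double
speedup(double baseline, double candidate)
{
    assert(baseline > 0);
    return candidate / baseline;
}

/* Print the speedup of each mode relative to plain -S. */
void
print_speedups(void)
{
    size_t      i;

    for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
        printf("%-24s prepared: %.2fx  -P: %.2fx\n",
               rows[i].label,
               speedup(rows[i].select_only, rows[i].prepared),
               speedup(rows[i].select_only, rows[i].plpgsql));
}
```

Calling print_speedups() shows -P at roughly 4.3x plain -S with one client and roughly 2.8x at both 12 and 96 clients, so the gain shrinks with more clients but stays large.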
I think this is a really nice new workload to demonstrate. One of the
things we tell people is that code works much faster when moved
server-side, but how much faster isn't always easy to show. Having this
mode available lets them see how dramatic that can be quite easily. I
know I'd like to be able to run performance tests for clients of new
hardware using PostgreSQL and tell them something like this: "With
simple clients executing a statement at a time, this server reaches 56K
SELECTs/second. But using server-side functions to execute them in
larger batches, it can do 159K".
The value this provides as an alternate source of benchmark load
generation, with a very different profile for how it exercises the
server, is good too.
Things to fix in the patch before it would be a commit candidate:
-Adjust the loop size/name, per above
-Reformat some of the longer lines to try and respect the implied right
margin in the code formatting
-Don't include the "plgsql function created." line unless in debugging mode.
-Add the docs. Focus on how this measures how fast the database can
execute SELECT statements using server-side code. An explanation that
the "transaction" block size is 512 is important to share. It also
needs a warning that time-based runs ("-T") may have to wait for a block
to finish and go beyond its normally expected end time.
-The word "via" in the "transaction type" output description is probably
not the best choice. Changing to "SELECT only using PL/pgSQL" would
translate better, and follow the standard case use for the name of that
language.
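The "-T" warning in the docs item above can be given a rough size. The sketch below estimates how long a single block takes, which bounds how far a run could go past its end time. It assumes that a block which has started always runs to completion and that total throughput is split evenly across clients; neither assumption is confirmed by the patch, and the function name is hypothetical.

```c
#include <assert.h>

/*
 * Rough upper bound, in seconds, on how far a time-based run could
 * overshoot: the time one client needs to finish one whole block.
 * Assumes throughput is shared evenly between clients.
 */
double
worst_case_overrun_seconds(int batch_size, double total_selects_per_sec,
                           int clients)
{
    double      per_client_rate;

    assert(batch_size > 0 && clients > 0 && total_selects_per_sec > 0);

    per_client_rate = total_selects_per_sec / clients;
    return batch_size / per_client_rate;
}
```

With the -P figures quoted earlier, this gives about 0.01 seconds for one client (512 / 49547) and about 0.36 seconds for 96 clients (512 / (137942 / 96)). The overrun therefore grows with client count, and would grow further on slower hardware.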
--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us