Re: pgbench cpu overhead (was Re: lazy vxid locks, v1)

From: Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: pgbench cpu overhead (was Re: lazy vxid locks, v1)
Date: 2011-07-24 10:50:37
Message-ID: 4E2BF8FD.6060505@kaltenbrunner.cc
Lists: pgsql-hackers

On 07/24/2011 03:50 AM, Jeff Janes wrote:
> On Mon, Jun 13, 2011 at 7:03 AM, Stefan Kaltenbrunner
> <stefan(at)kaltenbrunner(dot)cc> wrote:
>> On 06/13/2011 01:55 PM, Stefan Kaltenbrunner wrote:
>>
>> [...]
>>
>>> all those tests are done with pgbench running on the same box - which
>>> has a noticeable impact on the results because pgbench is using ~1 core
>>> per 8 cores of the backend tested in cpu resources - though I don't think
>>> it causes any changes in the results that would show the performance
>>> behaviour in a different light.
>>
>> actually testing against sysbench with the very same workload shows the
>> following performance behaviour:
>>
>> with 40 threads(aka the peak performance point):
>>
>> pgbench: 223308 tps
>> sysbench: 311584 tps
>>
>> with 160 threads (backend contention dominated):
>>
>> pgbench: 57075 tps
>> sysbench: 43437 tps
>>
>>
>> so it seems that sysbench actually has significantly less overhead than
>> pgbench and the lower throughput at the higher concurrency seems to be
>> caused by sysbench being able to stress the backend even more than
>> pgbench can.
>>
>>
>> for those curious - the profile for pgbench looks like:
>>
>> samples  %        symbol name
>> 29378    41.9087  doCustom
>> 17502    24.9672  threadRun
>> 7629     10.8830  pg_strcasecmp
>> 5871      8.3752  compareVariables
>> 2568      3.6633  getVariable
>> 2167      3.0913  putVariable
>> 2065      2.9458  replaceVariable
>> 1971      2.8117  parseVariable
>> 534       0.7618  xstrdup
>> 278       0.3966  xrealloc
>> 137       0.1954  xmalloc
>
> Hi Stefan,
>
> How was this profile generated? I get a similar profile using
> --enable-profiling and gprof, but I find it not believable. The
> complete absence of any calls to libpq is not credible. I don't know
> about your profiler, but with gprof they should be listed in the call
> graph even if they take a negligible amount of time. So I think
> pgbench is linking to libpq libraries that do not themselves support
> profiling (I have no idea how that could happen though). If the call
> graphs are not getting recorded correctly, surely the timing can't be
> reliable either.

hmm - the profile was generated using oprofile, but now that you
mention this aspect, I suppose that this was a profile run without
opcontrol --separate=lib...
I'm not currently in a position to retest that - but maybe you could do
a run?

>
> (I also tried profiling pgbench with "perf", but in that case I get
> nothing other than kernel and libc calls showing up. I don't know
> what that means)
>
> To support this, I've dummied up doCustom so that it does all the work of
> deciding what needs to happen, executing the metacommands,
> interpolating the variables into the SQL string, but then simply
> refrains from calling the PQ functions to send and receive the query
> and response. (I had to make a few changes around the select loop in
> threadRun to support this).
>
> The result is that the dummy pgbench is charged with only 57% more CPU
> time than the stock one, but it gets over 10 times as many
> "transactions" done. I think this supports the notion that the CPU
> bottleneck is not in pgbench.c, but somewhere in the libpq or the
> kernel.
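
A rough back-of-the-envelope reading of the two ratios quoted above, as a minimal sketch: it assumes the dummy run does the same non-libpq work per transaction as the stock run, and it treats "over 10 times" as exactly 10, so the result is only a bound. The function name `client_cpu_share` is made up for illustration and is not part of pgbench.

```python
# Back-of-the-envelope check of the dummy-pgbench result quoted above.
# Inputs are the two ratios from the quoted message; everything else is derived.
# Assumption: the dummy run performs the same pgbench.c work per transaction
# as the stock run, minus the PQ send/receive calls.


def client_cpu_share(cpu_ratio: float, txn_ratio: float) -> float:
    """Per-transaction CPU of the dummy run relative to the stock run."""
    return cpu_ratio / txn_ratio


cpu_ratio = 1.57  # dummy run is charged 57% more CPU time than stock
txn_ratio = 10.0  # "over 10 times" as many transactions; 10 is a lower bound

share = client_cpu_share(cpu_ratio, txn_ratio)
print(f"pgbench.c logic: at most ~{share:.0%} of stock per-transaction CPU")
print(f"libpq + kernel + libc: at least ~{1 - share:.0%}")
```

Under those assumptions the pgbench.c logic accounts for roughly a sixth of the per-transaction CPU charged to the stock client, which is consistent with the quoted conclusion that the bottleneck sits in libpq or the kernel rather than in pgbench.c itself.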

interesting - iirc we actually had some reports about current libpq
behaviour causing scaling issues on some OSes - see
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00748.php and
some related threads. Iirc the final patch for that was never applied
though, and the original author lost interest. I think that I was able to
measure some noticeable performance gains back in the day, but I don't
think I still have the numbers anywhere.

Stefan
