Re: PATCH: pgbench - random sampling of transaction written into log

From:	Tomas Vondra <tv(at)fuzzy(dot)cz>
To:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: PATCH: pgbench - random sampling of transaction written into log
Date:	2012-08-26 00:48:39
Message-ID:	50397267.3060405@fuzzy.cz
Lists:	pgsql-hackers

On 26.8.2012 00:19, Jeff Janes wrote:
> On Fri, Aug 24, 2012 at 2:16 PM, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>> Hi,
>>
>> attached is a patch that adds support for random sampling in pgbench, when
>> it's executed with "-l" flag. You can do for example this:
>>
>> $ pgbench -l -T 120 -R 1 db
>>
>> and then only 1% of transactions will be written into the log file. If you
>> omit the option, all the transactions are written (i.e. it's backward
>> compatible).
>
> Hi Tomas,
>
> You use the rand() function. Isn't that function not thread-safe?
> Or, if it is thread-safe, does it accomplish that with a mutex? That
> was a problem with a different rand function used in pgbench that
> Robert Haas fixed a while ago, 4af43ee3f165c8e4b332a7e680.

Hi Jeff,

Aha! Good catch. I've used rand(), which seems to be neither reentrant nor
thread-safe (unless the man page is incorrect). Anyway, using pg_erand48
or getrand seems like an appropriate fix.

> Also, what benefit is had by using modulus on rand(), rather than just
> modulus on an incrementing counter?

Hmm, I was thinking about that too, but I wasn't really sure how that
would behave with multiple SQL files etc. But now I see the files are
actually chosen randomly, so using a counter seems like a good idea.
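
For illustration, the counter variant might be sketched as follows. SampleCounter and counter_should_log are hypothetical names, and keeping one counter per thread is an assumption made here to avoid shared state:

```c
#include <stdbool.h>

/* Hypothetical per-thread counter of transactions seen so far. */
typedef struct
{
    unsigned long seen;
} SampleCounter;

/* Log one transaction out of every `every_nth`. A value of 0 or 1
 * means "log everything", which matches the unsampled behaviour. */
static bool
counter_should_log(SampleCounter *c, unsigned int every_nth)
{
    if (every_nth <= 1)
        return true;
    return (c->seen++ % every_nth) == 0;
}
```

With every_nth = 100 this keeps the 1st, 101st, 201st, ... transaction, i.e. 1% of them. Since the script file for each transaction is picked at random, the kept transactions should still be spread across the files.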

> Could you explain the value of this patch, given your other one that
> does aggregation? If both were accepted, I think I would always use
> the aggregation one in preference to this one.

The difference is that the sample contains information that is simply
unavailable in the aggregated output. For example, when using multiple
files, you can compute per-file averages from the sample, but the
aggregated output contains just a single line for all files combined.
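
As a rough sketch of the per-file computation, assuming the per-transaction log line layout "client_id transaction_no time file_no time_epoch time_us" with the latency ("time") in microseconds. FileStats, add_log_line, file_average and the MAX_FILES cap are hypothetical:

```c
#include <stdio.h>

#define MAX_FILES 16    /* hypothetical cap on the number of script files */

/* Running latency total and transaction count for one script file. */
typedef struct
{
    double sum_us;
    long   count;
} FileStats;

/* Parse one log line and add its latency to the matching file's stats.
 * Returns 1 if the line was used, 0 if it was malformed or out of range. */
static int
add_log_line(const char *line, FileStats stats[MAX_FILES])
{
    int    client_id, file_no;
    long   xact_no, time_epoch, time_us;
    double latency_us;

    if (sscanf(line, "%d %ld %lf %d %ld %ld",
               &client_id, &xact_no, &latency_us,
               &file_no, &time_epoch, &time_us) != 6)
        return 0;
    if (file_no < 0 || file_no >= MAX_FILES)
        return 0;

    stats[file_no].sum_us += latency_us;
    stats[file_no].count++;
    return 1;
}

/* Average latency in microseconds, or 0.0 if no lines were seen. */
static double
file_average(const FileStats *s)
{
    return s->count > 0 ? s->sum_us / (double) s->count : 0.0;
}
```

Feeding every sampled line through add_log_line() and then calling file_average() on each slot gives one average per file, which the aggregated output cannot provide.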

Tomas

In response to

Re: PATCH: pgbench - random sampling of transaction written into log at 2012-08-25 22:19:24 from Jeff Janes

Responses

Re: PATCH: pgbench - random sampling of transaction written into log at 2012-08-26 17:04:35 from Tomas Vondra