Quick Links

Re: CPU costs of random_zipfian in pgbench

From:	Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To:	Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Georgios Kokolatos <gkokolatos(at)pm(dot)me>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: CPU costs of random_zipfian in pgbench
Date:	2019-03-24 18:05:49
Message-ID:	alpine.DEB.2.21.1903241823070.9939@lancre
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hello Tom & Tomas,

>> If the choice is between reporting the failure to the user, and
>> addressing the failure, surely the latter would be the default option?
>> Particularly if the user can't really address the issue easily
>> (recompiling psql is not very practical solution).
>>
>>>> I remain of the opinion that we ought to simply rip out support for
>>>> zipfian with s < 1.
>>
>> +1 to that
>
> If this is done, some people with zipfian distribution that currently
> work might be unhappy.

After giving it some thought, I think that this cannot be fully fixed for
12.

The attached patch removes the code for param in (0, 1), and slightly
improve the documentation about the performance, if you want to proceed.

For s > 1, there is no such constraint, and it works fine, there is no
reason to remove it.

Given the constraint of Jim Gray's approximated method for s in (0, 1),
which really does zipfian for the first two integers and then uses an
exponential approximation, the only approach is that the parameters must
be computed in a partial eval preparation phase before the bench code is
run. This means that only (mostly) constants would be allowed as
parameters when s is in (0, 1), but I think that this is acceptable
because anyway the method fundamentaly requires it. I think that it can be
implemented reasonably well (meaning not too much code), but would
requires a few round of reviews if someone implements it (for a reminder,
I was only the reviewer on this one). An added benefit would be that the
parameter cache could be shared between thread, which would be a good
thing.

The attached other attached patch illustrate what I call poor performance
for stupid parameters (no point in doing zipfian on 2 integers…) :

./pgbench -T 3 -D n=2 -D s=1.01 -f zipf_perf.sql # 46981 tps
./pgbench -T 3 -D n=2 -D s=1.001 -f zipf_perf.sql # 6187 tps
./pgbench -T 3 -D n=2 -D s=1.0001 -f zipf_perf.sql # 710 tps

./pgbench -T 3 -D n=100 -D s=1.01 -f zipf_perf.sql # 142910 tps
./pgbench -T 3 -D n=100 -D s=1.001 -f zipf_perf.sql # 21214 tps
./pgbench -T 3 -D n=100 -D s=1.0001 -f zipf_perf.sql # 2466 tps

./pgbench -T 3 -D n=1000000 -D s=1.01 -f zipf_perf.sql # 376453 tps
./pgbench -T 3 -D n=1000000 -D s=1.001 -f zipf_perf.sql # 57441 tps
./pgbench -T 3 -D n=1000000 -D s=1.0001 -f zipf_perf.sql # 6780 tps

Maybe the implementation could impose that s is at least 1.001 to avoid
the lower performance?

--
Fabien.

Attachment	Content-Type	Size
pgbench-remove-zipf-below-one-1.patch	text/x-diff	10.3 KB
zipf_perf.sql	application/x-sql	33 bytes

In response to

Re: CPU costs of random_zipfian in pgbench at 2019-03-24 14:34:33 from Fabien COELHO

Responses

Re: CPU costs of random_zipfian in pgbench at 2019-03-24 18:27:46 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2019-03-24 18:06:59	Re: warning to publication created and wal_level is not set to logical
Previous Message	David Fetter	2019-03-24 17:54:48	Re: warning to publication created and wal_level is not set to logical