|From:||Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>|
|To:||Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>|
|Cc:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Georgios Kokolatos <gkokolatos(at)pm(dot)me>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>|
|Subject:||Re: CPU costs of random_zipfian in pgbench|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
Hello Tom & Tomas,
>> If the choice is between reporting the failure to the user, and
>> addressing the failure, surely the latter would be the default option?
>> Particularly if the user can't really address the issue easily
>> (recompiling psql is not very practical solution).
>>>> I remain of the opinion that we ought to simply rip out support for
>>>> zipfian with s < 1.
>> +1 to that
> If this is done, some people with zipfian distribution that currently
> work might be unhappy.
After giving it some thought, I think that this cannot be fully fixed for
The attached patch removes the code for param in (0, 1), and slightly
improve the documentation about the performance, if you want to proceed.
For s > 1, there is no such constraint, and it works fine, there is no
reason to remove it.
Given the constraint of Jim Gray's approximated method for s in (0, 1),
which really does zipfian for the first two integers and then uses an
exponential approximation, the only approach is that the parameters must
be computed in a partial eval preparation phase before the bench code is
run. This means that only (mostly) constants would be allowed as
parameters when s is in (0, 1), but I think that this is acceptable
because anyway the method fundamentaly requires it. I think that it can be
implemented reasonably well (meaning not too much code), but would
requires a few round of reviews if someone implements it (for a reminder,
I was only the reviewer on this one). An added benefit would be that the
parameter cache could be shared between thread, which would be a good
The attached other attached patch illustrate what I call poor performance
for stupid parameters (no point in doing zipfian on 2 integers…) :
./pgbench -T 3 -D n=2 -D s=1.01 -f zipf_perf.sql # 46981 tps
./pgbench -T 3 -D n=2 -D s=1.001 -f zipf_perf.sql # 6187 tps
./pgbench -T 3 -D n=2 -D s=1.0001 -f zipf_perf.sql # 710 tps
./pgbench -T 3 -D n=100 -D s=1.01 -f zipf_perf.sql # 142910 tps
./pgbench -T 3 -D n=100 -D s=1.001 -f zipf_perf.sql # 21214 tps
./pgbench -T 3 -D n=100 -D s=1.0001 -f zipf_perf.sql # 2466 tps
./pgbench -T 3 -D n=1000000 -D s=1.01 -f zipf_perf.sql # 376453 tps
./pgbench -T 3 -D n=1000000 -D s=1.001 -f zipf_perf.sql # 57441 tps
./pgbench -T 3 -D n=1000000 -D s=1.0001 -f zipf_perf.sql # 6780 tps
Maybe the implementation could impose that s is at least 1.001 to avoid
the lower performance?
|Next Message||Tom Lane||2019-03-24 18:06:59||Re: warning to publication created and wal_level is not set to logical|
|Previous Message||David Fetter||2019-03-24 17:54:48||Re: warning to publication created and wal_level is not set to logical|