From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: CPU costs of random_zipfian in pgbench
Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> writes:
>> I'm trying to use random_zipfian() for benchmarking of skewed data sets,
>> and I ran head-first into an issue with rather excessive CPU costs.
> If you want skewed but not especially zipfian, use exponential which is
> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute
> the harmonic number, so it depends on the parameter.
Maybe we should drop support for parameter values < 1.0, then. The idea
that pgbench is doing something so expensive as to require caching seems
flat-out insane from here. That cannot be seen as anything but a foot-gun
for unwary users. Under what circumstances would an informed user use
that random distribution rather than another far-cheaper-to-compute one?
> ... This is why I submitted a pseudo-random permutation
> function, which alas does not get much momentum from committers.
TBH, I think pgbench is now much too complex; it does not need more
features, especially not ones that need large caveats in the docs.
(What exactly is the point of having zipfian at all?)
regards, tom lane
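
For readers following the cost argument above, here is a minimal sketch of why the two cases differ. It is not pgbench's code: the function names `generalized_harmonic` and `truncated_exponential` are hypothetical, and the exponential draw is a generic inverse-transform over a truncated exponential, assumed here only for illustration. The point is that the generalized harmonic number is a sum over all n values, so its cost grows linearly with the range, while an exponential draw is a handful of floating-point operations regardless of the range.

```python
import math
import random


def generalized_harmonic(n: int, s: float) -> float:
    """Return H(n, s) = sum of 1 / i**s for i in 1..n.

    Every term has to be visited, so the cost is linear in n.
    """
    return sum(1.0 / i ** s for i in range(1, n + 1))


def truncated_exponential(lo: int, hi: int, rate: float,
                          rng: random.Random) -> int:
    """Draw an integer in [lo, hi], skewed toward lo, in constant time.

    Assumes rate > 0. Uses the inverse transform of an exponential
    distribution truncated to [0, 1), then scales to the integer range.
    """
    u = rng.random()                      # uniform in [0, 1)
    cut = math.exp(-rate)                 # mass left beyond the truncation
    x = -math.log(1.0 - (1.0 - cut) * u) / rate   # x lies in [0, 1)
    span = hi - lo + 1
    # min() guards against a floating-point result landing on span.
    return lo + min(int(span * x), span - 1)


if __name__ == "__main__":
    rng = random.Random(42)
    print(generalized_harmonic(1_000_000, 0.5))   # walks a million terms
    print([truncated_exponential(1, 100, 5.0, rng) for _ in range(10)])
```

The harmonic sum only has to be computed once per distinct (n, s) pair, which is presumably why caching came up in this thread at all; the exponential draw needs no such setup.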