Re: CPU costs of random_zipfian in pgbench

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Georgios Kokolatos <gkokolatos(at)pm(dot)me>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Fabien Coelho <postgresql(dot)org(at)coelho(dot)net>
Subject: Re: CPU costs of random_zipfian in pgbench
Date: 2019-03-23 17:44:35
Message-ID: alpine.DEB.2.21.1903231802440.18811@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello Tom,

> I started to look through this, and the more I looked the more unhappy
> I got that we're having this discussion at all. The zipfian support
> in pgbench is seriously over-engineered and under-documented. As an
> example, I was flabbergasted to find out that the end-of-run summary
> statistics now include this:
>
> /* Report zipfian cache overflow */
> for (i = 0; i < nthreads; i++)
> {
> totalCacheOverflows += threads[i].zipf_cache.overflowCount;
> }
> if (totalCacheOverflows > 0)
> {
> printf("zipfian cache array overflowed %d time(s)\n", totalCacheOverflows);
> }
>
> What is the point of that, and if there is a point, why is it nowhere
> mentioned in pgbench.sgml?

Indeed, there should.

> What would a user do with this information, and how would they know what
> to do?

Sure, but it was unclear what to do. Extending the cache to avoid that
would look like over-engineering.

> I remain of the opinion that we ought to simply rip out support for
> zipfian with s < 1.

Some people really want zipfian because it reflects their data access
pattern, possibly with s < 1.

We cannot helpt it: real life seems zipfianly distributed:-)

> It's not useful for benchmarking purposes to have a random-number
> function with such poor computational properties.

This is mostly a startup cost, the generation cost when a bench is running
is reasonable. How to best implement the precomputation is an open
question.

As a reviewer I was not thrilled by the cache stuff, but I had no better
idea that would not fall under "over-over engineering" or the like.

Maybe it could error out and say "recompile me", but then someone
would have said "that is unacceptable".

Maybe it could auto extend the cache, but that is still more
unnecessary over-engineering, IMHO.

Maybe a there could be some mandatory declarations or partial eval that
could precompute the needed parameters out/before the bench is started,
with a clear message "precomputing stuff...", but that would be over over
over engineering again... and that would mean restricting random_zipfian
parameters to near-constants, which would require some explaining, but
maybe it is an option. I guess that in the paper original context, the
parameters (s & n) are known before the bench is started, so that the
needed value are computed offline once and for all.

> I think leaving it in there is just a foot-gun: we'd be a lot better off
> throwing an error that tells people to use some other distribution.

When s < 1, the startup cost is indeed a pain. However, it is a pain
prescribed by a Turing Award.

> Or if we do leave it in there, we for sure have to have documentation
> that *actually* explains how to use it, which this patch still doesn't.

I'm not sure what explaining there could be about how to use it: one calls
the function to obtain pseudo-random integers with the desired
distribution?

> There's nothing suggesting that you'd better not use a large number of
> different (n,s) combinations.

Indeed, there is no caveat about this point, as noted above.

Please find an updated patch for the documentation, pointing out the
existence of the cache and an advice not to overdo it.

It does not solve the underlying problem which raised your complaint, but
at least it is documented.

--
Fabien.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2019-03-23 18:11:44 Re: CPU costs of random_zipfian in pgbench
Previous Message Tom Lane 2019-03-23 17:01:23 Re: CPU costs of random_zipfian in pgbench