Re: CPU costs of random_zipfian in pgbench

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Georgios Kokolatos <gkokolatos(at)pm(dot)me>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: CPU costs of random_zipfian in pgbench
Date: 2019-03-24 14:12:58
Message-ID: alpine.DEB.2.21.1903241503140.9939@lancre
Lists: pgsql-hackers

>>>>> What is the point of that, and if there is a point, why is it nowhere
>>>>> mentioned in pgbench.sgml?
>>> The attached patch simplifies the code by erroring on cache overflow,
>>> instead of the LRU replacement strategy and unhelpful final report.
>>> The above lines are removed.
> Eh? Do I understand correctly that pgbench might start failing after
> some random amount of time, instead of reporting the overflow at the
> end?

Indeed, that is what this patch would induce, although very probably after
a *short* random amount of time.

> I'm not sure that's really an improvement ...

Depends. If the cache is full, the possibly expensive constant computations
have to be repeated, which looks like a very bad idea for the user.

> Why is the cache fixed-size at all?

The number of distinct entries can diverge (grow without bound) and the
cache search is linear, so it does not seem a good idea to keep it for very
long. For instance:

\set size random(100000, 1000000)
\set i random_zipfian(1, :size, ...)

The implementation only makes some sense if there are very few values
(param & size pairs with param < 1) used in the whole script.

Tom is complaining of over-engineering, and he has a point, so I'm trying
to simplify (e.g. dropping LRU and erroring) for cases where the feature is
not really appropriate anyway.

