Re: pgbench - add pseudo-random permutation function

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench - add pseudo-random permutation function
Date: 2021-03-12 09:43:59
Message-ID: alpine.DEB.2.22.394.2103121031420.599618@pseudo
Lists: pgsql-hackers


Hello Dean,

> The implementation looks plausible too, though it adds quite a large
> amount of new code.

A significant part of this new code is the multiply-modulo
implementation, which can be dropped if we assume that the target has
int128 available, and accept that the feature is not available otherwise.
Also, there are quite a lot of comments which add to the code length.
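
To illustrate why the multiply-modulo part takes space when no 128-bit integer type is available, here is a minimal sketch of the classic shift-and-add technique. It is not taken from the patch: the name `mulmod64` and the restriction to moduli below 2**63 are assumptions made for the example. Python integers never overflow, so the comments only note why each intermediate value would still fit in an unsigned 64-bit word.

```python
def mulmod64(a: int, b: int, m: int) -> int:
    """Return (a * b) % m while keeping every intermediate value below 2**64.

    Hypothetical helper for illustration; assumes 0 < m < 2**63.
    """
    if not 0 < m < (1 << 63):
        raise ValueError("modulus must satisfy 0 < m < 2**63")
    a %= m
    b %= m
    result = 0
    while b:
        if b & 1:
            # result < m and a < m, so the sum is below 2**64
            result = (result + a) % m
        # a < m < 2**63, so doubling stays below 2**64
        a = (a << 1) % m
        b >>= 1
    return result
```

With a 128-bit type the whole loop collapses to a single widening multiplication followed by a modulo, which is the simplification mentioned above.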

> The main thing that concerns me is justifying the code. With this kind
> of thing, it's all too easy to overlook corner cases and end up with
> trivial sequences in certain special cases. I'd feel better about that
> if we were implementing a standard algorithm with known pedigree.

Yep. I did not find any existing algorithm that convincingly meets the
requirements: generate a permutation, be parametric, have a low constant
cost, be of good quality, and work on arbitrary sizes…

> Thinking about the use case for this, it seems that it's basically
> designed to turn a set of non-uniform random numbers (produced by
> random_exponential() et al.) into another set of non-uniform random
> numbers, where the non-uniformity is scattered so that the more/less
> common values aren't all clumped together.

Yes.

> I'm wondering if that's something that can't be done more simply by
> passing the non-uniform random numbers through the uniform random
> number generator to scatter them uniformly across some range -- e.g.,
> given an integer n, return the n'th value from the sequence produced
> by random(), starting from some initial seed -- i.e., implement
> nth_random(lb, ub, seed, n). That would actually be pretty
> straightforward to implement using O(log(n)) time to execute (see the
> attached python example), though it wouldn't generate a permutation,
> so it'd need a bit of thought to see if it met the requirements.

Indeed, this violates two requirements: constant cost & permutation.
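
For readers without the attachment, the following is a rough sketch of how such an nth_random could reach the n'th value in O(log(n)) steps. It is not the attached example: it assumes a 64-bit linear congruential generator as a stand-in for random() (using Knuth's MMIX constants), and the helper name `lcg_skip` is hypothetical. The skip-ahead composes the affine step x -> A*x + C with itself by repeated squaring.

```python
A = 6364136223846793005  # multiplier of Knuth's MMIX LCG
C = 1442695040888963407  # increment of Knuth's MMIX LCG
M = 1 << 64              # modulus

def lcg_skip(seed: int, n: int) -> int:
    """State reached after n LCG steps from seed, in O(log n) operations."""
    acc_mul, acc_add = 1, 0   # identity map x -> x
    cur_mul, cur_add = A, C   # map for 2**k steps, starting with k = 0
    while n > 0:
        if n & 1:
            # apply the current 2**k-step map on top of the accumulated one
            acc_mul = (acc_mul * cur_mul) % M
            acc_add = (acc_add * cur_mul + cur_add) % M
        # square the current map: 2**k steps become 2**(k+1) steps
        cur_add = ((cur_mul + 1) * cur_add) % M
        cur_mul = (cur_mul * cur_mul) % M
        n >>= 1
    return (acc_mul * seed + acc_add) % M

def nth_random(lb: int, ub: int, seed: int, n: int) -> int:
    """Map the n'th generator state into the inclusive range [lb, ub]."""
    return lb + (lcg_skip(seed, n) >> 32) % (ub - lb + 1)
```

The loop runs once per bit of n, hence the logarithmic rather than constant cost. Nothing prevents two different values of n from landing on the same output, so evaluating it for every n in 0..size-1 will typically produce duplicates and leave some values unused, which is why it is not a permutation.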

--
Fabien.
