Re: pgbench - add pseudo-random permutation function

From: David Bowen <dmb0317(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench - add pseudo-random permutation function
Date: 2021-03-11 19:06:06
Message-ID: CANO09fgLiT3BfAfjTSrBeg+eQGXbe_AtTVMEBwrqraQMbB=+8w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The algorithm for generating a random permutation with a uniform
distribution across all permutations is easy:
for (i=0; i<n; i++) {
swap a[n-i] with a[rand(n-i+1)]
}

where 0 <= rand(x) < x and a[i] is initially i (see Knuth, Section 3.4.2
Algorithm P)

David Bowen

On Thu, Mar 11, 2021 at 9:32 AM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
wrote:

> On Thu, 11 Mar 2021 at 00:58, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> >
> > Maybe Dean Rasheed can help because of his math background --- CC'ing
> him.
> >
>
> Reading the thread I can see how such a function might be useful to
> scatter non-uniformly random values.
>
> The implementation looks plausible too, though it adds quite a large
> amount of new code. The main thing that concerns me is justifying the
> code. With this kind of thing, it's all too easy to overlook corner
> cases and end up with trivial sequences in certain special cases. I'd
> feel better about that if we were implementing a standard algorithm
> with known pedigree.
>
> Thinking about the use case for this, it seems that it's basically
> designed to turn a set of non-uniform random numbers (produced by
> random_exponential() et al.) into another set of non-uniform random
> numbers, where the non-uniformity is scattered so that the more/less
> common values aren't all clumped together.
>
> I'm wondering if that's something that can't be done more simply by
> passing the non-uniform random numbers through the uniform random
> number generator to scatter them uniformly across some range -- e.g.,
> given an integer n, return the n'th value from the sequence produced
> by random(), starting from some initial seed -- i.e., implement
> nth_random(lb, ub, seed, n). That would actually be pretty
> straightforward to implement using O(log(n)) time to execute (see the
> attached python example), though it wouldn't generate a permutation,
> so it'd need a bit of thought to see if it met the requirements.
>
> Regards,
> Dean
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2021-03-11 19:21:22 Re: New IndexAM API controlling index vacuum strategies
Previous Message Andres Freund 2021-03-11 18:58:47 Re: A qsort template