Re: pgbench - add pseudo-random permutation function

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench - add pseudo-random permutation function
Date: 2021-04-02 05:38:57
Message-ID: alpine.DEB.2.22.394.2104011103460.702551@pseudo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


>> r = (uint64) (pg_erand48(random_state.xseed) * size);
>>
>> I do not understand why the random values are multiplied by anything in
>> the first place…
>
> These are just random integers in the range [0,mask] and [0,size-1],
> formed in exactly the same way as getrand().

Indeed, erand returns a double, this was the part I was missing. I did not
realize that you had switched to doubles in your approach.

I think that permute should only use integer operations. I'd suggest to
use one of the integer variants instead of going through a double
computation and casting back to int. The internal state is based on
integers, I do not see the added value of going through floats, possibly
enduring floating point issues (undeflow, rounding, normalization,
whatever) on the way, whereas from start to finish we just need ints.

See attached v27 proposal.

I still think that *rand48 is a poor (relatively small state) and
inefficient (the implementation includes packing and unpacking 16 bits
ints to build a 64 bits int) choice.

--
Fabien.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2021-04-02 05:39:20 Re: Fix pg_checksums progress report
Previous Message Julien Rouhaud 2021-04-02 05:33:28 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?