From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pgbench - add pseudo-random permutation function |
Date: | 2021-03-31 08:02:42 |
Message-ID: | alpine.DEB.2.22.394.2103310849380.411399@pseudo |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Dean,
>> First, I have a thing against erand48.
>
> Yeah, that's probably a fair point. However, all the existing pgbench
> random functions are using it, so I think it's fair enough for permute()
> to do the same (and actually 2^48 is pretty huge). Switching to a 64-bit
> PRNG might not be a bad idea, but I think that's something we'd want to
> do across the board, and so I think it should be out of scope for this
> patch.
But less likely to pass, whereas here we have an internal function that
we can set as we want.
Also, there is a 64 bits seed provided to the function which instantly
ignores 16 of them, which looks pretty silly to me.
Also, the function is named everywhere erand48 with its hardcoded int16[3]
state, which makes a poor abstraction.
At least, I suggest that two 48-bits prng could be initialized with parts
of the seed and used in different places, eg for r & m.
Also, the seed could be used to adjust the rotation, maybe.
>> I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
>> values are kept out of STEERING. [...]
>
> Ah, that's a good point. Something else that also concerned me there was
> that it might lead to 2 consecutive full shifts with nothing in between,
> which would lead to less uniform randomness (like the Irwin-Hall
> distribution). I just did a quick test without the first full shift, and
> the results do appear to be better,
Indeed, it makes sense to me.
> so removing that looks like a good idea.
>> Third, I think that the rotate code can be simplified, in particular
>> the ?: should be avoided because it may induce branches quite damaging
>> to processor performance.
>
> Yeah, I wondered about that. Perhaps there's a "trick" that can be
> used to simplify it. Pre-computing the number of bits in the mask
> would probably help.
See pg_popcount64().
--
Fabien.
From | Date | Subject | |
---|---|---|---|
Next Message | Julien Rouhaud | 2021-03-31 08:30:41 | Re: Issue with point_ops and NaN |
Previous Message | Michael Paquier | 2021-03-31 07:42:32 | Re: Proposal: Save user's original authenticated identity for logging |