Re: pgbench - add pseudo-random permutation function

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Hironobu SUZUKI <hironobu(at)interdb(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgbench - add pseudo-random permutation function
Date: 2021-03-31 13:06:13
Message-ID: CAEZATCUMbvg0xOsi1+J1LCV0yz+r8MNESf4sEy++Z9V8GPEO3Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 31 Mar 2021 at 09:02, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> wrote:
>
> >> First, I have a thing against erand48.
> >
> Also, there is a 64 bits seed provided to the function which instantly
> ignores 16 of them, which looks pretty silly to me.
>

Yeah, that was copied from set_random_seed().

> At least, I suggest that two 48-bits prng could be initialized with parts
> of the seed and used in different places, eg for r & m.
>

That could work. I'd certainly feel better about that than
implementing a whole new PRNG.

> Also, the seed could be used to adjust the rotation, maybe.
>

Perhaps. I'm not sure it's really necessary though.

> >> I'm really at odds with FULL SHIFT 1, because it means that up to 1/256 of
> >> values are kept out of STEERING. [...]
> >
> > Ah, that's a good point. Something else that also concerned me there was
> > that it might lead to 2 consecutive full shifts with nothing in between,
> > which would lead to less uniform randomness (like the Irwin-Hall
> > distribution). I just did a quick test without the first full shift, and
> > the results do appear to be better,
>
> Indeed, it makes sense to me.
>

OK, attached is an update making this change and simplifying the
rotate code, which hopefully just leaves the question of what (if
anything) to do with pg_erand48().

> >> Third, I think that the rotate code can be simplified, in particular
> >> the ?: should be avoided because it may induce branches quite damaging
> >> to processor performance.
> >
> > Yeah, I wondered about that. Perhaps there's a "trick" that can be
> > used to simplify it. Pre-computing the number of bits in the mask
> > would probably help.
>
> See pg_popcount64().
>

Actually, I used pg_leftmost_one_pos64() to calculate the mask length,
allowing the mask to be computed from that, so there is no longer a
need for compute_mask(), which seems like a neat little
simplification.

Regards,
Dean

Attachment Content-Type Size
pgbench-prp-func-25.patch text/x-patch 15.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2021-03-31 13:06:43 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Previous Message Amit Langote 2021-03-31 12:54:04 Re: making update/delete of inheritance trees scale better