Re: rand48 replacement

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: rand48 replacement
Date: 2021-11-28 09:26:57
Message-ID: alpine.DEB.2.22.394.2111280837520.223291@pseudo
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hello,

>> They are not more nor less relevant than any other "random" constant. The
>> state needs a default initialization. The point of using DK's is that it
>> is somehow cannot be some specially crafted value which would have some
>> special property only know to the purveyor of the constant and could be
>> used by them to break the algorithm.
>
> Well, none of that is in the comment, which is probably just as well
> because it reads like baseless paranoia.

Sure. Welcome to cryptography:-)

> *Any* initialization vector should be as good as any other; if it's not,
> that's an algorithm fault.

Yep.

> (OK, I'll give it a pass for zeroes being bad, but otherwise not.)

Ok. We can use any non-zero constant. What's wrong with constants provided
by a Turing award computer scientist? I find them more attractive that
some stupid 0x0123456789….

>>> * Function names like these convey practically nothing to readers:
>>>
>>> +extern int64 pg_prng_i64(pg_prng_state *state); [...]
>
>> The intention is obviously "postgres pseudo-random number generator for
>> <type>". [...]
>
>> What would you suggest?
>
> We have names for these types, and those abbreviations are (mostly)
> not them. Name-wise I'd be all right with pg_prng_int64 and so on,

Ok. You prefer "uint64" to "u64".

> but I still expect that these functions' header comments should be
> explicit about uniformity and about the precise output range.

Ok.

> As an example, it's far from obvious whether the minimum value
> of pg_prng_int32 should be zero or INT_MIN.
> (Actually, I suspect you ought to provide both of those cases.)

I agree that it is not obvious. I added "p" for "positive" variants. I
found one place where one could be used.

> And the output range of pg_prng_float8 is not merely unobvious, but not
> very easy to deduce from examining the code either; not that users
> should have to.

Ok.

>>> BTW, why are we bothering with FIRST_BIT_MASK in the first place,
>>> rather than returning "v & 1" for pg_prng_bool?
>
>> Because some PRNG are very bad in the low bits, not xoroshiro stuff,
>> though.
>
> Good, but then you shouldn't write associated code as if that's still
> a problem, because you'll cause other people to think it's still a
> problem and write equally contorted code elsewhere. "v & 1" is a
> transparent way of producing a bool, while this code isn't.

"v & 1" really produces an integer, not a bool. I'd prefer to actually
generate a boolean and let the compiler optimizer do the cleaning.

Some Xoshiro-family generators have "linear artifacts in the low bits",
Although Xoroshiro128** is supposed to be immune, I thought better to keep
away from these, and I could not see why the last bit would be better than
any other bit, so taking the first looked okay to me at least.

I think that the attached v18 addresses most of your concerns.

--
Fabien.

Attachment Content-Type Size
prng-18.patch text/x-diff 67.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2021-11-28 11:51:35 Re: enhance pg_log_backend_memory_contexts() to log memory contexts of auxiliary processes
Previous Message Peter Smith 2021-11-28 07:17:52 Re: row filtering for logical replication