From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, vik(dot)fearing(at)2ndquadrant(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: random() (was Re: New GUC to sample log queries) |
Date: | 2018-12-28 08:18:37 |
Message-ID: | alpine.DEB.2.21.1812280808490.32444@lancre |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Tom,
>>> Another idea, which would be a lot less prone to breakage by
>>> add-on code, is to change drandom() and setseed() to themselves
>>> use pg_erand48() with a private seed.
>
>> The pg_erand48 code looks like crumbs from the 70's optimized for 16 bits
>> architectures (which it is probably not, but why not going to 64 bits or
>> 128 bits directly looks like a missed opportunity), its internal state is
>> 48 bits as its name implies, and its period probably around 2**48, which
>> is 2**12 better than the previous case, not an extraordinary achievement.
>
> I can't get terribly excited about rewriting that code. You're arguing
> from a "one size should fit all" perspective,
Not exactly. I'm not claiming that distinguishing parts that require
good random from others is a bad choice. I'm arguing that:
(1) from a software engineering perspective, the PRNG implementation
should hide the underlying generator name and state size.
(2) from a numerical perspective, poor seedings practice should be avoided
when possible.
(3) from a cryptographic perspective, LCG is a poor choice of fast PRNG,
which a quick look at wikipedia (mother of all knowledge) confirms.
(4) the fact that pg historical PRNG choices are well documented is not a
good justification for not improving them.
Better alternatives exist that do not cost much (eg xorshift variants,
WELL...), some of which are optimized for 64 bits architectures.
> which is exactly not the design approach we're using.
> We've already converted security-sensitive PRNG uses to use
> pg_strong_random (or if we haven't, that's a localized bug in any such
> places we missed). What remains are places where we are not so
> concerned about cryptographic strength but rather speed.
I do agree with the speed concern. I'm arguing that better quality at
speed can be achieved with better seeding practices and not using a LCG.
About costs, not counting array accesses:
- lrand48 (48 bits state as 3 uint16) is 29 ops
(10 =, 8 *, 7 +, 4 >>)
- xorshift+ (128 bits state as 2 uint64) is 13 ops
( 5 =, 0 *, 1 +, 3 >>, 4 ^)
- xororshift128+ (idem) is 17 ops
( 6 =, 0 *, 1 +, 5 >>, 3 ^, 2 |, less if rot in hardware)
- WELL512 (512 bits state as 16 uint32) is 38 ops
(11 =, 0 *, 3 +, 7 >>, 10 ^, 4 &)
probably much better, but probably slower than the current version
I'd be of the (debatable) opinion that we could use xororshift128+,
already used by various languages, even if it fails some specialized
tests.
> It does behoove us to ensure that the seed values are unpredictable and
> that a user-controllable seed isn't used for internal operations,
Sure.
> but I don't feel a need for replacing the algorithm.
Hmmm. Does it mean that you would veto any change, even if the speed
concern is addressed (i.e. faster/not slower with better quality)?
> You might argue that the SQL function drandom should be held to a higher
> standard, but we document exactly this same tradeoff for it. Users who
> want cryptographic strength are directed to pgcrypto, leaving the audience
> for drandom being people who want speed.
My point is that speed is not necessary incompatible with better quality.
Better quality should not replace strong random when needed.
--
Fabien.
From | Date | Subject | |
---|---|---|---|
Next Message | Etsuro Fujita | 2018-12-28 08:28:11 | Re: postgres_fdw: oddity in costing aggregate pushdown paths |
Previous Message | Tatsuro Yamada | 2018-12-28 08:17:06 | Re: [HACKERS] CLUSTER command progress monitor |