Re: random() (was Re: New GUC to sample log queries)

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, vik(dot)fearing(at)2ndquadrant(dot)com, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: random() (was Re: New GUC to sample log queries)
Date: 2018-12-28 08:18:37
Message-ID: alpine.DEB.2.21.1812280808490.32444@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Hello Tom,

>>> Another idea, which would be a lot less prone to breakage by
>>> add-on code, is to change drandom() and setseed() to themselves
>>> use pg_erand48() with a private seed.
>> The pg_erand48 code looks like crumbs from the 70's optimized for 16 bits
>> architectures (which it is probably not, but why not going to 64 bits or
>> 128 bits directly looks like a missed opportunity), its internal state is
>> 48 bits as its name implies, and its period probably around 2**48, which
>> is 2**12 better than the previous case, not an extraordinary achievement.
> I can't get terribly excited about rewriting that code. You're arguing
> from a "one size should fit all" perspective,

Not exactly. I'm not claiming that distinguishing parts that require
good random from others is a bad choice. I'm arguing that:

(1) from a software engineering perspective, the PRNG implementation
should hide the underlying generator name and state size.

(2) from a numerical perspective, poor seedings practice should be avoided
when possible.

(3) from a cryptographic perspective, LCG is a poor choice of fast PRNG,
which a quick look at wikipedia (mother of all knowledge) confirms.

(4) the fact that pg historical PRNG choices are well documented is not a
good justification for not improving them.

Better alternatives exist that do not cost much (eg xorshift variants,
WELL...), some of which are optimized for 64 bits architectures.

> which is exactly not the design approach we're using.

> We've already converted security-sensitive PRNG uses to use
> pg_strong_random (or if we haven't, that's a localized bug in any such
> places we missed). What remains are places where we are not so
> concerned about cryptographic strength but rather speed.

I do agree with the speed concern. I'm arguing that better quality at
speed can be achieved with better seeding practices and not using a LCG.

About costs, not counting array accesses:

- lrand48 (48 bits state as 3 uint16) is 29 ops
(10 =, 8 *, 7 +, 4 >>)
- xorshift+ (128 bits state as 2 uint64) is 13 ops
( 5 =, 0 *, 1 +, 3 >>, 4 ^)
- xororshift128+ (idem) is 17 ops
( 6 =, 0 *, 1 +, 5 >>, 3 ^, 2 |, less if rot in hardware)
- WELL512 (512 bits state as 16 uint32) is 38 ops
(11 =, 0 *, 3 +, 7 >>, 10 ^, 4 &)
probably much better, but probably slower than the current version

I'd be of the (debatable) opinion that we could use xororshift128+,
already used by various languages, even if it fails some specialized

> It does behoove us to ensure that the seed values are unpredictable and
> that a user-controllable seed isn't used for internal operations,


> but I don't feel a need for replacing the algorithm.

Hmmm. Does it mean that you would veto any change, even if the speed
concern is addressed (i.e. faster/not slower with better quality)?

> You might argue that the SQL function drandom should be held to a higher
> standard, but we document exactly this same tradeoff for it. Users who
> want cryptographic strength are directed to pgcrypto, leaving the audience
> for drandom being people who want speed.

My point is that speed is not necessary incompatible with better quality.
Better quality should not replace strong random when needed.


In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2018-12-28 08:28:11 Re: postgres_fdw: oddity in costing aggregate pushdown paths
Previous Message Tatsuro Yamada 2018-12-28 08:17:06 Re: [HACKERS] CLUSTER command progress monitor