Re: random() (was Re: New GUC to sample log queries)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Vik Fearing <vik(dot)fearing(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: random() (was Re: New GUC to sample log queries)
Date: 2018-12-26 19:31:06
Message-ID: 5659.1545852666@sss.pgh.pa.us
Lists: pgsql-hackers

Peter Geoghegan <pg(at)bowt(dot)ie> writes:
> On Wed, Dec 26, 2018 at 10:45 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I wonder whether we should establish a project policy to avoid use
>> of random() for internal purposes, ie try to get to a point where
>> drandom() is the only caller in the backend. A quick grep says
>> that there's a dozen or so callers, so this patch certainly isn't
>> the only offender ... but should we make an effort to convert them
>> all to use, say, pg_erand48()? I think all the existing callers
>> could happily share a process-wide random state, so we could make
>> a wrapper that's no harder to use than random().

> I've used setseed() to make nbtree's "getting tired" behavior
> deterministic for specific test cases I've developed -- the random()
> choice of whether to split a page full of duplicates, or continue
> right in search of free space becomes predictable. I've used this to
> determine whether my nbtree patch's pg_upgrade'd indexes have
> precisely the same behavior as v3 indexes on the master branch
> (precisely the same in terms of the structure of the final index
> following a bulk load).

TBH, I'd call it a bug --- maybe even a low-grade security hazard
--- that it's possible to affect that from user level.

In fact, contemplating that for a bit: it is possible, as things
stand in HEAD, for a user to control which of his statements will
get logged if the DBA has enabled log_statement_sample_rate.
It doesn't take a lot of creativity to think of ways to abuse that.
So maybe Coverity had the right idea to start with.

There might well be debugging value in affecting internal PRNG usages,
but let's please not think it's a good idea that that's trivially
reachable from SQL.
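
[Editorial sketch, not part of the original message.] The separation argued for above can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the backend's actual code: POSIX erand48() stands in for pg_erand48(), and every function name here (internal_prng_seed, internal_random_fraction, user_setseed, should_log_statement) is hypothetical. The point is only that internal decisions draw from a private process-wide state, so reseeding the user-visible generator cannot influence them.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical process-wide state used only by internal callers.
 * Nothing reachable from user level writes to this array.
 */
static unsigned short internal_prng_state[3] = {0x330e, 0xabcd, 0x1234};

/* Hypothetical: seed the internal state (e.g. once at process start). */
static void
internal_prng_seed(unsigned short a, unsigned short b, unsigned short c)
{
    internal_prng_state[0] = a;
    internal_prng_state[1] = b;
    internal_prng_state[2] = c;
}

/*
 * Hypothetical wrapper, meant to be no harder to use than random().
 * Returns a double in [0.0, 1.0) drawn from the private state.
 */
static double
internal_random_fraction(void)
{
    return erand48(internal_prng_state);
}

/*
 * Stand-in for a user-level setseed(): it reseeds random() only and
 * never touches internal_prng_state.
 */
static void
user_setseed(unsigned int seed)
{
    srandom(seed);
}

/*
 * Hypothetical sampling decision, in the spirit of a statement-sampling
 * rate setting: log the statement with probability sample_rate.
 */
static int
should_log_statement(double sample_rate)
{
    assert(sample_rate >= 0.0 && sample_rate <= 1.0);
    return internal_random_fraction() < sample_rate;
}
```

With this arrangement, calling user_setseed() between two draws leaves the sequence returned by internal_random_fraction() unchanged, so a user cannot steer should_log_statement(). A debugging hook that reseeds the internal state could still exist, as long as it is not trivially reachable from SQL.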

regards, tom lane
