Re: Support getrandom() for pg_strong_random() source

From:	Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>
To:	Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc:	Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Daniel Gustafsson <daniel(at)yesql(dot)se>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Support getrandom() for pg_strong_random() source
Date:	2025-07-30 13:25:38
Message-ID:	87y0s5hktp.fsf@wibble.ilmari.org
Lists:	pgsql-hackers

Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org> writes:

> Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> writes:
>
>> On Tue, Jul 29, 2025 at 8:55 AM Jacob Champion
>> <jacob(dot)champion(at)enterprisedb(dot)com> wrote:
>>>
>>> On Mon, Jul 28, 2025 at 6:30 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>>> > My understanding of the problem is that it is a choice of efficiency
>>> > vs entropy, and that it's not really possible to have both parts of
>>> > the cake.
>>
>> Agreed. I think the optimal choice would depend on the specific use
>> case. For instance, since UUIDs are not intended for security
>> purposes, they don't require particularly high entropy. In UUID
>> generation, the efficiency of random data generation tends to be
>> prioritized over the quality of randomness.
>>
>>>
>>> > Could getentropy() be more efficient at the end on most platforms,
>>> > meaning that this could limit the meaning of having a GUC switch?
>>>
>>> I don't know. [2] implies that the performance comparison depends on
>>> several factors, and falls in favor of OpenSSL when the number of
>>> bytes per call is large -- but our use of pg_strong_random() is
>>> generally on small buffers. We would need to do a _lot_ more research
>>> before, say, switching any defaults.
>>
>> The performance issue with getentropy, particularly when len=1024,
>> likely stems from the need for multiple getentropy() calls due to its
>> 256-byte length restriction.
>>
>> Analysis of RAND_bytes() through strace reveals that it internally
>> makes calls to getrandom() with a fixed length of 32 bytes. While I'm
>> uncertain of the exact purpose, it's logical that a single
>> getentropy() call would be more efficient than RAND_bytes(), which
>> involves additional overhead beyond just calling getrandom(),
>> especially when dealing with smaller byte sizes.
>>
>> I've updated the patch to support getentropy() instead of getrandom().
>
> Thanks, just a few comments:
>
> The blog post at
> https://dotat.at/@/2024-10-01-getentropy.html#portability-of-getentropy-
> points out a couple of caveats:
>
> * Originally getentropy() was declared in <sys/random.h> but POSIX
> declares it in <unistd.h>. You need to include both headers to be
> sure.
>
> So the probes need to include both <sys/random.h> (if available) and
> <unistd.h>,

I realised I got the conditional for this wrong, since
cdata.get('HAVE_SYS_RANDOM_H') can return either the integer 1 or the
boolean false, so it needs to be format()-ed and compared to a string.
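
For illustration only, here is a minimal C sketch of the two caveats
discussed above: including both headers, and working around the
256-byte limit of getentropy() for larger buffers. This is not taken
from the attached patch. The helper name fill_random() and the
ENTROPY_CHUNK constant are hypothetical, and the __has_include check
stands in for a configure-time macro such as HAVE_SYS_RANDOM_H.

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>			/* POSIX declares getentropy() here */

/*
 * Older platforms declare getentropy() in <sys/random.h> instead.  A real
 * build would test a configure-time macro (e.g. HAVE_SYS_RANDOM_H); the
 * __has_include check is only an assumption to keep this sketch standalone.
 */
#if defined(__has_include)
#if __has_include(<sys/random.h>)
#include <sys/random.h>
#endif
#endif

/* getentropy() fails for requests larger than 256 bytes */
#define ENTROPY_CHUNK 256

/*
 * Hypothetical helper: fill buf with len random bytes, calling
 * getentropy() as many times as needed.  Returns false on failure.
 */
static bool
fill_random(void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len > 0)
	{
		size_t		chunk = (len < ENTROPY_CHUNK) ? len : ENTROPY_CHUNK;

		if (getentropy(p, chunk) != 0)
			return false;
		p += chunk;
		len -= chunk;
	}
	return true;
}
```

With this shape, a 1024-byte request costs four getentropy() calls,
while the small buffers typical of pg_strong_random() callers need
only one.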

Updated patch attached.

- ilmari

Attachment	Content-Type	Size
v4-0001-Support-getentropy-as-source-of-pg_strong_random-.patch	text/x-diff	6.2 KB

In response to

Re: Support getrandom() for pg_strong_random() source at 2025-07-30 11:50:49 from Dagfinn Ilmari Mannsåker
