Re: Should we optimize the `ORDER BY random() LIMIT x` case?

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
Subject: Re: Should we optimize the `ORDER BY random() LIMIT x` case?
Date: 2025-05-15 09:32:35
Message-ID: 129260c2-6c02-4a49-9143-091f6bf81cd6@gmail.com
Lists: pgsql-hackers

On 15/5/2025 11:17, Aleksander Alekseev wrote:
>> What kind of optimisation trick could the optimiser use here to produce
>> an optimal plan? As I see it, it would have to assume that all the tuples
>> must be returned from the subquery. The only gain is skipping the sort of
>> the massive sample.
>
> It doesn't look like a generic optimization trick will help us. I was
> thinking about a custom aggregate function, e.g. `SELECT sample(*, 10)
> ...`. However, I doubt that aggregate functions are flexible enough. Or
> alternatively a rewrite rule. I've never dealt with those before, so I
> have no idea what I'm talking about :D
A custom SRF seems great to me. You could propose such an aggregate for
core - it seems it wouldn't even need any syntax changes. For example:
SELECT * FROM (SELECT sample(q, 10, <type>) FROM (SELECT ...) AS q);
or something like that.
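
To make the idea concrete, here is a minimal sketch of such an aggregate
built on reservoir sampling (Algorithm R), which keeps the one-pass,
no-sort property we want. It is simplified to sample bigint values into a
plain array; the names reservoir_state, reservoir_trans, reservoir_final
and sample are all hypothetical, and a real proposal would need to be
polymorphic over the input row type:

CREATE TYPE reservoir_state AS (
    cnt bigint,      -- number of input rows seen so far
    res bigint[]     -- the reservoir, at most k elements
);

CREATE FUNCTION reservoir_trans(st reservoir_state, val bigint, k int)
RETURNS reservoir_state
LANGUAGE plpgsql AS $$
DECLARE
    j bigint;
BEGIN
    IF st IS NULL THEN
        -- first call: start with an empty reservoir
        st := ROW(0, '{}')::reservoir_state;
    END IF;
    st.cnt := st.cnt + 1;
    IF st.cnt <= k THEN
        -- fill the reservoir with the first k values
        st.res := st.res || val;
    ELSE
        -- Algorithm R: keep the new value with probability k / cnt,
        -- replacing a uniformly chosen slot
        j := 1 + floor(random() * st.cnt);
        IF j <= k THEN
            st.res[j] := val;
        END IF;
    END IF;
    RETURN st;
END;
$$;

CREATE FUNCTION reservoir_final(st reservoir_state)
RETURNS bigint[]
LANGUAGE sql AS $$ SELECT (st).res $$;

CREATE AGGREGATE sample(bigint, int) (
    SFUNC = reservoir_trans,
    STYPE = reservoir_state,
    FINALFUNC = reservoir_final
);

Then something like

SELECT sample(id, 10) FROM big_table;           -- array of up to 10 ids
SELECT unnest(sample(id, 10)) FROM big_table;   -- as a result set

scans the table once and never sorts it, unlike ORDER BY random() LIMIT 10.
Note that k has to stay constant over the whole input for the sampling to
be uniform, and the aggregate returns an array rather than a set of rows,
hence the unnest() in the second form.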

--
regards, Andrei Lepikhov
