Re: Should we optimize the `ORDER BY random() LIMIT x` case?

From: Vik Fearing <vik(at)postgresfriends(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
Subject: Re: Should we optimize the `ORDER BY random() LIMIT x` case?
Date: 2025-05-16 21:10:49
Message-ID: c724e28b-3888-4e8a-8187-a5802d226f2d@postgresfriends.org
Lists: pgsql-hackers


On 16/05/2025 15:01, Tom Lane wrote:
> Aleksander Alekseev <aleksander(at)timescale(dot)com> writes:
>> If I'm right about the limitations of aggregate functions and SRFs
>> this leaves us the following options:
>> 1. Changing the constraints of aggregate functions or SRFs. However I
>> don't think we want to do it for such a single niche scenario.
>> 2. Custom syntax and a custom node.
>> 3. To give up
> Seems to me the obvious answer is to extend TABLESAMPLE (or at least, some
> of the tablesample methods) to allow it to work on a subquery.

Isn't this a job for <fetch first clause>?

Example:

SELECT ...
FROM ... JOIN ...
FETCH SAMPLE FIRST 10 ROWS ONLY

Then the nodeLimit could do some sort of reservoir sampling.
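
For reference, the reservoir step is essentially classic Algorithm R:
keep the first N rows, then replace a random reservoir slot with
probability N/i when row i arrives.  A minimal standalone C sketch,
purely illustrative (not actual executor code, and ignoring the slot
machinery nodeLimit would really need):

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define SAMPLE_SIZE 10

    int
    main(void)
    {
        int     reservoir[SAMPLE_SIZE];
        long    seen = 0;
        int     value;

        srandom((unsigned int) time(NULL));

        /* Read one integer per line as a stand-in for input tuples. */
        while (scanf("%d", &value) == 1)
        {
            if (seen < SAMPLE_SIZE)
            {
                /* Fill the reservoir with the first SAMPLE_SIZE rows. */
                reservoir[seen] = value;
            }
            else
            {
                /*
                 * Replace a random slot with probability
                 * SAMPLE_SIZE / (seen + 1); modulo bias ignored here.
                 */
                long    j = random() % (seen + 1);

                if (j < SAMPLE_SIZE)
                    reservoir[j] = value;
            }
            seen++;
        }

        for (long i = 0; i < SAMPLE_SIZE && i < seen; i++)
            printf("%d\n", reservoir[i]);

        return 0;
    }

The nice property is that it needs only one pass and O(N) memory, so it
fits naturally where the Limit node already sits.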

There are several enhancements to <fetch first clause> coming down the
pipe; this could be one of them.

--

Vik Fearing
