Re: Should we optimize the `ORDER BY random() LIMIT x` case?

From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Nico Williams <nico(at)cryptonector(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrei Lepikhov <lepihov(at)gmail(dot)com>, wenhui qiu <qiuwenhuifx(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>
Subject: Re: Should we optimize the `ORDER BY random() LIMIT x` case?
Date: 2025-05-20 08:29:30
Message-ID: CAJ7c6TNfwCJ6NgYo4+_i=6Ki=zt-femh=cmH7kA=ULnQCT2m6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Nico,

> > Can the detection of such queries be done by the yacc/bison parser
> > grammar?
>
> Maybe the `sortby` rule could check if the expression is `random()`,
> then `sort_clause` could check if `$3` is a one-item `sortby_list` of
> just `random()` and mark `$$` as special -- this should be cheap, yes?
> We'd still need to check for `LIMIT` somewhere else.

Although partially implementing an optimizer on the parser level is
possible, it doesn't strike me as a right long-term solution. Firstly,
it's a hacky solution because maybe it will work for random() but for
some reason will not work for -random()*2. Secondly, imagine adding a
dozen optimizations like this in the upcoming years and all of them
interacting with each other. Imagine you are the person who has to
maintain this and not break things when adding another optimization.
All in all, this would be a poor design choice in my opinion.

--
Best regards,
Aleksander Alekseev

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-05-20 08:47:32 Re: Regression in statement locations
Previous Message DIPESH DHAMELIYA 2025-05-20 08:14:37 Re: [PATCH] Allow parallelism for plpgsql return expression after commit 556f7b7