From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Undocumented Order By vs Target List Volatile Function Behavior |
Date: | 2022-07-21 20:20:01 |
Message-ID: | CAKFQuwZ3-XGfcS+CLTAYvPx3ARYjUxv+=YL8sOicV0nda=T5cA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hey,
This came up today on twitter as a claimed POLA violation:
postgres=# select random(), random() order by random();
random | random
---------------------+---------------------
0.08176638503720679 | 0.08176638503720679
(1 row)
Which was explained long ago by Tom as:
https://www.postgresql.org/message-id/9570.1193941378%40sss.pgh.pa.us
The parser makes it behave equivalent to:
SELECT random() AS foo ORDER BY foo;
Which apparently extends to any column, even aliased ones, that use the
same expression:
postgres=# select random() as foo, random() as foo2 order by foo;
foo | foo2
--------------------+--------------------
0.7334292196943459 | 0.7334292196943459
(1 row)
The documentation does say:
"A query using a volatile function will re-evaluate the function at every
row where its value is needed."
https://www.postgresql.org/docs/current/xfunc-volatility.html
That sentence is insufficient to explain why, without the order by, the
system chooses to evaluate random() twice, while with order by it does so
only once.
I propose extending the existing ORDER BY paragraph in the SELECT Command
Reference as follows:
"A limitation of this feature is that an ORDER BY clause applying to the
result of a UNION, INTERSECT, or EXCEPT clause can only specify an output
column name or number, not an expression."
Add:
A side-effect of this feature is that ORDER BY expressions containing
volatile functions will execute the volatile function only once for the
entire row; thus any column expressions using the same function will reuse
the same function result. By way of example, note the output differences
for the following two queries:
postgres=# select random() as foo, random()*1 as foo2 from
generate_series(1,2) order by foo;
foo | foo2
--------------------+--------------------
0.2631492904302788 | 0.2631492904302788
0.9019166692448664 | 0.9019166692448664
(2 rows)
postgres=# select random() as foo, random() as foo2 from
generate_series(1,2);
foo | foo2
--------------------+--------------------
0.7763978178239725 | 0.3569212477832773
0.7360531822096732 | 0.7028952103643864
(2 rows)
David J.
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2022-07-21 20:32:38 | Re: System column support for partitioned tables using heap |
Previous Message | Jonathan S. Katz | 2022-07-21 20:08:54 | Re: Handle infinite recursion in logical replication setup |