Re: BUG #17862: Overall query cost ignores window function

From: Tim Palmer <tim3sp(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17862: Overall query cost ignores window function
Date: 2023-03-23 14:40:55
Message-ID: CAHg=Pn=NH-s-ako3R2OtB67y1VRzZF0sprRBuzMc0=T9+iVZDw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Wed, 22 Mar 2023 at 21:03, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> It likely would be possible to adjust cost_windowagg() to figure out a
> startup_cost for getting the first row from a WindowFunc. Doing so
> would require looking at the frame options and trying to figure out
> how many rows need to be looked at. If you'd written count(*) OVER
> (rows between current row and 10 following) then we'd only need to
> look forward 10 rows from the current row.

The original query is attempting to retrieve one page of data and
(simultaneously) the total number of rows available to be paged through,
like the suggestion here https://stackoverflow.com/a/28888696.
A frame_end of '10 following' wouldn't be useful for that, but there's
probably no point in counting the exact row count beyond some limit
so potentially it could use something like '500 following'.

In the specific case of an empty OVER () clause, could the startup cost
be set to the total cost of the child plan node?

> > I believe this (on a more complicated query) is affecting the plan
chosen by
> > the optimizer.

> I immediately see what alternative plans could be considered and not
> chosen as a result of this. Can you give an example?

As an example, given this table:

CREATE TABLE t AS
SELECT trunc(random() * 100) num1, trunc(random() * 100) num2
FROM generate_series(1, 10000000);

CREATE INDEX i ON t (num1);

ANALYZE t;

And this query:

SELECT t1.num1, count(*) OVER ()
FROM t t1
WHERE EXISTS (SELECT FROM t t2 WHERE t1.num1 = t2.num2)
ORDER BY t1.num1
LIMIT 10;

Here's the query plan with default settings:

Limit (cost=0.43..38.89 rows=10 width=16) (actual
time=124082.486..124082.494 rows=10 loops=1)
-> WindowAgg (cost=0.43..38452467.86 rows=10000000 width=16)
(actual time=124082.485..124082.492 rows=10 loops=1)
-> Nested Loop Semi Join (cost=0.43..38327467.86
rows=10000000 width=8) (actual time=0.137..121315.724 rows=10000000 loops=1)
Join Filter: (t1.num1 = t2.num2)
Rows Removed by Join Filter: 1003470239
-> Index Only Scan using i on t t1
(cost=0.43..259840.43 rows=10000000 width=8) (actual time=0.120..1343.223
rows=10000000 loops=1)
Heap Fetches: 0
-> Materialize (cost=0.00..243118.00 rows=10000000
width=8) (actual time=0.000..0.004 rows=101 loops=10000000)
-> Seq Scan on t t2 (cost=0.00..154055.00
rows=10000000 width=8) (actual time=0.006..0.069 rows=569 loops=1)
Planning Time: 0.193 ms
Execution Time: 124105.899 ms

But there is a cheaper plan available (enable_indexscan=off,
enable_gathermerge=off):

Limit (cost=811708.65..811708.68 rows=10 width=16) (actual
time=8414.851..8414.855 rows=10 loops=1)
-> Sort (cost=811708.65..836708.65 rows=10000000 width=16) (actual
time=8338.783..8338.787 rows=10 loops=1)
Sort Key: t1.num1
Sort Method: top-N heapsort Memory: 25kB
-> WindowAgg (cost=179057.25..595612.25 rows=10000000
width=16) (actual time=6114.888..7336.230 rows=10000000 loops=1)
-> Hash Join (cost=179057.25..470612.25 rows=10000000
width=8) (actual time=2067.306..4130.853 rows=10000000 loops=1)
Hash Cond: (t1.num1 = t2.num2)
-> Seq Scan on t t1 (cost=0.00..154055.00
rows=10000000 width=8) (actual time=0.014..725.807 rows=10000000 loops=1)
-> Hash (cost=179056.00..179056.00 rows=100
width=8) (actual time=2067.281..2067.282 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> HashAggregate (cost=179055.00..179056.00
rows=100 width=8) (actual time=2067.253..2067.264 rows=100 loops=1)
Group Key: t2.num2
Batches: 1 Memory Usage: 24kB
-> Seq Scan on t t2
(cost=0.00..154055.00 rows=10000000 width=8) (actual time=0.005..853.380
rows=10000000 loops=1)
Planning Time: 0.235 ms
JIT:
Functions: 18
Options: Inlining true, Optimization true, Expressions true,
Deforming true
Timing: Generation 1.088 ms, Inlining 2.531 ms, Optimization 51.099
ms, Emission 22.251 ms, Total 76.970 ms
Execution Time: 8416.041 ms

Changing the window to only look at 500 rows doesn't impact the
chosen plan, but (as you suggested) makes the original plan much faster.
(This is with default settings again i.e. enable_indexscan=on,
enable_gathermerge=on):

SELECT t1.num1, count(*) OVER (ROWS BETWEEN CURRENT ROW AND 500
FOLLOWING)
FROM t t1
WHERE EXISTS (SELECT FROM t t2 WHERE t1.num1 = t2.num2)
ORDER BY t1.num1
LIMIT 10;

Limit (cost=0.43..38.89 rows=10 width=16) (actual time=4.489..4.587
rows=10 loops=1)
-> WindowAgg (cost=0.43..38452467.86 rows=10000000 width=16)
(actual time=4.487..4.581 rows=10 loops=1)
-> Nested Loop Semi Join (cost=0.43..38327467.86
rows=10000000 width=8) (actual time=0.129..4.274 rows=511 loops=1)
Join Filter: (t1.num1 = t2.num2)
Rows Removed by Join Filter: 13286
-> Index Only Scan using i on t t1
(cost=0.43..259840.43 rows=10000000 width=8) (actual time=0.030..0.226
rows=511 loops=1)
Heap Fetches: 0
-> Materialize (cost=0.00..243118.00 rows=10000000
width=8) (actual time=0.000..0.003 rows=27 loops=511)
-> Seq Scan on t t2 (cost=0.00..154055.00
rows=10000000 width=8) (actual time=0.009..0.018 rows=27 loops=1)
Planning Time: 0.427 ms
Execution Time: 4.657 ms

Tim

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Devrim Gündüz 2023-03-23 15:03:24 Re: BUG #17861: Disappearance of packages in repo for rhel-8
Previous Message Devrim Gündüz 2023-03-23 13:08:46 Re: BUG #17849: python3-etcd Missing from the postgres common repo for RHEL8