Planner deficiencies with views that use windowing functions

From: Daniel Grace <dgrace(at)wingsnw(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Planner deficiencies with views that use windowing functions
Date: 2010-06-29 21:51:19
Message-ID: AANLkTik-vXDXQmGQisJ13ip1siAqAF498g_jG4MB-K4D@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I found some interesting discrepancies with the query planner when
working with views (and presumably table subqueries) that contain
windowing functions.  This is all tested on 9.0beta2, though I suspect
8.4 has the same limitations.

The short version is:
- If querying a windowing view and asking only for columns that are
not derived from window functions, the window functions are still
applied even though they would have no impact in the output.
- If querying a windowing view using a WHERE clause, the entire view
must still be scanned once for each window function involved.  In some
cases this is to be expected (otherwise row values could vary based on
whether the where clause was specified or not) -- however:
* If the WHERE clause includes a column that is named by PARTITION
BY for every involved windowing function, it is guarunteed (I think?)
to not have any impact on the actual results and thus that portion of
it ought to be folded in to the original query.
* If the WHERE clause includes a column named in some, but not all
PARTITION BYs, the planner might be able to limit what rows are
scanned for those WindowAggs
* If my logic is correct (it might not be), and the WHERE clause
looks like "WHERE a=1 AND b=2", and each partition names at least one
of those columns, the inner query for the view itself could be
rewritten to "WHERE a=1 OR b=2" (note change of AND to OR). The
windowing functions would produce erroneous results for rows that
match only one condition, but it should be correct for rows that match
both -- and the rows that don't will be filtered out anyways.
- Note that by 'all windowing functions', I mean only those that are
named as columns in the outer select.

The long version, with test cases and EXPLAIN output, is the rest of
this message:

DROP SCHEMA IF EXISTS windowtest CASCADE;
CREATE SCHEMA windowtest;
SET search_path=windowtest;

CREATE TABLE numbers AS
SELECT
f.v::INTEGER AS v,
(f.v % 321)::INTEGER AS c -- Give us something to partition by.
FROM GENERATE_SERIES(1, 10000) AS f(v);

ALTER TABLE numbers ADD PRIMARY KEY(c,v);

-- This is the basic query we'll use.
EXPLAIN ANALYZE SELECT
c, v,
FIRST_VALUE(v) OVER next_v_in_c AS next_v,
LAST_VALUE(v) OVER same_thousands AS last_in_thousands
FROM
numbers
WINDOW
next_v_in_c AS ( PARTITION BY c ORDER BY v ASC ROWS BETWEEN 1
FOLLOWING AND 1 FOLLOWING ),
same_thousands AS ( PARTITION BY c, FLOOR(v/1000) RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
;

-- We'll also use a view based on the same query.
CREATE VIEW numbers_view AS SELECT
c, v,
FIRST_VALUE(v) OVER next_v_in_c AS next_v,
LAST_VALUE(v) OVER same_thousands AS last_in_thousands
FROM
numbers
WINDOW
next_v_in_c AS ( PARTITION BY c ORDER BY v ASC ROWS BETWEEN 1
FOLLOWING AND 1 FOLLOWING ),
same_thousands AS ( PARTITION BY c, FLOOR(v/1000) RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
;

-- Case 1: Against the original query, let's ask for some data that doesn't
-- use any of the window functions but still defines the windows
EXPLAIN ANALYZE SELECT
c, v
FROM
numbers
WINDOW
next_v_in_c AS ( PARTITION BY c ORDER BY v ASC ROWS BETWEEN 1
FOLLOWING AND 1 FOLLOWING ),
same_thousands AS ( PARTITION BY c, FLOOR(v/1000) RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
;
-- Seq Scan on numbers (cost=0.00..220.00 rows=10000 width=8) (actual
time=0.011..4.572 rows=10000 loops=1)
-- Total runtime: 6.657 ms
-- Summary: The planner notices the windows aren't actually being used, so
-- it eliminates them.

-- Case 2: What if we still use the windowing functions, but only want a
-- couple categories?
EXPLAIN ANALYZE SELECT
c, v,
FIRST_VALUE(v) OVER next_v_in_c AS next_v,
LAST_VALUE(v) OVER same_thousands AS last_in_thousands
FROM
numbers
WHERE
c < 5
WINDOW
next_v_in_c AS ( PARTITION BY c ORDER BY v ASC ROWS BETWEEN 1
FOLLOWING AND 1 FOLLOWING ),
same_thousands AS ( PARTITION BY c, FLOOR(v/1000) RANGE BETWEEN
UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING )
;
-- "WindowAgg (cost=531.93..623.59 rows=3333 width=8) (actual
time=0.496..0.719 rows=159 loops=1)"
-- " -> Sort (cost=531.93..540.26 rows=3333 width=8) (actual
time=0.492..0.524 rows=159 loops=1)"
-- " Sort Key: c, (floor(((v / 1000))::double precision))"
-- " Sort Method: quicksort Memory: 28kB"
-- " -> WindowAgg (cost=0.00..336.91 rows=3333 width=8)
(actual time=0.026..0.399 rows=159 loops=1)"
-- " -> Index Scan using numbers_pkey on numbers
(cost=0.00..278.58 rows=3333 width=8) (actual time=0.010..0.153
rows=159 loops=1)"
-- " Index Cond: (c < 5)"
-- "Total runtime: 0.785 ms"
-- Summary: Works about as well as we can hope for.

-- Case 3: What if we ask for rows from the view, but we don't ask for any
-- window functions?
EXPLAIN ANALYZE SELECT c, v FROM numbers_view;
-- "Subquery Scan on numbers_view (cost=1289.64..1664.64 rows=10000
width=8) (actual time=28.309..47.456 rows=10000 loops=1)"
-- " -> WindowAgg (cost=1289.64..1564.64 rows=10000 width=8)
(actual time=28.307..42.888 rows=10000 loops=1)"
-- " -> Sort (cost=1289.64..1314.64 rows=10000 width=8)
(actual time=28.301..30.477 rows=10000 loops=1)"
-- " Sort Key: numbers.c, (floor(((numbers.v /
1000))::double precision))"
-- " Sort Method: quicksort Memory: 950kB"
-- " -> WindowAgg (cost=0.00..625.25 rows=10000
width=8) (actual time=0.019..23.280 rows=10000 loops=1)"
-- " -> Index Scan using numbers_pkey on numbers
(cost=0.00..450.25 rows=10000 width=8) (actual time=0.009..8.984
rows=10000 loops=1)"
-- "Total runtime: 49.513 ms"
-- Summary: Wait a second here. We're performing the windowing
functions on the view, even though the work of performing them is
ultimately discarded?

-- The planner also isn't optimal when using a WHERE clause against the view
-- In some cases, this makes sense:
-- Case 4:
EXPLAIN ANALYZE SELECT * FROM numbers_view WHERE v < 600;
-- "Subquery Scan on numbers_view (cost=1289.64..1689.64 rows=3333
width=16) (actual time=28.859..45.885 rows=599 loops=1)"
-- " Filter: (numbers_view.v < 600)"
-- " -> WindowAgg (cost=1289.64..1564.64 rows=10000 width=8)
(actual time=28.854..43.079 rows=10000 loops=1)"
-- " -> Sort (cost=1289.64..1314.64 rows=10000 width=8)
(actual time=28.847..30.928 rows=10000 loops=1)"
-- " Sort Key: numbers.c, (floor(((numbers.v /
1000))::double precision))"
-- " Sort Method: quicksort Memory: 950kB"
-- " -> WindowAgg (cost=0.00..625.25 rows=10000
width=8) (actual time=0.038..23.701 rows=10000 loops=1)"
-- " -> Index Scan using numbers_pkey on numbers
(cost=0.00..450.25 rows=10000 width=8) (actual time=0.021..9.106
rows=10000 loops=1)"
-- "Total runtime: 46.092 ms"
-- Optimizing rows 600+ out of the result would result in the returned
-- records differing from what they would be on a full scan of the view, which
-- is bad: As a view should act exactly like a table in this context -- any
-- given tuple's values should be the same regardless of the WHERE clause that
-- retrieved it. Thus, the behavior here is more or less correct.

-- Case 5:
EXPLAIN ANALYZE SELECT * FROM numbers_view WHERE v < 50;
-- In this case, however, the planner COULD be smart enough to know that there
-- are no values of c higher than 50 in this case and optimize accordingly.
-- This is minor, however.

-- Case 6:
EXPLAIN ANALYZE SELECT * FROM numbers_view WHERE c < 5;
-- "Subquery Scan on numbers_view (cost=1289.64..1689.64 rows=3333
width=16) (actual time=28.981..45.939 rows=159 loops=1)"
-- " Filter: (numbers_view.c < 5)"
-- " -> WindowAgg (cost=1289.64..1564.64 rows=10000 width=8)
(actual time=28.977..43.221 rows=10000 loops=1)"
-- " -> Sort (cost=1289.64..1314.64 rows=10000 width=8)
(actual time=28.971..31.089 rows=10000 loops=1)"
-- " Sort Key: numbers.c, (floor(((numbers.v /
1000))::double precision))"
-- " Sort Method: quicksort Memory: 950kB"
-- " -> WindowAgg (cost=0.00..625.25 rows=10000
width=8) (actual time=0.033..23.904 rows=10000 loops=1)"
-- " -> Index Scan using numbers_pkey on numbers
(cost=0.00..450.25 rows=10000 width=8) (actual time=0.017..9.184
rows=10000 loops=1)"
-- "Total runtime: 46.066 ms"
-- c is named in every windowing function's PARTITION BY, thus
filtering by it at the view level would have no impact on the actual
results returned.
-- Thus, this SHOULD optimize just like case 2.

--
Daniel Grace
AGE, LLC
System Administrator and Software Developer

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Guillaume Lelarge 2010-06-29 21:51:48 Re: Cannot cancel the change of a tablespace
Previous Message David Christensen 2010-06-29 21:28:10 Re: GSoC - code of implementation of materialized views