Skip site navigation (1) Skip section navigation (2)

Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)
Date: 2016-08-22 21:40:23
Message-ID: 20160822214023.aaxz5l4igypowyri@alap3.anarazel.de (view raw, whole thread or download thread mbox)
Thread:
Lists: pgsql-hackers
Hi,

as noted in [1] I started hacking on removing the current implementation
of SRFs in the targetlist (tSRFs henceforth). IM discussion brought the
need for a description of the problem, need and approach to light.

There are several reasons for wanting to get rid of tSRFs. The primary
ones in my opinion are that the current behaviour of several SRFs in one
targetlist is confusing, and that the implementation burden currently is
all over the executor.  Especially the latter is what is motivating me
working on this, because it blocks my work on making the executor faster
for queries involving significant amounts of tuples.  Batching is hard
if random places in the querytree can icnrease the number of tuples.

The basic idea, hinted at in several threads, is, at plan time, to convert a query like
SELECT generate_series(1, 10);
into
SELECT generate_series FROM ROWS FROM(generate_series(1, 10));

thereby avoiding the complications in the executor (c.f. execQual.c
handling of isDone/ExprMultipleResult and supporting code in many
executor nodes / node->*.ps.ps_TupFromTlist).

There are several design questions along the way:

1) How to deal with the least-common-multiple behaviour of tSRFs. E.g.
=# SELECT generate_series(1, 3), generate_series(1,2);
returning
┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               1 │
│               2 │               2 │
│               3 │               1 │
│               1 │               2 │
│               2 │               1 │
│               3 │               2 │
└─────────────────┴─────────────────┘
(6 rows)
but
=# SELECT generate_series(1, 3), generate_series(5,7);
returning
┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               5 │
│               2 │               6 │
│               3 │               7 │
└─────────────────┴─────────────────┘

discussion in this thread came, according to my reading, to the
conclusion that that behaviour is just confusing and that the ROWS FROM
behaviour of
=# SELECT * FROM ROWS FROM(generate_series(1, 3), generate_series(1,2));
┌─────────────────┬─────────────────┐
│ generate_series │ generate_series │
├─────────────────┼─────────────────┤
│               1 │               1 │
│               2 │               2 │
│               3 │          (null) │
└─────────────────┴─────────────────┘
(3 rows)

makes more sense.  We also discussed erroring out if two SRFs  return
differing amount of rows, but that seems not to be preferred so far. And
we can easily add it if we want.


2) A naive conversion to ROWS FROM, like in the example in the
   introductory paragraph, can change the output, when implemented as a
   join from ROWS FROM to the rest of the query, rather than the other
   way round. E.g.
=# EXPLAIN SELECT * FROM few, ROWS FROM(generate_series(1,10));
┌──────────────────────────────────────────────────────────────────────────────┐
│                                  QUERY PLAN                                  │
├──────────────────────────────────────────────────────────────────────────────┤
│ Nested Loop  (cost=0.00..36.03 rows=2000 width=8)                            │
│   ->  Function Scan on generate_series  (cost=0.00..10.00 rows=1000 width=4) │
│   ->  Materialize  (cost=0.00..1.03 rows=2 width=4)                          │
│         ->  Seq Scan on few  (cost=0.00..1.02 rows=2 width=4)                │
└──────────────────────────────────────────────────────────────────────────────┘
(4 rows)
=# SELECT * FROM few, ROWS FROM(generate_series(1,3));
┌────┬─────────────────┐
│ id │ generate_series │
├────┼─────────────────┤
│  1 │               1 │
│  2 │               1 │
│  1 │               2 │
│  2 │               2 │
│  1 │               3 │
│  2 │               3 │
└────┴─────────────────┘
(6 rows)
surely isn't what was intended.  So the join order needs to be enforced.

3) tSRFs are evaluated after GROUP BY, and window functions:
=# SELECT generate_series(1, count(*)) FROM (VALUES(1),(2),(10)) f;
┌─────────────────┐
│ generate_series │
├─────────────────┤
│               1 │
│               2 │
│               3 │
└─────────────────┘
which means we have to push the "original" query into a subquery, with
the ROWS FROM laterally referencing the subquery:
SELECT generate_series FROM (SELECT count(*) FROM (VALUES(1),(2),(10)) f) s, ROWS FROM (generate_series(1,s.count));

4) The evaluation order of tSRFs in combination with ORDER BY is a bit
   confusing. Namely tSRFs are implemented after ORDER BY has been
   evaluated, unless the ORDER BY references the SRF.
E.g.
=# SELECT few.id, generate_series FROM ROWS FROM(generate_series(1,3)),few ORDER BY few.id DESC;
might return
┌────┬─────────────────┐
│ id │ generate_series │
├────┼─────────────────┤
│ 24 │               3 │
│ 24 │               2 │
│ 24 │               1 │
..
instead of
┌────┬─────────────────┐
│ id │ generate_series │
├────┼─────────────────┤
│ 24 │               1 │
│ 24 │               2 │
│ 24 │               3 │
as before.

which means we'll sometimes have to push down the ORDER BY into the
subquery (when not referencing tSRFs, so they're evaluated first),
sometimes evaluate them on the outside (if tSRFs are referenced)

5) tSRFs can have tSRFs as argument, e.g.:
=# SELECT generate_series(1, generate_series(1,3));
┌─────────────────┐
│ generate_series │
├─────────────────┤
│               1 │
│               1 │
│               2 │
│               1 │
│               2 │
│               3 │
└─────────────────┘
that can quite easily be implemented by having the "nested" tSRF
evaluate as a separate ROWS FROM expression.

Which even allows us to implement the previously forbidden
=# SELECT generate_series(generate_series(1,3), generate_series(2,4));
ERROR:  0A000: functions and operators can take at most one set argument

- not that I think that's of great value ;)

6) SETOF record type functions cannot directly be used in ROWS FROM() -
   as ROWS FROM "expands" records returned by functions. When converting
   something like
CREATE OR REPLACE FUNCTION setof_record_sql() RETURNS SETOF record LANGUAGE sql AS $$SELECT 1 AS a, 2 AS b UNION ALL SELECT 1, 2;$$;
SELECT setof_record_sql();
   we don't have that available though.

The best way to handle that seems to be to introduce the ability for
ROWS FROM not to expand the record returned by a column.  I'm currently
thinking that something like ROWS FROM(setof_record_sql() AS ()) would
do the trick.  That'd also considerably simplify the handling of
functions returning known composite types - my current POC patch
generates a ROW(a,b,..) type expression for those.

I'm open to better syntax suggestions.

7) ROWS FROM () / functions in the FROM list are currently signifcantly
   slower than the equivalent in the target list (for SFRM_ValuePerCall
   SRFs at least):

=# COPY (SELECT generate_series(1,10000000)) TO '/dev/null';
COPY 10000000
Time: 1311.469 ms
=# COPY (SELECT * FROM generate_series(1,10000000)) TO '/dev/null';
LOG:  00000: temporary file: path "base/pgsql_tmp/pgsql_tmp702.0", size 140000000
LOCATION:  FileClose, fd.c:1484
COPY 10000000
Time: 2173.282 ms
for SRFM_Materialize SRFs there's no meaningufl difference:
CREATE FUNCTION plpgsql_generate_series(bigint, bigint) RETURNS SETOF bigint LANGUAGE plpgsql AS $$BEGIN RETURN QUERY SELECT generate_series($1, $2);END;$$;

=# COPY (SELECT plpgsql_generate_series(1,10000000)) TO '/dev/null';
LOG:  00000: temporary file: path "base/pgsql_tmp/pgsql_tmp702.2", size 180000000
COPY 10000000
Time: 3058.437 ms

=# COPY (SELECT * FROM plpgsql_generate_series(1,10000000)) TO '/dev/null';LOG:  00000: temporary file: path "base/pgsql_tmp/pgsql_tmp702.1", size 180000000
COPY 10000000
Time: 2964.661 ms

that makes sense, because nodeFunctionscan.c, via
ExecMakeTableFunctionResult, forces materialization of ValuePerCall
SRFs.

ISTM that we need should fix that by allowing ValuePerCall without
materialization, as long as EXEC_FLAG_BACKWARD isn't required.


I've implemented ([2]) a prototype of this. My basic approach is:

I) During parse-analysis, remember whether a query has any tSRFs
   (Query->hasTargetSRF). That avoids doing a useless pass over the
   query, if no tSRFs are present.
II) At the beginning of subquery_planner(), before doing any work
   operating on subqueries and such, implement SRFs if ->hasTargetSRF().
   (unsrfify() in the POC)
III) Unconditionally move the "current" query into a subquery. For that
   do a mutator pass over the query, replacing Vars/Aggrefs/... in the
   original targetlist with Var references to the new subquery.
   (unsrfify_reference_subquery_mutator() in the POC)
IV) Do a pass over the outer query's targetlist, and implement any tSRFs
   using a ROWS FROM() RTE (or multiple ones in case of nested tSRFs).
   (unsrfify_implement_srfs_mutator() in the POC)

that seems to mostly work well.

The behaviour changes this implies are:

a) Least-common-multiple behaviour, as in (1) above, is gone. I think
   that's good.

b) We currently allow tSRFs in UPDATE ... SET expressions. I don't
   actually know what that's supposed to mean. I'm inclined
   a;
=# CREATE TABLE blarg AS SELECT 1::int a;
SELECT 1
=# UPDATE blarg SET a = generate_series(2,3);
UPDATE 1
=# SELECT * FROM blarg ;
┌───┐
│ a │
├───┤
│ 2 │
└───┘
I'm inclined to think that that's a bad idea, and should rather be
forbidden.

c) COALESCE/CASE have, so far, shortcut tSRF expansion. E.g.
   SELECT id, COALESCE(1, generate_series(1,2)) FROM (VALUES(1),(2)) few(id);
   returns only two rows, despite the generate_series().  But by
   implementing the generate_series as a ROWS FROM, it'd return four.

I think that's ok.

d) Not a problem with the patch per-se, but I'm doubful that that's ok:
=# SELECT 1 ORDER BY generate_series(1, 10);
returns 10 rows ;) - maybe we should forbid that?

As the patch currently stands, the diffstat is
 56 files changed, 953 insertions(+), 1599 deletions(-)
which isn't bad. I'd guess that a few more lines are needed, but I'd
still bet it's a net negative code-wise.

Regards,

Andres Freund

[1] http://archives.postgresql.org/message-id/20160801082346.nfp2g7mg74alifdc%40alap3.anarazel.de
[2] http://archives.postgresql.org/message-id/20160804032203.jprhdkx273sqhksd%40alap3.anarazel.de


In response to

Responses

pgsql-hackers by date

Next:From: Peter GeogheganDate: 2016-08-22 21:43:13
Subject: Re: Parallel tuplesort (for parallel B-Tree index creation)
Previous:From: Alvaro HerreraDate: 2016-08-22 21:33:57
Subject: Re: pg_bsd_indent - improvements around offsetof and sizeof

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group