Re: Patch to add functionality to specify ORDER BY in CREATE FUNCTION for SRFs

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Patch to add functionality to specify ORDER BY in CREATE FUNCTION for SRFs
Date: 2015-01-06 06:50:58
Message-ID: CAFjFpRcTr-A6x_XwQmv+DWEkVs_Ti_v6P3XLLOKCEOZOQmc3ag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 5, 2015 at 8:42 PM, Atri Sharma <atri(dot)jiit(at)gmail(dot)com> wrote:

>
> Hi All,
>
> Please forgive if this is a repost.
>
> Please find attached patch for supporting ORDER BY clause in CREATE
> FUNCTION for SRFs. Specifically:
>
> CREATE OR REPLACE FUNCTION func1(OUT e int, OUT f int) returns setof
> record as ' SELECT a,b FROM table1 ORDER BY a; ' language 'sql' ORDER BY e;
>
> This shall allow for capturing information about existing preorder that
> might be present inherently in the SRF's input or algorithm (the above
> example and think generate_series).
>
> This allows for eliminating sorts that can be based off the known existing
> preorder. For eg:
>
> SELECT * FROM correct_order_singlecol() ORDER BY e; # Does not need to
> sort by e since existing preorder is known.
>
> Eliminating such sorts can be a huge gain, especially if the expected
> input to needed Sort node is large.
>
> The obvious question that comes is what happens if specified ORDER BY
> clause is false. For checking the order, a new node is added which is top
> node of the plan and is responsible for projecting result rows. It tracks
> the previous row seen and given a sort order, ensures that the current
> tuple to be projected is in the required sort order.
>
> So, for above example
>
> EXPLAIN (COSTS OFF) SELECT * FROM correct_order_multicol() ORDER BY e;
> QUERY PLAN
> -----------------------------------------------
> OrderCheck
> -> Function Scan on correct_order_multicol
> (2 rows)
>
>
> If order of result rows is not the same as required, an error is raised:
>
> SELECT * FROM incorrect_order_nulls() ORDER BY e NULLS LAST;
> ERROR: Order not same as specified
>
>
>
> Preorder columns are first transformed into SortGroupClauses first and
> then stored directly in pg_proc.
>
>
> This functionality is a user case seen functionality, and is especially
> useful when SRF inputs are large and/or might be pipelined from another
> function (SRFs are used in pipelines in analytical systems many times, with
> large data).
>
> The overhead of this patch is small. A new path is added for the preorder
> keys, and OrderCheck node's additional cost is pretty low, given that it
> only compares two rows and stores only a single row (previous row seen),
> hence the memory footprint is minuscule.
>
>
We can eliminate the new node and put onus or having the right order on the
user like we do with volatile setting of the function.

> In the inner joins thread, Tom mentioned having a new node which has
> multiple plans and executor can decide which plan to execute given runtime
> conditions. I played around with the idea, and am open to experiment having
> a new node which has a Sort based plan and is executed in case OrderCheck
> node sees that the inherent order of result tuples is not correct. Feedback
> here would be very welcome.
>
>
> I will add the patch to current commitfest.
>
> Thoughts?
>
> Regards,
>
> Atri
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Atri Sharma 2015-01-06 06:53:53 Re: Patch to add functionality to specify ORDER BY in CREATE FUNCTION for SRFs
Previous Message Amit Langote 2015-01-06 06:40:40 Re: On partitioning