From: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Window Function "Run Conditions" |
Date: | 2022-04-07 03:45:56 |
Message-ID: | CALNJ-vTaG0i+Sx4bruXDKNnppfMJS0nt7E=7Roo7VXAuxZFWiw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Apr 6, 2022 at 7:36 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On Wed, 6 Apr 2022 at 00:59, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> >
> > On Tue, Apr 5, 2022 at 7:49 PM David Rowley <dgrowleyml(at)gmail(dot)com>
> wrote:
> >> Yeah, there is more performance to be had than even what you've done
> >> there. There's no reason really for spool_tuples() to do
> >> tuplestore_puttupleslot() when we're not in run mode.
> >
> >
> > Yeah, this is a great idea.
>
> I've attached an updated patch that does most of what you mentioned.
> To make this work I had to add another state to the WindowAggStatus.
> This new state is what the top-level WindowAgg will move into when
> there's a PARTITION BY clause and the run condition becomes false.
> The new state is named WINDOWAGG_PASSTHROUGH_STRICT, which does all
> that WINDOWAGG_PASSTHROUGH does plus skips tuplestoring tuples during
> the spool. We must still spool those tuples when we're not the
> top-level WindowAgg so that we can send those out to any calling
> WindowAgg nodes. They'll need those so they return the correct result.
>
> This means that for intermediate WindowAgg nodes, when the
> runcondition becomes false, we only skip evaluation of WindowFuncs.
> WindowAgg nodes above us cannot reference these, so there's no need to
> evaluate them, plus, if there's a run condition then these tuples will
> be filtered out in the final WindowAgg node.
>
> For the top-level WindowAgg node, when the run condition becomes false
> we can save quite a bit more work. If there's no PARTITION BY clause,
> then we're done. Just return NULL. When there is a PARTITION BY
> clause we move into WINDOWAGG_PASSTHROUGH_STRICT which allows us to
> skip both the evaluation of WindowFuncs and also allows us to consume
> tuples from our outer plan until we get a tuple belonging to another
> partition. No need to tuplestore these tuples as they're being
> filtered out.
>
> Since intermediate WindowAggs cannot filter tuples, all the filtering
> must occur in the top-level WindowAgg. This cannot be done by way of
> the run condition as the run condition is special as when it becomes
> false, we don't check again to see if it's become true. A sort node
> between the WindowAggs can change the tuple order (i.e previously
> monotonic values may no longer be monotonic) so it's only valid to
> evaluate the run condition that's meant for the WindowAgg node it was
> intended for. To filter out the tuples that don't match the run
> condition from intermediate WindowAggs in the top-level WindowAgg,
> what I've done is introduced quals for WindowAgg nodes. This means
> that we can now see Filter in EXPLAIN For WindowAgg and "Rows Removed
> by Filter".
>
> Why didn't I just do the filtering in the outer query like was
> happening before? The problem is that when we push the quals down
> into the subquery, we don't yet have knowledge of which order that the
> WindowAggs will be evaluated in. Only run conditions from
> intermediate WindowAggs will ever make it into the Filter, and we
> don't know which one the top-level WindowAgg will be until later in
> planning. To do the filtering in the outer query we'd need to push
> quals back out the subquery again. It seems to me to be easier and
> better to filter them out lower down in the plan.
>
> Since the top-level WindowAgg node can now filter tuples, the executor
> node had to be given a for(;;) loop so that it goes around again for
> another tuple after it filters a tuple out.
>
> I've also updated the commit message which I think I've made quite
> clear about what we optimise and how it's done.
>
> > And I would suggest the below fastpath for this feature.
> > - if
> (check_and_push_window_quals(subquery, rte, rti, clause))
> > + if (!subquery->hasWindowFuncs ||
> check_and_push_window_quals(subquery, rte, rti, clause))
>
> Good idea. Thanks!
>
> David
>
Hi,
+ * We must keep the original qual in place if there is a
+ * PARTITION BY clause as the top-level WindowAgg remains in
+ * pass-through mode and does nothing to filter out unwanted
+ * tuples.
+ */
+ *keep_original = false;
The comment talks about keeping original qual but the assignment uses the
value false.
Maybe the comment can be rephrased so that it matches the assignment.
Cheers
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2022-04-07 04:05:19 | Re: Speed up transaction completion faster after many relations are accessed in a transaction |
Previous Message | David Rowley | 2022-04-07 03:25:06 | Re: How about a psql backslash command to show GUCs? |