Re: WIP: Upper planner pathification

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Upper planner pathification
Date: 2016-03-14 01:43:59
Message-ID: 9A28C8860F777E439AA12E8AEA7694F8011CA2F2@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

I'm now checking the new planner implementation to find out the way to
integrate CustomPath to the upper planner also.
CustomPath node is originally expected to generate various kind of plan
node, not only scan/join, and its interface is designed to support them.
For example, we can expect a CustomPath that generates "CustomSort".

On the other hands, upper path consideration is more variable than the
case of scan/join path consideration. Probably, we can have no centralized
point to add custom-paths for sort, group-by, ...
So, I think we have hooks for each (supported) upper path workload.

In case of sorting for example, the best location of the hook is just
above of the Assert() in the create_ordered_paths(). It allows to compare
estimated cost between SortPath and CustomPath.
However, it does not allow to inject CustomPath(for sort) into the path
node that may involve sorting, like WindowPath or AggPath.
Thus, another hook may be put on create_window_paths and
create_grouping_paths in my thought.

Some other good idea?

Even though I couldn't check the new planner implementation entirely,
it seems to be the points below are good candidate to inject CustomPath
(and potentially ForeignScan).

- create_grouping_paths
- create_window_paths
- create_distinct_paths
- create_ordered_paths
- just below of the create_modifytable_path
(may be valuable for foreign-update pushdown)

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

> -----Original Message-----
> From: pgsql-hackers-owner(at)postgresql(dot)org
> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Tom Lane
> Sent: Saturday, March 05, 2016 3:02 AM
> To: David Rowley
> Cc: Robert Haas; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] WIP: Upper planner pathification
>
> OK, here is a version that I think addresses all of the recent comments:
>
> * I refactored the grouping-sets stuff as suggested by Robert and David.
> The GroupingSetsPath code is now used *only* when there are grouping sets,
> otherwise what you get is a plain AGG_SORTED AggPath. This allowed
> removal of a boatload of weird corner cases in the GroupingSets code path,
> so it was a good change. (Fundamentally, that's cleaning up some
> questionable coding in the grouping sets patch rather than fixing anything
> directly related to pathification, but I like the code better now.)
>
> * I refactored the handling of targetlists in createplan.c. After some
> reflection I decided that the disuse_physical_tlist callers fell into
> three separate categories: those that actually needed the exact requested
> tlist to be returned, those that wanted non-bloated tuples because they
> were going to put them into sort or hash storage, and those that needed
> grouping columns to be properly labeled. The new approach is to pass down
> a "flags" word that specifies which if any of these cases apply at a
> specific plan level. use_physical_tlist now always makes the right
> decision to start with, and disuse_physical_tlist is gone entirely, which
> should make things a bit faster since we won't uselessly construct and
> discard physical tlists. The missing logic from make_subplanTargetList
> and locate_grouping_columns is reincarnated in the physical-tlist code.
>
> * Added explicit limit/offset fields to LimitPath, as requested by Teodor.
>
> * Removed SortPath.sortgroupclauses.
>
> * Fixed handling of parallel-query fields in new path node types.
> (BTW, I found what seemed to be a couple of pre-existing bugs of
> the same kind, eg create_mergejoin_path was different from the
> other two kinds of join as to setting parallel_degree.)
>
>
> What remains to be done, IMV:
>
> * Performance testing as per yesterday's discussion.
>
> * Debug support in outfuncs.c and print_path() for new node types.
>
> * Clean up unfinished work on function header comments.
>
> * Write some documentation about how FDWs might use this.
>
> I'll work on the performance testing next. Barring unsatisfactory
> results from that, I think this could be committable in a couple
> of days.
>
> regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message James Sewell 2016-03-14 01:52:39 Re: Parallel Aggregate
Previous Message David Rowley 2016-03-14 01:30:35 Re: Parallel Aggregate