Re: upper planner path-ification

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: upper planner path-ification
Date: 2015-05-15 04:41:38
Message-ID: CAFjFpRc_9y1Tx=fq+82TCyJE4ToRuM+1+oonEkutKDfvMpHAnw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 14, 2015 at 7:09 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Hi,
>
> I've been pulling over Tom's occasional remarks about redoing
> grouping_planner - and maybe further layers of the planner - to work
> with Paths instead of Plans. I've had difficulty locating all of the
> relevant threads. Here's everything I've found so far, which I'm
> pretty sure is not everything:
>
> http://www.postgresql.org/message-id/17400.1311716272@sss.pgh.pa.us
> http://www.postgresql.org/message-id/2614.1375730493@sss.pgh.pa.us
> http://www.postgresql.org/message-id/22721.1385048662@sss.pgh.pa.us
>
> http://www.postgresql.org/message-id/BANLkTinDjjFHNOzESG2J2U4GOkqLu69Zqg@mail.gmail.com
> http://www.postgresql.org/message-id/8479.1418420018@sss.pgh.pa.us
>
> I think there are two separate problems here. First, there's the
> problem that grouping_planner() is complicated. It's doing cost
> comparisons, but all in ad-hoc fashion rather than using a consistent
> mechanic the way add_path() does. Generally, we can plan an aggregate
> using either (1) a hashed aggregate, (2) a sorted aggregate, or (3)
> for min or max, an index scan that just grabs the highest or lowest
> value in lieu of a full table scan. Instead of generating a plan for
> each of these possibilities, we'd like to generate paths for each one,
> and then pick one to turn into a plan. AIUI, the hope is that this
> would simplify the cost calculations, and also make it easier to
> inject other paths, such as a path where an FDW performs the
> aggregation step on the remote server.
>
> Second, there's the problem that we might like to order aggregates
> with respect to joins. If we have something like SELECT DISTINCT ON
> (foo.x) foo.x, foo.y, bar.y FROM foo, bar WHERE foo.x = bar.x, then
> (a) if foo.x has many duplicates, it will be better to DISTINCT-ify
> foo and then join to bar afterwards but (b) if foo.x = bar.x is highly
> selective, it will be better to join to bar first and then
> DISTINCT-ify the result. Currently, aggregation is always last;
> that's not great. Hitoshi Harada's proposed strategy of essentially
> figuring out where the aggregation steps can go and then re-planning
> for each one is also not great, because each join problem will be a
> subtree of the one we use for the aggregate-last strategy, and thus
> we're wasting effort by planning the same subtrees multiple times.
> Instead, we might imagine letting grouping planner fish out the best
> paths for the joinrels that represent possible aggregation points,
> generate aggregation paths for each of those, and then work out what
> additional rels need to be joined afterwards. That sounds hard, but
> not as hard as doing something sensible with what we have today.
>
>
Another futuristic avenue for optimization would be commuting Append and
Join when joining two similarly partitioned tables.

> I'm inclined to think that it would be useful to solve the first
> problem even if we didn't solve the second one right away (but that
> might be wrong). As a preparatory step, I'm thinking it would be
> sensible to split grouping_planner() into an outer function that would
> handle the addition of Limit and LockRows nodes and maybe planning of
> set operations, and an inner function that would handle GROUP BY,
> DISTINCT, and possibly window function planning.
>
> Thoughts?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2015-05-15 04:46:50 Re: best place for "rtree" strategy numbers
Previous Message Tom Lane 2015-05-15 02:35:49 Re: Fix token exceeding NAMELEN