Re: How to retain lesser paths at add_path()?

From: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
To: Kohei KaiGai <kaigai(at)heterodb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Richard Guo <riguo(at)pivotal(dot)io>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: How to retain lesser paths at add_path()?
Date: 2020-11-05 15:41:31
Message-ID: 20201105154131.4q3sp2iwqru4hgv2@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Tue, Jan 14, 2020 at 12:46:02AM +0900, Kohei KaiGai wrote:
> The v2 patch is attached.
>
> This adds two dedicated lists on the RelOptInfo to preserve lesser paths
> if extension required to retain the path-node to be removed in usual manner.
> These lesser paths are kept in the separated list, so it never expand the length
> of pathlist and partial_pathlist. That was the arguable point in the discussion
> at the last October.
>
> The new hook is called just before the path-node removal operation, and
> gives extension a chance for extra decision.
> If extension considers the path-node to be removed can be used in the upper
> path construction stage, they can return 'true' as a signal to preserve this
> lesser path-node.
> In case when same kind of path-node already exists in the preserved_pathlist
> and the supplied lesser path-node is cheaper than the old one, extension can
> remove the worse one arbitrarily to keep the length of preserved_pathlist.
> (E.g, PG-Strom may need one GpuJoin path-node either pathlist or preserved-
> pathlist for further opportunity of combined usage with GpuPreAgg path-node.
> It just needs "the best GpuJoin path-node" somewhere, not two or more.)
>
> Because PostgreSQL core has no information which preserved path-node can
> be removed, extensions that uses path_removal_decision_hook() has responsibility
> to keep the length of preserved_(partial_)pathlist reasonable.

Hi,

Thanks for the patch! I had a quick look at it and have a few questions:

* What would be the exact point/hook at which an extension can use
preserved pathlists? I guess it's important, since I can imagine it's
important for one of the issues mentioned in the thread about such an
extension have to re-do significant part of the calculations from
add_path.

* Do you have any benchmark results with some extension using this
hook? The idea with another pathlist of "discarded" paths sounds like
a lightweight solution, and indeed I've performed few tests with two
workloads (simple queries, queries with joins of 10 tables) and the
difference between master and patched versions is rather small (no
stable difference for the former, couple of percent for the latter).
But it's of course with an empty hook, so it would be good to see
other benchmarks as well.

* Does it make sense to something similar with add_path_precheck,
which also in some situations excluding paths?

* This part sounds dangerous for me:

> Because PostgreSQL core has no information which preserved path-node can
> be removed, extensions that uses path_removal_decision_hook() has responsibility
> to keep the length of preserved_(partial_)pathlist reasonable.

since an extension can keep limited number of paths in the list, but
then the same hook could be reused by another extension which will
also try to limit such paths, but together they'll explode.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-11-05 15:53:56 Re: redundant error messages
Previous Message Seino Yuki 2020-11-05 14:54:53 Re: enable pg_stat_statements to track rows processed by REFRESH MATERIALIZED VIEW