Re: [PATCH] Overestimated filter cost and its mitigation

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Yuto Hayamizu <y(dot)hayamizu(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Overestimated filter cost and its mitigation
Date: 2017-11-06 04:31:59
Message-ID: CAEepm=2hEdXzVWvDtRFodkmi8uX0CHhTFNP6gmGo1ordC+Ugvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 11, 2017 at 7:43 PM, Yuto Hayamizu <y(dot)hayamizu(at)gmail(dot)com> wrote:
> Suppose that there are three qual clauses in a scan node, current
> postgres estimates per-tuple cost of the filter to be:
> cost(A) + cost(B) + cost(C)
>
> And basic idea of the attached patch is:
> cost(A) + clauselist_selectivity({A}) * cost(B) +
> clauselist_selectivity({A, B}) * cost(C)

I am no planner expert and I haven't tested or studied the patch in
detail, but here is some feedback for what it's worth.

This idea seems to makes intuitive sense. I see that you use
order_qual_clauses() to know what order they'll run in, so I'm
wondering if there is any reason we shouldn't do it up front and keep
it during path building, instead of running it again at plan creation
time. Is there some way it could ever produce a different result at
the two times? Why not also apply this logic to qpquals of joins,
foreign scans, subplans? That is, instead of replacing cost_qual_eval
with this code for baserels, I wonder if we should teach
cost_qual_eval how to do this so those other users could also benefit
(having perhaps already ordered the qual clauses earlier).

This is one of those patches that risks having an impact on many query
plans. Yikes. Of all the regression test queries, only
updatable_views complained though, and that involves leakproof
functions. I see that those get some kind of special treatment in
order_qual_clauses().

+ ordered_clauses = order_qual_clauses(root, rel->baserestrictinfo);
+ clause_list = ordered_clauses;

Is clause_list necessary? Can't you just use ordered_clauses for the
outer and inner loops?

+ List *clause_list_for_sel = NULL;

The convention is to use NIL for empty lists (a nod to the Lisp
machine they prototyped this project on).

+ /* Make a temporary clause list for selectivity calcuation */

s/calcuation/calculation/

--
Thomas Munro
http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-11-06 04:34:00 Re: [Sender Address Forgery]Re: path toward faster partition pruning
Previous Message Amit Langote 2017-11-06 04:30:02 Re: [Sender Address Forgery]Re: path toward faster partition pruning