Re: Use extended statistics to estimate (Var op Var) clauses

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Use extended statistics to estimate (Var op Var) clauses
Date: 2021-07-20 18:28:20
Message-ID: c6284c56-1a83-18f3-90a7-35259f64a01e@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 7/5/21 2:46 PM, Dean Rasheed wrote:
> On Sun, 13 Jun 2021 at 21:28, Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> Here is a slightly updated version of the patch
>>
>
> Hi,
>
> I have looked at this in some more detail, and it all looks pretty
> good, other than some mostly cosmetic stuff.
>

Thanks for the review!

> The new code in statext_is_compatible_clause_internal() is a little
> hard to follow because some of the comments aren't right (e.g. when
> checking clause_expr2, it isn't an (Expr op Const) or (Const op Expr)
> as the comment says). Rather than trying to comment on each
> conditional branch, it might be simpler to just have a single
> catch-all comment at the top, and also remove the "return true" in the
> middle, to make it something like:
>
> /*
> * Check Vars appearing on either side by recursing, and make a note of
> * any expressions.
> */
> if (IsA(clause_expr, Var))
> {
> if (!statext_is_compatible_clause_internal(...))
> return false;
> }
> else
> *exprs = lappend(*exprs, clause_expr);
>
> if (clause_expr2)
> {
> if (IsA(clause_expr2, Var))
> {
> if (!statext_is_compatible_clause_internal(...))
> return false;
> }
> else
> *exprs = lappend(*exprs, clause_expr2);
> }
>
> return true;
>

I ended up doing something slightly different - examine_opclause_args
now "returns" a list of expressions, instead of explicitly setting two
parameters. That means we can do a simple foreach() here, which seems
cleaner. It means we have to extract the expressions from the list in a
couple places, but that seems acceptable. Do you agree?

I also went through the comments and updated those that seemed wrong.

> Is the FIXME comment in examine_opclause_args() necessary? The check
> for a single relation has already been done in
> clause[list]_selectivity_ext(), and I'm not sure what
> examine_opclause_args() would do differently.
>

Yeah, I came to the same conclusion.

> In mcv_get_match_bitmap(), perhaps do the RESULT_IS_FINAL() checks
> first in each loop.
>

This is how master already does that now, and I wonder if it's done in
this order intentionally. It's not clear to me doing it in the other way
would be faster?

> Also in mcv_get_match_bitmap(), the 2 "First check whether the
> constant is below the lower boundary ..." comments don't make any
> sense to me. Were those perhaps copied and pasted from somewhere else?
> They should perhaps say "Otherwise, compare the MCVItem with the
> constant" and "Otherwise compare the values from the MCVItem using the
> clause operator", or something like that.
>

Yeah, that's another bit that comes from current master - the patch just
makes a new copy of the comment. I agree it's bogus, Seems like a
remainder of the original code which did various "smart" things we
removed over time. Will fix.

> But other than such cosmetic things, I think the patch is good, and
> gives some nice estimate improvements.
>

Thanks, sounds good. I guess the last thing is maybe mentioning this in
the docs, adding an example etc.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0001-Handling-Expr-op-Expr-clauses-in-extended-stats-20210720.patch text/x-patch 25.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-07-20 18:32:27 Re: .ready and .done files considered harmful
Previous Message Andres Freund 2021-07-20 18:00:39 something is wonky with pgbench pipelining