Re: Making clausesel.c Smarter

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Making clausesel.c Smarter
Date: 2017-04-03 08:59:42
Message-ID: CAKJS1f_M-KWEkaoHRsxXWfEUoq6hSfuOzY6-EF+D9QhkZ1d_OA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1 April 2017 at 16:42, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>
> I thought to take a quick look at this patch. I'll probably take a
> deeper look later, but some initial comments:
>
> -typedef struct RangeQueryClause
> +typedef struct CachedSelectivityClause
> {
> - struct RangeQueryClause *next; /* next in linked list */
> + struct CachedSelectivityClause *next; /* next in linked list */
> Node *var; /* The common variable of the clauses */
> - bool have_lobound; /* found a low-bound clause yet? */
> - bool have_hibound; /* found a high-bound clause yet? */
> + int selmask; /* Bitmask of which sel types are stored */
> Selectivity lobound; /* Selectivity of a var > something clause */
> Selectivity hibound; /* Selectivity of a var < something clause */
> -} RangeQueryClause;
>
>
> As seems customary in other code, perhaps you should define some
> HAS_X() macros for dealing with the selmask.
>
> Something like
>
> #SELMASK_HAS_LOBOUND(selmask) (((selmask) & CACHESEL_LOBOUND) != 0)
> etc..

Thanks for looking at the patch.

The problem with that is that some of the tests are doing more than
just checking a single flag is enabled.

Consider:

if ((cslist->selmask & (CACHESEL_LOBOUND | CACHESEL_HIBOUND)) ==
(CACHESEL_LOBOUND | CACHESEL_HIBOUND))

Of course, those could be Macro'd too, but it seems I might need to be
inventive with the names, or the meaning might not be all that clear.

It does not seem overly complex and it is isolated to a single file,
so perhaps it might be OK as is?

>
> +static void addCachedSelectivityRangeVar(CachedSelectivityClause
> **cslist, Node *var,
> bool varonleft, bool isLTsel, Selectivity s2);
>
> You're changing clause -> var throughout the code when dealing with
> nodes, but looking at their use, they hold neither. All those
> addCachedSelectivity functions are usually passed expression nodes. If
> you're renaming, perhaps you should rename to "expr".

hmm, this is true. I kind of followed the lead of the name of the
variable in the old RangeQueryClause struct. I have changed the names
of these to be expr in the attached, but I didn't change the name of
the Var in the new CachedSelectivityClause struct. It seems like a
change not related to this patch.

Do you think I should change that too?

>
> On Fri, Feb 24, 2017 at 7:00 AM, David Rowley
> <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> > Now one thing I was unsure of in the patch was this "Other clauses"
> > concept that I've come up with. In the patch we have:
> >
> > default:
> > addCachedSelectivityOtherClause(&cslist, var, s2);
> > break;
> >
> > I was unsure if this should only apply to F_EQSEL and F_NEQSEL. My
> > imagination here won't stretch far enough to imagine a OpExpr which
> > passes with a NULL operand. If it did my logic is broken, and if
> > that's the case then I think limiting "Others" to mean F_EQSEL and
> > F_NEQSEL would be the workaround.
>
> While not specifically applicable only to "Others", something needs
> consideration here:
>
> User-defined functions can be nonstrict. An OpExpr involving such a
> user-defined function could possibly pass on null.

Good point. I overlooked this.

> I would suggest avoiding making the assumption that it doesn't unless
> the operator is strict.
>
> One could argue that such an operator should provide its own
> selectivity estimator, but the strictness check shouldn't be too
> difficult to add, and I think it's a better choice.
>
> So you'd have to check that:
>
> default:
> if (op_strict(expr->opno) && func_strict(expr->opfuncid))
> addCachedSelectivityOtherClause(&cslist, var, s2);
> else
> s1 = s1 * s2;
> break;
>

I've changed it to something along those lines. I don't think the
func_strict here is required, though, so I've gone with just
op_strict().

Updated patch attached.

Thanks for reviewing it.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
smarter_clausesel_2017-04-03.patch application/octet-stream 19.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-04-03 09:13:45 Re: pg_partman 3.0.0 - real-world usage of native partitioning and a case for native default
Previous Message Tatsuo Ishii 2017-04-03 08:24:09 Re: Statement timeout behavior in extended queries