Re: Implement predicate propagation for non-equivalence clauses

From: Andres Freund <andres(at)anarazel(dot)de>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: Richard Guo <riguo(at)pivotal(dot)io>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implement predicate propagation for non-equivalence clauses
Date: 2019-02-03 10:04:51
Message-ID: 20190203100451.zwkl4umvzccfgtze@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi,

On 2018-11-22 19:15:42 +0300, Alexander Kuzmenkov wrote:
> Hi Richard,
>
> I took a look at the v2, here are some comments:
>
> * This feature needs tests, especially for the cases where opfamilies or
> data types or collations don't match, and other non-obvious cases where it
> shouldn't work.
>
>
> * Deducing an inequality to a constant is not always helpful. If we know
> that a = b and a = const1 and b < const2, we will deduce b = const1 from the
> EC, and adding b < const2 doesn't improve selectivity and only makes the
> cost estimate worse. One situation where it does make sense is when we can
> detect contradictions in these clauses, e.g. if we know that const1 > const2
> and therefore know that the above selection clause is always false. Looking
> at the regression test changes, I see that v2 doesn't do that. I think the
> handling of deduced inequalities shoud be modeled on the flow of
> generate_base_implied_equalities -> process_implied_equality ->
> distribute_qual_to_rels. This would allow us to correctly handle a deduced
> const1 < const2 clause and turn it into a gating Result qual.
>
>
> * There are functions named generate_implied_quals, gen_implied_quals and
> gen_implied_qual. The names are confusingly similar, we could use something
> like generate_implied_quals_for_clause and generate_one_implied_qual for the
> latter two.
>
>
> @@ gen_implied_quals
>     else if (clause && IsA(clause, ScalarArrayOpExpr))
>     {
> * When can the clause be NULL?
>
>
> @@ gen_implied_quals
>     item1 = canonicalize_ec_expression(item1,
>                                        exprType((Node *) item1),
>                                        collation);
>     item2 = canonicalize_ec_expression(item2,
>                                        exprType((Node *) item2),
>                                        collation);
> * Why do we do this? If the collation or type of the original expression
> isn't right and we have to add RelabelType, the resulting expression won't
> match the original clause and we won't be able to substitute it with other
> equavalence members. So we can just check that the type and collation match.
>
>
> * In gen_implied_quals, I'd rename item1 => left, item2 => right, em1 =>
> orig_em, em2 => other_em, and same for list cells and types. As it is now,
> em1 can actually match both item1 and item2, and em2 is not related to
> either of them, so it took me some effort to understand what's going on.
>
>
> * In gen_implied_qual, why do we search the clause for a matching
> subexpression? Reading the thread, I thought that we can only do the
> substitution for OpExprs of the same opfamilies as the generating EC. This
> code looks like it can do the substitution an at arbitrary depth, so we
> might change an argument of some unsuitable function, and the result will
> not be correct. What we should probably do is that after we matched one side
> of OpExpr to one EC member, we just replace it with another suitable member
> and add the resulting clause.
>
>
> @@ gen_implied_qual
>     check_mergejoinable(new_rinfo);
>     check_hashjoinable(new_rinfo);
> ...
>     /*
>      * If the clause has a mergejoinable operator, set the EquivalenceClass
>      * links. Otherwise, a mergejoinable operator with NULL left_ec/right_ec
>      * will cause update_mergeclause_eclasses fails at assertion.
>      */
>     if (new_rinfo->mergeopfamilies)
>         initialize_mergeclause_eclasses(root, new_rinfo);
> * It's not an equality clause, so it is not going to be mergejoinable nor
> hashjoinable and we can skip these checks.

As this review has not been addressed, I'm marking this patch as
returned with feedback for now. Please resubmit once these are
addressed!

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-02-03 10:06:45 Re: Online verification of checksums
Previous Message Andres Freund 2019-02-03 09:39:13 Re: NOTIFY and pg_notify performance when deduplicating notifications