|From:||Andres Freund <andres(at)anarazel(dot)de>|
|To:||Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>|
|Cc:||Richard Guo <riguo(at)pivotal(dot)io>, pgsql-hackers(at)postgresql(dot)org|
|Subject:||Re: Implement predicate propagation for non-equivalence clauses|
|Views:||Raw Message | Whole Thread | Download mbox|
On 2018-11-22 19:15:42 +0300, Alexander Kuzmenkov wrote:
> Hi Richard,
> I took a look at the v2, here are some comments:
> * This feature needs tests, especially for the cases where opfamilies or
> data types or collations don't match, and other non-obvious cases where it
> shouldn't work.
> * Deducing an inequality to a constant is not always helpful. If we know
> that a = b and a = const1 and b < const2, we will deduce b = const1 from the
> EC, and adding b < const2 doesn't improve selectivity and only makes the
> cost estimate worse. One situation where it does make sense is when we can
> detect contradictions in these clauses, e.g. if we know that const1 > const2
> and therefore know that the above selection clause is always false. Looking
> at the regression test changes, I see that v2 doesn't do that. I think the
> handling of deduced inequalities shoud be modeled on the flow of
> generate_base_implied_equalities -> process_implied_equality ->
> distribute_qual_to_rels. This would allow us to correctly handle a deduced
> const1 < const2 clause and turn it into a gating Result qual.
> * There are functions named generate_implied_quals, gen_implied_quals and
> gen_implied_qual. The names are confusingly similar, we could use something
> like generate_implied_quals_for_clause and generate_one_implied_qual for the
> latter two.
> @@ gen_implied_quals
> else if (clause && IsA(clause, ScalarArrayOpExpr))
> * When can the clause be NULL?
> @@ gen_implied_quals
> item1 = canonicalize_ec_expression(item1,
> exprType((Node *) item1),
> item2 = canonicalize_ec_expression(item2,
> exprType((Node *) item2),
> * Why do we do this? If the collation or type of the original expression
> isn't right and we have to add RelabelType, the resulting expression won't
> match the original clause and we won't be able to substitute it with other
> equavalence members. So we can just check that the type and collation match.
> * In gen_implied_quals, I'd rename item1 => left, item2 => right, em1 =>
> orig_em, em2 => other_em, and same for list cells and types. As it is now,
> em1 can actually match both item1 and item2, and em2 is not related to
> either of them, so it took me some effort to understand what's going on.
> * In gen_implied_qual, why do we search the clause for a matching
> subexpression? Reading the thread, I thought that we can only do the
> substitution for OpExprs of the same opfamilies as the generating EC. This
> code looks like it can do the substitution an at arbitrary depth, so we
> might change an argument of some unsuitable function, and the result will
> not be correct. What we should probably do is that after we matched one side
> of OpExpr to one EC member, we just replace it with another suitable member
> and add the resulting clause.
> @@ gen_implied_qual
> * If the clause has a mergejoinable operator, set the EquivalenceClass
> * links. Otherwise, a mergejoinable operator with NULL left_ec/right_ec
> * will cause update_mergeclause_eclasses fails at assertion.
> if (new_rinfo->mergeopfamilies)
> initialize_mergeclause_eclasses(root, new_rinfo);
> * It's not an equality clause, so it is not going to be mergejoinable nor
> hashjoinable and we can skip these checks.
As this review has not been addressed, I'm marking this patch as
returned with feedback for now. Please resubmit once these are
|Next Message||Andres Freund||2019-02-03 10:06:45||Re: Online verification of checksums|
|Previous Message||Andres Freund||2019-02-03 09:39:13||Re: NOTIFY and pg_notify performance when deduplicating notifications|