Re: PoC: full merge join on comparison clause

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC: full merge join on comparison clause
Date: 2017-09-19 13:18:04
Message-ID: CAFjFpRc0FGBqnB+DEB-6K08DA4KuEUFK1tKM8MNURf3z0qkf-Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi Alexander,

On Fri, Aug 25, 2017 at 10:11 PM, Alexander Kuzmenkov
<a(dot)kuzmenkov(at)postgrespro(dot)ru> wrote:
> Here is a new version of the patch, rebased to 749c7c41 and with some
> cosmetic changes.
>

I looked at this patch briefly. This is a useful feature. This isn't a
design level review of the patch. I may get back to that later. But
here are some assorted comments

The patch applies cleanly, but there are some whitespace errors.
src/backend/executor/nodeMergejoin.c:231: trailing whitespace.
+ /*
src/backend/executor/nodeMergejoin.c:232: trailing whitespace.
+ * Determine whether we accept lesser and/or equal tuples
src/backend/optimizer/path/joinpath.c:499: trailing whitespace.
+ * a merge join. A merge join executor can only choose inner values that are
src/backend/optimizer/path/joinpath.c:501: trailing whitespace.
+ * have at most one non-equality clause.

The implementation may change, so fixing the white space errors may
not be priority now. The patch compiles cleanly.

You have renamed RestrictInfo member mergeopfamilies as
equivopfamilies. I don't think that's a good name; it doesn't convey
that these are opfamilies containing merge operators. The changes in
check_mergejoinable() suggest that an operator may act as equality
operator in few operator families and comparison operator in others.
That looks odd. Actually an operator family contains operators other
than equality operators, so you may want to retain this member and add
a new member to specify whether the clause is an equality clause or
not.

In mergejoinscansel() you have just removed Assert(op_strategy ==
BTEqualStrategyNumber); Probably this function is written considering
on equality operators. But now that we are using this for all other
operators, we will need more changes to this function. That may be the
reason why INNER join in your earlier example doesn't choose right
costing.

The comment change in final_cost_mergejoin() needs more work. n1, n2,
n3 are number of rows on inner side with values 1, 2, 3 resp. So n1 +
n2 + n3 + ... = size of inner relation is correct. In that context I
am not able to understand your change
+ * If the merge clauses contain inequality, (n1 + n2 + ...) ~=
+ * (size of inner relation)^2.

Some stylistic comments
+ switch (join_op_strategy)
+ {
+ case BTEqualStrategyNumber:
+ parent->mj_UseEqual[iClause] = true;
+ break;
+ case BTLessEqualStrategyNumber:
+ parent->mj_UseEqual[iClause] = true;
+ /* fall through */
+ case BTLessStrategyNumber:
+ parent->mj_UseLesser[iClause] = true;
+ break;
+ case BTGreaterEqualStrategyNumber:
+ parent->mj_UseEqual[iClause] = true;
+ /* fall through */
+ case BTGreaterStrategyNumber:
+ parent->mj_UseLesser[iClause] = true;
+ break;
+ default:
+ Assert(false);

Add blank lines between different cases and you may want to replace
Assert in default case into an elog(). See for example exprType() or
get_jointype_name().

+ if (sort_result < 0)
+ {
+ result = MJCR_NextOuter;
+ }

We usually do not add {} around a single statement block.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2017-09-19 13:21:56 Re: src/test/subscription/t/002_types.pl hanging on particular environment
Previous Message Amit Kapila 2017-09-19 13:08:33 Re: src/test/subscription/t/002_types.pl hanging on particular environment