Re: ERROR: left and right pathkeys do not match in mergejoin

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: ERROR: left and right pathkeys do not match in mergejoin
Date: 2018-02-22 18:03:13
Message-ID: 20796.1519322593@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru> writes:
> explain select * from j1_tbl full join (select * from j2_tbl order by
> j2_tbl.i desc, j2_tbl.k) j2_tbl on j1_tbl.i = j2_tbl.i and j1_tbl.i =
> j2_tbl.k;
> ERROR: left and right pathkeys do not match in mergejoin

Nice example. There are several places that we could consider trying
to fix this:

The first possibility is to teach find_mergeclauses_for_pathkeys that
it should not select both of the join clauses in a case like this,
perhaps based on noting that the associated pathkeys have the same
eclass but different sort orders. However, that seems like a loser
for the reason specified in the comment in that function: we need to
select a maximal list of mergeclauses, not a minimal list, because
if it's a full join and we aren't able to include all the clauses as
mergeclauses then we won't have a valid plan. (At the time this code
was written, that could have resulted in failure to produce a plan
at all. Now we could fall back to a hash full join ... but that might
be inefficient, or we might not have hashable operators.)

The next possibility is to fix it in make_inner_pathkeys_for_merge,
by having it not suppress lower-order pathkeys unless they match earlier
ones in all details (not just eclass). In an example like this, that
means emitting a redundant inner pathkey list, equivalent to "ORDER BY
j1_tbl.i DESC, j1_tbl.i ASC". That's kind of annoying. It seems to
work in a simple test, but it implies doing useless comparisons during
the sort (since we'd only reach comparing the second sort column if
the first sort column compares equal, whereupon the second must too).
And it means representing the inner sort order by a noncanonical
pathkey list, which is not nice. For example, even if we have an inner
path available that delivers rows ordered by "j1_tbl.i DESC", we'd
still think we have to physically sort it to add the extra sort key.
(And no, I don't want to change pathkey comparison rules to avoid that.)

The third possibility is to decide that create_mergejoin_plan is being
overly paranoid and it's okay to extract merge details from a "redundant"
path key even though it specifies the opposite sort order from what the
current merge clause seems to need. This is scary at first glance, but
it seems like it should work. We'd essentially be lying to the executor
about the sort ordering of the lower-order merge column, and trusting that
it won't take any wrong actions as a result because it should never reach
comparison of that lower-order column except when the higher column(s)
are equal. I can't see a reason for that to go wrong; it's basically a
restatement of the argument why the lower-order sort key can be considered
redundant in the first place. But it doesn't leave a warm feeling in the
pit of the stomach, especially for a bug fix that needs to be back-patched
a long way.

On balance, though, this last choice seems like the thing to do. There
are clear downsides to the first two, in terms of failing to construct a
plan or constructing a needlessly inefficient plan.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-02-22 18:24:21 Re: Allow workers to override datallowconn
Previous Message Magnus Hagander 2018-02-22 18:01:35 Allow workers to override datallowconn