Skip site navigation (1) Skip section navigation (2)

Re: Problem with CVS HEAD's handling of mergejoins

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Problem with CVS HEAD's handling of mergejoins
Date: 2008-01-09 19:13:22
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-hackers
I wrote:
> A perhaps less invasive idea is to discard any proposed mergeclauses
> that are redundant in this sense.  This would still require some
> reshuffling of responsibility between select_mergejoin_clauses and
> the code in pathkeys.c, since right now select_mergejoin_clauses
> takes no account of that.  However, I'm worried that that might result
> in planner failure on some FULL JOIN cases that work today, since we
> require all the join clauses to be mergejoinable for a FULL JOIN.
> People seem to complain when the planner fails, even for really
> stupid queries ;-).  I think this would only be acceptable if we can
> prove that ignoring clauses that are redundant in this sense doesn't
> change the result --- which might be the case, but I'm not sure.

On further study, this doesn't seem nearly as bad as it looked.  I had
hastily misread some of the existing code as assuming one-for-one
correspondence of mergeclauses and pathkeys, but actually it just
assumes that the pathkeys list contains at least one pathkey matching
each mergeclause.  Redundancy between two pathkeys in a list is
eliminated by allowing adjacent mergeclauses to share the same pathkey.
So that part actually all works, and the only problem is when an
equivalence class is redundant-for-sorting in itself --- that is,
when one of the eclass members is a constant.  So that explains why
we'd not seen it before; except in very odd corner cases, the old code
would have eliminated the mergejoinable clause anyway.

If we simply make select_mergejoin_clauses() reject a clause as
not-mergejoinable if either side is equated to a constant, then
the problems go away.  We'd have a new problem if this meant that
we fail to do a FULL JOIN, but AFAICS that can't happen: a variable
coming from a full join can't be equivalenced to a constant, except
perhaps in a below_outer_join eclass, which isn't considered redundant
for sorting anyway.

Bottom line is that it just takes another half dozen lines of code
to fix this; attached is the core of the fix (there are some
uninteresting prototype changes and macro-moving as well).

> I think I can fix this in a day or so, but I now definitely feel that
> we'll need an RC2 :-(

I no longer think that about this problem, but we've still got some
nasty-looking issues in xml.c, plus Hannes Dorbath's unsolved reports
of GIST/GIN problems ...

			regards, tom lane

+ 		/*
+ 		 * Insist that each side have a non-redundant eclass.  This
+ 		 * restriction is needed because various bits of the planner expect
+ 		 * that each clause in a merge be associatable with some pathkey in a
+ 		 * canonical pathkey list, but redundant eclasses can't appear in
+ 		 * canonical sort orderings.  (XXX it might be worth relaxing this,
+ 		 * but not enough time to address it for 8.3.)
+ 		 *
+ 		 * Note: it would be bad if this condition failed for an otherwise
+ 		 * mergejoinable FULL JOIN clause, since that would result in
+ 		 * undesirable planner failure.  I believe that is not possible
+ 		 * however; a variable involved in a full join could only appear
+ 		 * in below_outer_join eclasses, which aren't considered redundant.
+ 		 *
+ 		 * This *can* happen for left/right join clauses, however: the
+ 		 * outer-side variable could be equated to a constant.  (Because we
+ 		 * will propagate that constant across the join clause, the loss of
+ 		 * ability to do a mergejoin is not really all that big a deal, and
+ 		 * so it's not clear that improving this is important.)
+ 		 */
+ 		cache_mergeclause_eclasses(root, restrictinfo);
+ 		if (EC_MUST_BE_REDUNDANT(restrictinfo->left_ec) ||
+ 			EC_MUST_BE_REDUNDANT(restrictinfo->right_ec))
+ 		{
+ 			have_nonmergeable_joinclause = true;
+ 			continue;			/* can't handle redundant eclasses */
+ 		}
  		result_list = lappend(result_list, restrictinfo);

In response to

pgsql-hackers by date

Next:From: Tim YardleyDate: 2008-01-09 19:20:18
Subject: tzdata issue on cross-compiled postgresql
Previous:From: Gregory StarkDate: 2008-01-09 19:05:07
Subject: Re: OUTER JOIN performance regression remains in 8.3beta4

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group