Re: BUG #5084: Query gives different number of rows depending on ORDER BY

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: heikki(dot)linnakangas(at)enterprisedb(dot)com (Heikki Linnakangas), Bernt Marius Johnsen <bernt(dot)johnsen(at)sun(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5084: Query gives different number of rows depending on ORDER BY
Date: 2009-09-28 20:02:57
Message-ID: 13692.1254168177@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
>> Notice that the (k = k) qual is being dropped somewhere, which changes
>> the output since that's a disguised not-null condition.

> Huh ... I'm surprised it only does that with the ORDER BY present.
> I suppose it's got something to do with the equivalence-class machinery.

After digging into it, I find that:

1. Without ORDER BY, process_equivalence generates an equivalence class
that lists k twice. This is pretty bogus but it happens to produce the
desired results in the example at hand. (In some other cases you'll get
redundant clauses out, because the eclass machinery isn't expecting
this.)

2. With ORDER BY k, the code first creates a single-element equivalence
class containing k, because it needs that to represent the desired
pathkey. Then, process_equivalence finds that both sides of the k = k
clause are already known to be in the same eclass, so it concludes that
this is redundant information.

I'm inclined to think that the best solution is to have
process_equivalence just reject any clauses that have equal() left and
right sides, ie, throw them back to be processed as ordinary
non-equivalence clauses. The only case I can think of where this might
be less than ideal is if you have "k = k AND k = x"; if both operators
are strict then the k = k test is indeed redundant and could be
discarded, but it won't be. But it doesn't seem like that's going to
come up enough to be worth stressing about. If we wanted to be smart
about it we'd have to have two kinds of single-element equivalence
classes (one that implies a k = k check is needed, and one that does
not). It doesn't seem worth the complication.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tim Bunce 2009-09-28 20:23:01 Re: BUG #5066: plperl issues with perl_destruct() and END blocks
Previous Message Tom Lane 2009-09-28 19:27:20 Re: BUG #5084: Query gives different number of rows depending on ORDER BY