Re: Botched estimation in eqjoinsel_semi for cases without reliable ndistinct

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-bugs(at)postgresql(dot)org, casey(dot)shobe(at)messagesystems(dot)com
Subject: Re: Botched estimation in eqjoinsel_semi for cases without reliable ndistinct
Date: 2012-01-12 00:01:01
Message-ID: 29570.1326326461@sss.pgh.pa.us
Lists: pgsql-bugs

Andres Freund <andres(at)anarazel(dot)de> writes:
> What's your opinion on this?

Looks pretty bogus to me. You're essentially assuming that the side of
the join without statistics is unique, which is a mighty dubious
assumption. (In cases where we *know* it's unique, something like this
could be reasonable, but I believe get_variable_numdistinct already
accounts for such cases.) The reason for the reversion to pre-8.4
behavior was that with the other behavior, we might sometimes make
extremely optimistic estimates (i.e., conclude that the join result is
very small) on the basis of, really, nothing at all. AFAICS this
proposal just reintroduces unwarranted assumptions, and therefore will
probably produce as many worse results as better ones.
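
To make the concern concrete, here is a toy model of a semi-join
selectivity estimate. It is only a sketch for illustration, not the
code in eqjoinsel_semi: the function name, its parameters, the
nd_inner/nd_outer ratio, and the 0.5 fallback are all assumptions
chosen to show how treating the statistics-less side as unique can
collapse the estimate.

```python
def semi_join_selectivity(nd_outer, nd_inner, nullfrac_outer=0.0,
                          stats_reliable=True):
    """Toy estimate: fraction of outer rows with at least one inner match.

    Hypothetical helper for illustration; not PostgreSQL's implementation.
    """
    nonnull = 1.0 - nullfrac_outer
    if not stats_reliable:
        # No trustworthy ndistinct on one side: use a fixed neutral guess
        # (the 0.5 here is an assumed placeholder value).
        return 0.5 * nonnull
    if nd_outer <= nd_inner:
        # Assume every distinct outer value has a match on the inner side.
        return nonnull
    # At most nd_inner of the nd_outer distinct outer values can match.
    return (nd_inner / nd_outer) * nonnull


# Outer side: 1,000,000 rows and no usable statistics.
# Inner side: 100 distinct values.
outer_rows = 1_000_000
nd_inner = 100

# Neutral fallback: ndistinct on the outer side is treated as unknown.
neutral = semi_join_selectivity(0, nd_inner, stats_reliable=False)

# "Assume unique": pretend the outer side has one distinct value per row.
assumed_unique = semi_join_selectivity(outer_rows, nd_inner)

print("neutral fallback rows:", neutral * outer_rows)
print("assumed-unique rows:  ", assumed_unique * outer_rows)
```

Under these made-up numbers the assumed-unique variant estimates a
result several thousand times smaller than the neutral fallback, and
nothing in the available statistics justifies that difference.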

Also, why the asymmetry in null handling? And why did you only touch
one of the two code paths in eqjoinsel_semi? They have both got this
issue of how to estimate with inadequate stats.

regards, tom lane
