From: | Bradley Baetz <bbaetz(at)acm(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: New hashed IN code ignores distinctiveness of subquery |
Date: | 2003-01-27 03:22:57 |
Message-ID: | 20030127032257.GA3391@mango.home |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs pgsql-hackers |
On Sun, Jan 26, 2003 at 09:43:18PM -0500, Tom Lane wrote:
> We're already checking that as a separate plan alternative. The
> implementation could be improved a little, though --- we could combine
> the uniq-ification into the Hash node.
Right, or skip it entirely when selecting stuff with unique constraints.
> AFAICS there are two or three different concepts of selectivity being
> tossed about here.
<snip>
Ah, OK. Yeah, I think I was confusing myself there.
>
> It might be that we have to bite the bullet and set up a different
> selectivity estimation procedure for IN. I'd prefer to avoid that
> but haven't really figured out if it's necessary or not.
I don't think it is. The number of rows is correct if you do product_id
IN (1) vs product_id IN (1,2) vs product_id IN (1,2,3) and so on. The
only difference for a subselect is that the number of values generated
is only known approximately, rather than exactly, but I don't think that
thats relevent.
> In the meantime, I think you are right that it's bogus that
> set_joinrel_size_estimates uses different equations for JOIN_IN and
> JOIN_UNIQUE_INNER. Whatever the decision is about how to do the
> estimation, these should be done the same way.
Hmm. So, I made that change, and now get an estimate of 50218 rows for
the join result, which is about right (although capping the result to
have a maximum of the number of input rows is valid for an inner join),
but the cost for the hashjoin is still 4281586.50. cost_hashjoin
probably needs to be taught about the short circuiting done for _IN, but
I'm not sure where to do that from a quick glance through the code.
What is the point of JOIN_UNIQUE_{INNER,OUTER}, though? What does it do
that JOIN_IN doesn't? executor/* doesn't appear to use it.
> regards, tom lane
Bradley
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2003-01-27 04:18:31 | Re: New hashed IN code ignores distinctiveness of subquery |
Previous Message | Tom Lane | 2003-01-27 02:43:18 | Re: New hashed IN code ignores distinctiveness of subquery |
From | Date | Subject | |
---|---|---|---|
Next Message | Tatsuo Ishii | 2003-01-27 04:06:03 | Re: Sorting Chinese data in Postgresql 7.3.1 |
Previous Message | Tom Lane | 2003-01-27 02:43:18 | Re: New hashed IN code ignores distinctiveness of subquery |