Quick Links

Hashjoin startup strategy (was Re: Getting different number of results when using hashjoin on/off)

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Mario Weilguni" <mario(dot)weilguni(at)icomedias(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Hashjoin startup strategy (was Re: Getting different number of results when using hashjoin on/off)
Date:	2005-11-28 18:39:52
Message-ID:	13031.1133203192@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

"Mario Weilguni" <mario(dot)weilguni(at)icomedias(dot)com> writes:
> Thanks for the quick response, I've tried the patch, but it did not work
> as expected. When I set enable_hashjoin to off, everything works as
> expected, but with hashjoin on I do not even get results anymore, CPU is
> going up to 100% and after 3 minutes I cancelled the query (it normale
> would take ~100-500 milliseconds).

Try letting it run longer. I think your expectation is tuned for the
broken implementation (which runs the subqueries only once instead of
26k times...)

The test case I developed for this failure in the regression database is

select count(*) from tenk1 a
where exists (select 1 from tenk1 b, tenk1 c
where b.unique1=c.unique2 and
b.hundred in (4,5) and c.hundred=a.hundred+99);

8.0 prefers a nestloop for the subquery, and that plan runs in about
600 ms on my machine. If forced to a hash join, it takes about 2450 ms.
8.1 prefers the hash join to start with, but takes 11300 ms to run it :-(
(after the patch that is).

The reason for the differential is that 8.1 guesses wrong about which
subplan to cycle first: most of the time, the inner plan is empty and
so there's no need to pull any rows from the outer plan, but 8.1 pulls
the first row from the outer plan anyway, and doing that 10000 times is
what's eating the extra runtime. It looks from your previous message
that similar things are happening with your data distribution, allowing
8.0 to run faster for you than 8.1 does.

Not sure if there's much we can do about this. The presence of the
upper-query parameter in the subplan makes it difficult to derive any
stats at all, let alone guess how often the subplan will be completely
empty, so I'm not sure the planner can help.

For a query like this, where the hash join is being done repeatedly,
it might be useful for the executor itself to track how often each
subplan has been seen to be empty. In particular, the executor knows
that the outer subplan is parameterless and therefore should deliver
the same results each time (modulo volatile functions of course), so
after the first cycle it could know that there's no point in trying
the early fetch on that side. Dunno if this will be of wide enough
use to be worth implementing though --- in simple cases the join
won't be rescanned and so the executor can't help.

Anyone have any other ideas?

regards, tom lane

In response to

Re: Getting different number of results when using hashjoin on/off at 2005-11-28 17:41:09 from Mario Weilguni

Responses

Re: Hashjoin startup strategy (was Re: Getting different number of results when using hashjoin on/off) at 2005-11-28 23:49:40 from Tom Lane
Re: Hashjoin startup strategy (was Re: Getting different number of results when using hashjoin on/off) at 2005-11-29 00:25:58 from Greg Stark

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2005-11-28 18:46:53	Re: Help: 8.0.3 Vacuum of an empty table never completes ...
Previous Message	James Robinson	2005-11-28 17:47:20	Re: Help: 8.0.3 Vacuum of an empty table never completes ...