Re: NOT IN subquery optimization

From: "Li, Zheng" <zhelli(at)amazon(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: NOT IN subquery optimization
Date: 2020-03-26 20:58:24
Message-ID: 27025942-49F1-4349-80BE-405E81FC9C8C@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>BTW, so far as I can see, the only reason you're bothering with the whole
thing is to compare the size of the subquery output with work_mem, because
that's all that subplan_is_hashable does. I wonder whether that
consideration is even still necessary in the wake of 1f39bce02. If it is,
I wonder whether there isn't a cheaper way to figure it out. (Note
similar comment in make_subplan.)

The comment in make_subplan says there is no cheaper way to figure out:
/* At present, however, we can only check hashability after
* we've made the subplan :-(. (Determining whether it'll fit in work_mem
* is the really hard part.)
*/

I don't see why commit 1f39bce02 is related to this problem. Can you expand on this?

>But can't you detect that case directly? It seems like you'd need to
figure out the NULL situation anyway to know whether the transformation
to antijoin is valid in the first place.

Yes, we do need to figure out the NULL situation, and there is always valid transformation
to antijoin, it's just in the NULL case we need to stuff additional clause to the anti join
condition, and in these cases the transformation actually outperforms Subplan (non-hashed),
but underperforms the hashed Subplan. The unmodified anti hash join has similar performance
compared to hashed Subplan.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-03-26 20:59:49 Re: plan cache overhead on plpgsql expression
Previous Message Stephen Frost 2020-03-26 20:44:14 Re: backup manifests