Re: NOT IN subquery optimization

From: David Steele <david(at)pgmasters(dot)net>
To: "Li, Zheng" <zhelli(at)amazon(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: NOT IN subquery optimization
Date: 2020-04-08 13:32:09
Message-ID: 494152fa-d959-cccd-bdc0-076c35f05c7f@pgmasters.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/26/20 4:58 PM, Li, Zheng wrote:
> >BTW, so far as I can see, the only reason you're bothering with the whole
> thing is to compare the size of the subquery output with work_mem, because
> that's all that subplan_is_hashable does. I wonder whether that
> consideration is even still necessary in the wake of 1f39bce02. If it is,
> I wonder whether there isn't a cheaper way to figure it out. (Note
> similar comment in make_subplan.)
>
> The comment in make_subplan says there is no cheaper way to figure out:
> /* At present, however, we can only check hashability after
> * we've made the subplan :-(. (Determining whether it'll fit in work_mem
> * is the really hard part.)
> */
>
> I don't see why commit 1f39bce02 is related to this problem. Can you expand on this?
>
> >But can't you detect that case directly? It seems like you'd need to
> figure out the NULL situation anyway to know whether the transformation
> to antijoin is valid in the first place.
>
> Yes, we do need to figure out the NULL situation, and there is always valid transformation
> to antijoin, it's just in the NULL case we need to stuff additional clause to the anti join
> condition, and in these cases the transformation actually outperforms Subplan (non-hashed),
> but underperforms the hashed Subplan. The unmodified anti hash join has similar performance
> compared to hashed Subplan.

There seem to be enough questions about this implementation that I think
it makes sense to mark this patch Returned with Feedback.

Feel free to resubmit it to a future CF when there is more of a
consensus on the implementation.

Regards,
--
-David
david(at)pgmasters(dot)net

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-04-08 13:43:35 Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Previous Message Jonathan S. Katz 2020-04-08 13:26:42 Re: Improving connection scalability: GetSnapshotData()