Re: Expression push down from Join Node to below node.

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Shubhankar Anand Kulkarni <shubhankar(dot)ak(at)zohocorp(dot)com>
Cc: "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>, "shubhankarkul00" <shubhankarkul00(at)gmail(dot)com>
Subject: Re: Expression push down from Join Node to below node.
Date: 2025-05-30 01:23:13
Message-ID: 87y0ue7uym.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Shubhankar Anand Kulkarni <shubhankar(dot)ak(at)zohocorp(dot)com> writes:

Hi,

> As seen, the expression got pushed down to respective foreign scan, reducing the overall query time reduced
> significantly, but there is an increase in the memory footprint.

That's great on the query time improvement. When you say the memory
footprint or we have two scans because of the CTE (one is build the cte,
one is the cte scan), I admit it is not good, hence there is a better
result for your query.

> Also the thing which you mentioned,
>
> I guess the reason would be once we push the function down to the "foreign scan"
> node, we need to run these function *before any other filter*, which may
> increase the number of calls of the function. e.g.
>
> After looking into the code, it looks like we will first evaluate the qual, if it qualifies then only we will go for the
> projection. Meaning, even after expression push-down we will evaluate the expression on top filtered rows only. So I
> don't think it should have any concern in this case.

I am not sure which case you are talking about. In my previously answer,
I have a example and explaination for this idea, but you didn't metion
it when you were replying it.

"""
SELECT udf1(t1.a) FROM t1_1000row t1, t2_1row t2 where t2.fid = t1.id;

If we push down the udf1 to the timing of scaning t1, udf1 would be
called 1000 times, but without the push down (*based on t1_1000row has
1000 rows*), it is called 1 times in the above case (*based on afrer the
join, there are only 1 row returned *).

IIRC, PostgreSQL assumes after the join, the total rows will be less.
"""
>
> Also for this particular case, I tried playing around create_hashjoin_plan function in createplan.c, to push my
> hash-clause expression to below foreign-scan. Currently, if there is a single hash clause of form funcExpr op Var, I am
> appending the funcExpr to pathTarget of respective table and replacing the funcExpr with a var. Is it a right
> place/approach to check or should we do changes while parsing in deconstruct_jointree(), in distiribute_quals_to_rels we
> can update the appropriate baserestrictinfo.
> Pls share your thoughts on the same.

Hash Cond invovles two sides of the relations, how could we push down
it under one side of it? I can't follow up on this well, have you a draft
plan / code for your idea?

--
Best Regards
Andy Fan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2025-05-30 01:38:20 Re: [WIP]Vertical Clustered Index (columnar store extension) - take2
Previous Message Nikolay Samokhvalov 2025-05-30 01:19:01 Re: Proposal: Job Scheduler