Re: Function scan FDW pushdown

From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: solaimurugan vellaipandiyan <drsolaimurugan(dot)v(at)gmail(dot)com>
Cc: Álvaro Herrera <alvherre(at)kurilemu(dot)de>, g(dot)kashkin(at)postgrespro(dot)ru, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Function scan FDW pushdown
Date: 2026-05-08 06:03:32
Message-ID: b91abe323f12f4210a53feb1ead3ac0d@postgrespro.ru
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

solaimurugan vellaipandiyan писал(а) 2026-05-08 08:17:
> Hi,
>
> I tested the v3 patch on current master using a postgres_fdw loopback
> setup.
>
> The patch mostly applied cleanly for me, with only a small manual
> conflict resolution needed in postgres_fdw.c (missing
> optimizer/clauses.h include). PostgreSQL built and started
> successfully after that.
>
> For testing, I used queries involving generate_series() together with
> a foreign table, for example:
>
> SELECT *
> FROM ft
> JOIN generate_series(1,3) g
> ON ft.id = g;
>
> I also tried:
> 1. implicit join syntax
> 2. LATERAL usage
> 3. disabling hashjoin/mergejoin to force different join paths
>
> In all cases, the plans still showed a local Function Scan and local
> join execution. The Remote SQL remained:
> SELECT id FROM public.localtab
> So I wasn't able to observe function scan pushdown with these
> testcases.
> Maybe I'm missing a query shape or planner condition required to
> trigger the new functionality. It would help to know which cases are
> currently expected to be pushed down.
>
> Regards,
> Solaimurugan V

Hi. Of course, this depends on foreign scan cost. Foreign join should be
cheaper than local one.
The cost of foreign join consists from
1) cost of evaluating inner and outer part of the join,
2) total costs of retreiving rows (nrows * cost of qual evaluation),
3) cost of preparing to evaluate quals (remote and local),
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.
Look at estimate_path_cost_size().

The cost of retreiving foreign rows consists from
1) cost of estimating baserestrictinfos
2) cost of remote sequential scan
3) cost of evaluating foreign relation target list
4) fdw_startup_cost
5) (fdw_tuple_cost + cpu_tuple_cost) * retrieved_rows.

If we compare function scan pushdown for foreign table and function with
foreign join pushdown of two foreign tables, we don't have two
penalties from fdw_startup_cost here. So, for function scan pushdown to
be chosen, it should filter out significant part of data, as data
transfer cost likely dominates in foreign scan costs estimation. You can
see this on the following toy example.

CREATE EXTENSION postgres_fdw;

SELECT current_database() AS current_database,
current_setting('port') AS current_port \gset
CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname :'current_database', port :'current_port');
CREATE USER MAPPING FOR current_user SERVER loopback;
CREATE table t1 (i int, j text);
create foreign table ft1 (i int, j text) server loopback options
(table_name 't1');

-- No adequate data about relation, we assume 1000+ rows in table, so
foreign join is chosen.

# explain select * from ft1 , generate_series(1,10) g where i=g;
QUERY PLAN
------------------------------------------------------
Foreign Scan (cost=100.00..172.16 rows=68 width=40)
Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)

-- We have actual data, there's no use to push down foreign join
postgres=# analyze ft1;
ANALYZE
postgres=# explain select * from ft1 , generate_series(1,10) g where
i=g;
QUERY PLAN
-----------------------------------------------------------------------------
Hash Join (cost=100.02..100.16 rows=1 width=40)
Hash Cond: (g.g = ft1.i)
-> Function Scan on generate_series g (cost=0.00..0.10 rows=10
width=4)
-> Hash (cost=100.00..100.00 rows=1 width=36)
-> Foreign Scan on ft1 (cost=100.00..100.00 rows=1 width=36)

-- If we had two foreign scans below join here, its startup cost would
be more than 2*100.
-- But evaluating function locally saves us from such great penalty.
-- Let's add some actual data.

postgres=# insert into ft1 select i, 'test'||i from
generate_series(1,1000) i;
INSERT 0 1000
postgres=# analyze ft1;
ANALYZE

-- Now we have actual data, transfering 1000 rows (plain foreign scan
costs ~ 410) and joining locally is more expensive than doing
-- remote join and transfering 10 rows.

postgres=# explain select * from ft1 , generate_series(1,10) g where
i=g;
QUERY PLAN
------------------------------------------------------
Foreign Scan (cost=100.00..143.20 rows=10 width=15)
Relations: (ft1) INNER JOIN (generate_series() g)
(2 rows)

--
Best regards,
Alexander Pyhalov,
Postgres Professional

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2026-05-08 06:07:13 Support for 8-byte TOAST values, round two
Previous Message vignesh C 2026-05-08 05:53:50 Re: Include schema-qualified names in publication error messages.