join pushdown and issue with foreign update

From: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: join pushdown and issue with foreign update
Date: 2021-05-31 12:39:41
Message-ID: 7817fb9ebd6661cdf9b67dec6e129a78@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi.

There's issue with join pushdown after

commit 86dc90056dfdbd9d1b891718d2e5614e3e432f35
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Wed Mar 31 11:52:34 2021 -0400

Rework planning and execution of UPDATE and DELETE

To make sure that join pushdown path selected, one can patch
contrib/postgres_fdw/postgres_fdw.c in the following way:

diff --git a/contrib/postgres_fdw/postgres_fdw.c
b/contrib/postgres_fdw/postgres_fdw.c
index c48a421e88b..c2bf6833050 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -5959,6 +5959,8 @@ postgresGetForeignJoinPaths(PlannerInfo *root,
/* Estimate costs for bare join relation */
estimate_path_cost_size(root, joinrel, NIL, NIL, NULL,
&rows, &width,
&startup_cost, &total_cost);
+
+ startup_cost = total_cost = 0;
/* Now update this information in the joinrel */
joinrel->rows = rows;
joinrel->reltarget->width = width;

Now, this simple test shows the issue:

create extension postgres_fdw;

DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER
postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$')$$;
END;
$d$;

CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;

CREATE TABLE base_tbl (a int, b int);
CREATE FOREIGN TABLE remote_tbl (a int, b int)
SERVER loopback OPTIONS (table_name 'base_tbl');

insert into remote_tbl select generate_series(1,100),
generate_series(1,100);

explain verbose update remote_tbl d set a= case when current_timestamp>
'2012-02-02'::timestamp then 5 else 6 end FROM remote_tbl AS t (a, b)
WHERE d.a = (t.a);

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on public.remote_tbl d (cost=0.00..42.35 rows=0 width=0)
Remote SQL: UPDATE public.base_tbl SET a = $2 WHERE ctid = $1
-> Foreign Scan (cost=0.00..42.35 rows=8470 width=74)
Output: CASE WHEN (CURRENT_TIMESTAMP > '2012-02-02
00:00:00'::timestamp without time zone) THEN 5 ELSE 6 END, d.ctid, d.*,
t.*
Relations: (public.remote_tbl d) INNER JOIN (public.remote_tbl
t)
Remote SQL: SELECT r1.ctid, CASE WHEN (r1.*)::text IS NOT NULL
THEN ROW(r1.a, r1.b) END, CASE WHEN (r2.*)::text IS NOT NULL THEN
ROW(r2.a, r2.b) END FROM (public.base_tbl r1 INNER JOIN public.base_tbl
r2 ON (((r1.a = r2.a)))) FOR UPDATE OF r1
-> Merge Join (cost=433.03..566.29 rows=8470 width=70)
Output: d.ctid, d.*, t.*
Merge Cond: (d.a = t.a)
-> Sort (cost=211.00..214.10 rows=1241 width=42)
Output: d.ctid, d.*, d.a
Sort Key: d.a
-> Foreign Scan on public.remote_tbl d
(cost=100.00..147.23 rows=1241 width=42)
Output: d.ctid, d.*, d.a
Remote SQL: SELECT a, b, ctid FROM
public.base_tbl FOR UPDATE
-> Sort (cost=222.03..225.44 rows=1365 width=36)
Output: t.*, t.a
Sort Key: t.a
-> Foreign Scan on public.remote_tbl t
(cost=100.00..150.95 rows=1365 width=36)
Output: t.*, t.a
Remote SQL: SELECT a, b FROM public.base_tbl
update remote_tbl d set a= case when current_timestamp>
'2012-02-02'::timestamp then 5 else 6 end FROM remote_tbl AS t (a, b)
WHERE d.a = (t.a);

You'll get
ERROR: input of anonymous composite types is not implemented
CONTEXT: whole-row reference to foreign table "remote_tbl"

make_tuple_from_result_row() (called by fetch_more_data()), will try to
call InputFunctionCall() for ROW(r1.a, r1.b) and will get error in
record_in().

Here ROW(r2.a, r2.b) would have attribute type id, corresponding to
remote_tbl, but ROW(r1.a, r1.b) would have atttypid 2249 (RECORD).

Before 86dc90056dfdbd9d1b891718d2e5614e3e432f35 the plan would be
different and looked like

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Update on public.remote_tbl d (cost=0.00..73.54 rows=14708 width=46)
Remote SQL: UPDATE public.base_tbl SET a = $2 WHERE ctid = $1
-> Foreign Scan (cost=0.00..73.54 rows=14708 width=46)
Output: CASE WHEN (CURRENT_TIMESTAMP > '2012-02-02
00:00:00'::timestamp without time zone) THEN d.a ELSE 6 END, d.b,
d.ctid, t.*
Relations: (public.remote_tbl d) INNER JOIN (public.remote_tbl
t)
Remote SQL: SELECT r1.a, r1.b, r1.ctid, CASE WHEN (r2.*)::text
IS NOT NULL THEN ROW(r2.a, r2.b) END FROM (public.base_tbl r1 INNER JOIN
public.base_tbl r2 ON (((r1.a = r2.a)))) FOR UPDATE OF r1
-> Merge Join (cost=516.00..747.39 rows=14708 width=46)
Output: d.a, d.b, d.ctid, t.*
Merge Cond: (d.a = t.a)
-> Sort (cost=293.97..299.35 rows=2155 width=14)
Output: d.a, d.b, d.ctid
Sort Key: d.a
-> Foreign Scan on public.remote_tbl d
(cost=100.00..174.65 rows=2155 width=14)
Output: d.a, d.b, d.ctid
Remote SQL: SELECT a, b, ctid FROM
public.base_tbl FOR UPDATE
-> Sort (cost=222.03..225.44 rows=1365 width=36)
Output: t.*, t.a
Sort Key: t.a
-> Foreign Scan on public.remote_tbl t
(cost=100.00..150.95 rows=1365 width=36)
Output: t.*, t.a
Remote SQL: SELECT a, b FROM public.base_tbl

Here ROW(r2.a, r2.b) would have attribute type id, corresponding to
remote_tbl.

--
Best regards,
Alexander Pyhalov,
Postgres Professional

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2021-05-31 13:02:14 Re: Decoding speculative insert with toast leaks memory
Previous Message Michael Paquier 2021-05-31 12:37:17 Re: Incorrect snapshots while promoting hot standby node when 2PC is used