Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: ashutosh(dot)bapat(at)enterprisedb(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, alvherre(at)2ndquadrant(dot)com, robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Date: 2018-09-14 13:01:39
Message-ID: 5B9BB133.1060107@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2018/08/24 16:58), Kyotaro HORIGUCHI wrote:
> At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in<20180821(dot)110132(dot)261184472(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
>>> You wrote:
>>>> Several places seems to be assuming that fdw_scan_tlist may be
>>>> used foreign scan on simple relation but I didn't find that
>>>> actually happens.
>>>
>>> Yeah, currently, postgres_fdw and file_fdw don't use that list for
>>> simple foreign table scans, but it could be used to improve the
>>> efficiency for those scans, as explained in fdwhandler.sgml:
> ...
>> I'll put more consideration on using fdw_scan_tlist in the
>> documented way.
>
> Done. postgres_fdw now generates full fdw_scan_tlist (as
> documented) for foreign relations with junk columns having a
> small change in core side. However it is far less invasive than
> the previous version and I believe that it dones't harm
> maybe-existing use of fdw_scan_tlist on non-join rels (that is,
> in the case of a subset of relation columns).

Yeah, changes to the core by the new version is really small, which is
great, but I'm not sure it's a good idea to modify the catalog info on
the target table on the fly:

@@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid
relationObjectId,\
bool inhparent,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot access temporary or unlogged relations
during r\
ecovery")));

+ max_attrnum = RelationGetNumberOfAttributes(relation);
+
+ /* Foreign table may have exanded this relation with junk columns */
+ if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE)
+ {
+ AttrNumber maxattno = max_varattno(root->parse->targetList, varno);
+ if (max_attrnum < maxattno)
+ max_attrnum = maxattno;
+ }
+
rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1;
- rel->max_attr = RelationGetNumberOfAttributes(relation);
+ rel->max_attr = max_attrnum;
rel->reltablespace = RelationGetForm(relation)->reltablespace;

This breaks the fundamental assumption that rel->max_attr is equal to
RelationGetNumberOfAttributes of that table. My concern is: this change
would probably be a wart, so it would be bug-prone in future versions.

Another thing on the new version:

@@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root,
RelOptInfo *rel)
relation = heap_open(rte->relid, NoLock);

numattrs = RelationGetNumberOfAttributes(relation);
+
+ /*
+ * Foreign tables may have expanded with some junk columns. Punt
+ * in the case.
+ */
+ if (numattrs < rel->max_attr)
+ {
+ Assert(root->simple_rte_array[rel->relid]->relkind ==
+ RELKIND_FOREIGN_TABLE);
+ heap_close(relation, NoLock);
+ break;
+ }

I think this would disable the optimization on projection in foreign
scans, causing performance regression.

> One arguable behavior change is about wholrow vars. Currently it
> refferes local tuple with all columns but it is explicitly
> fetched as ROW() after this patch applied. This could be fixed
> but not just now.
>
> Part of 0004-:
> - Output: f1, ''::text, ctid, rem1.*
> - Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE
> + Output: f1, ''::text, tableoid, ctid, rem1.*
> + Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE

That would be also performance regression. If we go in this direction,
that should be fixed.

> Since this uses fdw_scan_tlist so it is theoretically
> back-patchable back to 9.6.

IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join
pushdown infrastructure, so I think your patch can be back-patched to
PG9.5, but I don't think that's enough; IIRC, this issue was introduced
in PG9.3, so a solution for this should be back-patch-able to PG9.3, I
think.

> Please find the attached three files.

Thanks for the patches!

> 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch
>
> This should fail for unpatched postgres_fdw. (Just for demonstration)

+CREATE TABLE p1 (a int, b int);
+CREATE TABLE c1 (LIKE p1) INHERITS (p1);
+CREATE TABLE c2 (LIKE p1) INHERITS (p1);
+CREATE FOREIGN TABLE fp1 (a int, b int)
+ SERVER loopback OPTIONS (table_name 'p1');
+INSERT INTO c1 VALUES (0, 1);
+INSERT INTO c2 VALUES (1, 1);
+SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS
toiddiff, ctid, * FROM fp1;

Does it make sense to evaluate toiddiff? I think that should always be 0.

> 0003-Fix-of-foreign-update-bug-of-PgFDW.patch
>
> Fix of postgres_fdw for this problem.

Sorry, I have not looked at it closely yet, but before that I'd like to
discuss the direction we go in. I'm not convinced that your approach is
the right direction, so as promised, I wrote a patch using the
Param-based approach, and compared the two approaches. Attached is a
WIP patch for that, which includes the 0003 patch. I don't think there
would be any warts as discussed above in the Param-based approach for
now. (That approach modifies the planner so that the targetrel's tlist
would contain Params as well as Vars/PHVs, so actually, it breaks the
planner assumption that a rel's tlist would only include Vars/PHVs, but
I don't find any issues on that at least for now. Will look into that
in more detail.) And I don't think there would be any concern about
performance regression, either. Maybe I'm missing something, though.

What do you think about that?

Note about the attached: I tried to invent a utility for
generate_new_param like SS_make_initplan_output_param as mentioned in
[1], but since the FDW API doesn't pass PlannerInfo to the FDW, I think
the FDW can't call the utility the same way. Instead, I modified the
planner so that 1) the FDW adds Params without setting PARAM_EXEC Param
IDs using a new function, and then 2) the core fixes the IDs.

Sorry for the delay.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/3919.1527775582%40sss.pgh.pa.us

Attachment Content-Type Size
fix-foreign-modify-efujita-WIP.patch text/x-diff 75.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arthur Zakirov 2018-09-14 13:12:34 Re: [PATCH] Tab completion for ALTER DATABASE … SET TABLESPACE
Previous Message James Keener 2018-09-14 12:50:34 Re: Code of Conduct plan