Re: TRUNCATE on foreign table

From: Kohei KaiGai <kaigai(at)heterodb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Kazutaka Onishi <onishi(at)heterodb(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: TRUNCATE on foreign table
Date: 2021-04-08 04:43:25
Message-ID: CAOP8fzajfK4vUe0s0YAOehjaFxvcecdvv2JdzeR6mZ-V3fLVbw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2021年4月8日(木) 11:44 Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>:
>
> On 2021/04/08 10:56, Kohei KaiGai wrote:
> > 2021年4月8日(木) 4:19 Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>:
> >>
> >> On 2021/04/06 21:06, Kazutaka Onishi wrote:
> >>> Thank you for checking v13, and here is v14 patch.
> >>>
> >>>> 1) Are we using all of these macros? I see that we are setting them
> >>>> but we only use TRUNCATE_REL_CONTEXT_ONLY. If not used, can we remove
> >>>> them?
> >>>
> >>> These may be needed for the foreign data handler other than postgres_fdw.
> >>
> >> Could you tell me how such FDWs use TRUNCATE_REL_CONTEXT_CASCADING and _NORMAL? I'm still not sure if TRUNCATE_REL_CONTEXT_CASCADING is really required.
> >>
> > https://www.postgresql.org/message-id/20200102144644.GM3195%40tamriel.snowman.net
> >
> > This is the suggestion when I added the flag to inform cascading.
> >
> > | .... Instead, I'd suggest we have the core code build
> > | up a list of tables to truncate, for each server, based just on the list
> > | passed in by the user, and then also pass in if CASCADE was included or
> > | not, and then let the FDW handle that in whatever way makes sense for
> > | the foreign server (which, for a PG system, would probably be just
> > | building up the TRUNCATE command and running it with or without the
> > | CASCADE option, but it might be different on other systems).
> > |
> > Indeed, it is not a strong technical reason at this moment.
> > (And, I also don't have idea to distinct these differences in my module also.)
>
> CASCADE option mentioned in the above seems the CASCADE clause specified in TRUNCATE command. No? So the above doesn't seem to suggest to include the information about how each table to truncate is picked up. Am I missing something?
>
It might be a bit different context.

> >
> >> With the patch, both inherited and referencing relations are marked as TRUNCATE_REL_CONTEXT_CASCADING? Is this ok for that use? Or we should distinguish them?
> >>
> > In addition, even though my prior implementation distinguished and deliver
> > the status whether the truncate command is issued with NORMAL or ONLY,
> > does the remote query by postgres_fdw needs to follow the manner?
> >
> > Please assume the case when a foreign-table "ft" that maps a remote table
> > with some child-relations.
> > If we run TRUNCATE ONLY ft at the local server, postgres_fdw setup
> > a remote truncate command with "ONLY" qualifier, then remote postgresql
> > server truncate only parent table of the remote side.
> > Next, "SELECT * FROM ft" command returns some valid rows from the
> > child tables in the remote side, even if it is just after TRUNCATE command.
> > Is it a intuitive behavior for users?
>
> Yes, because that's the same behavior as for the local tables. No?
>
No. ;-p

When we define a foreign-table as follows,

postgres=# CREATE FOREIGN TABLE ft (id int, v text)
SERVER loopback OPTIONS (table_name 't_parent',
truncatable 'true');
postgres=# select * from ft;
id | v
----+-------------------
1 | 1 in the parent
2 | 2 in the parent
3 | 3 in the parent
4 | 4 in the parent
11 | 11 in the child_1
12 | 12 in the child_1
13 | 13 in the child_1
21 | 21 in the child_2
22 | 22 in the child_2
23 | 23 in the child_2
(10 rows)

TRUNCATE ONLY eliminates the rows come from parent table on the remote side,
even though this foreign table has no parent-child relationship in the
local side.

postgres=# begin;
BEGIN
postgres=# truncate only ft;
TRUNCATE TABLE
postgres=# select * from ft;
id | v
----+-------------------
11 | 11 in the child_1
12 | 12 in the child_1
13 | 13 in the child_1
21 | 21 in the child_2
22 | 22 in the child_2
23 | 23 in the child_2
(6 rows)

postgres=# abort;
ROLLBACK

In case when a local table (with no children) has same contents,
TRUNCATE command
witll remove the entire table contents.

postgres=# select * INTO tt FROM ft;
SELECT 10
postgres=# select * from tt;
id | v
----+-------------------
1 | 1 in the parent
2 | 2 in the parent
3 | 3 in the parent
4 | 4 in the parent
11 | 11 in the child_1
12 | 12 in the child_1
13 | 13 in the child_1
21 | 21 in the child_2
22 | 22 in the child_2
23 | 23 in the child_2
(10 rows)

postgres=# truncate only tt;
TRUNCATE TABLE
postgres=# select * from tt;
id | v
----+---
(0 rows)

> If this understanding is true, the following note that the patch added is also intuitive, and not necessary? At least "partition leafs" part should be removed because TRUNCATE ONLY fails if the remote table is a partitioned table.
>
> + Pay attention for the case when a foreign table maps remote table
> + that has inherited children or partition leafs.
> + <command>TRUNCATE</command> specifies the foreign tables with
> + <literal>ONLY</literal> clause, remove queries over the
> + <filename>postgres_fdw</filename> also specify remote tables with
> + <literal>ONLY</literal> clause, that will truncate only parent
> + portion of the remote table. In the results, it looks like
> + <command>TRUNCATE</command> command partially eliminated contents
> + of the foreign tables.
>
Base on the above assumption, I don't think it should be a part of
documentation.
On the other hands, we need to describe this API requires FDW driver to wipe out
the entire data on behalf of the foreign tables once they are picked up by the
ExecuteTruncate().

> > Even though we have discussed about the flags and expected behavior of
> > foreign truncate, strip of the relids_extra may be the most straight-forward
> > API design.
> > So, in other words, the API requires FDW driver to make the entire data
> > represented by the foreign table empty, by ExecForeignTruncate().
> > It is probably more consistent to look at DropBehavior for listing-up the
> > target relations at the local relations only.
> >
> > How about your thought?
>
> I was thinking to remove only TRUNCATE_REL_CONTEXT_CASCADING if that's really not necessary. That is, rels_extra is still used to indicate whether each table is specified with ONLY option or not. To do this, we can use _NORMAL and _ONLY. Or we can also make that as the list of boolean flag (indicating whether ONLY is specified or not).
>
I'm inclined to eliminate relids_extra list itself, because FDW
drivers don't need to
distinguish the CASCADING, NORMAL or ONLY cases.
The ExecForeignTruncate receives a list of foreign tables that is already
expanded by the ExecuteTruncate(), thus, all the FDW driver shall do is
just wipe out entire data mapped to the individual foreign tables
Also, FDW driver don't need to know DropBehavior.

> > If we stand on the above design, ExecForeignTruncate() don't needs
> > frels_extra and behavior arguments.
> >
> >> +#define TRUNCATE_REL_CONTEXT_NORMAL 0x01
> >> +#define TRUNCATE_REL_CONTEXT_ONLY 0x02
> >> +#define TRUNCATE_REL_CONTEXT_CASCADING 0x04
> >>
> >> With the patch, these are defined as flag bits. But ExecuteTruncate() seems to always set the entry in relids_extra to either of them, not the combination of them. So we can define them as enum?
> >>
> > Regardless of my above comment, It's a bug.
> > When list_member_oid(relids, myrelid) == true, we have to set proper flag on the
> > relevant frels_extra member, not just ignoring.
>
> One concern about this is that local tables are not processed that way. For local tables, the information (whether ONLY is specified or not) of the table found first is used. For example, when we execute "TRUNCATE ONLY tbl, tbl" and "TRUNCATE tbl, ONLY tbl", the former truncates only parent table because "ONLY tbl" is found first. But the latter truncates the parent and all inherited tables because "tbl" is found first.
>
> If even foreign table follows this manner, current patch's logic seems right.
>
-1. :-(
It should be fixed, even if we try to deliver the relids_extra list.

Best regards,
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai(at)heterodb(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2021-04-08 04:54:39 Re: ModifyTable overheads in generic plans
Previous Message Amul Sul 2021-04-08 04:38:35 Re: CREATE SEQUENCE with RESTART option