Re: TRUNCATE on foreign table

From: Kazutaka Onishi <onishi(at)heterodb(dot)com>
To: Kohei KaiGai <kaigai(at)heterodb(dot)com>
Cc: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: TRUNCATE on foreign table
Date: 2021-04-04 06:30:23
Message-ID: CAJuF6cMqL8TzHVJxGTEKL3ijcNW7q5sNnpnjxoc6KJ82TYQRsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

For now, I've fixed the v8 according to your comments, excluding
anything related with 'hash table' and 'do_sql_commands'.

> 1) We usually have the struct name after "+typedef struct
> ForeignTruncateInfo", please refer to other struct defs in the code
> base.

I've modified the definition.
By the way, there're many "typedef struct{ ... }NameOfStruct;" in
codes, about 40% of other struct defs (checked by find&grep),
thus I felt the way is not "MUST".

> 2) We should add ORDER BY clause(probably ORDER BY id?) for data
> generating select queries in added tests, otherwise tests might become
> unstable.

I've added "ORDER BY" at the postges_fdw test.

> 3) How about dropping the tables, foreign tables that got created for
> testing in postgres_fdw.sql?

I've added "cleanup" commands.

> 4) I think it's not "foreign-tables"/"foreign-table", it can be
> "foreign tables"/"foreign table", other places in the docs use this
> convention.
> + the context where the foreign-tables are truncated. It is a list
> of integers and same length with

I've replaced "foreign-table" to "foreign table".

> 5) Can't we use do_sql_command function after making it non static? We
> could go extra mile, that is we could make do_sql_command little more
> generic by passing some enum for each of PQsendQuery,
> PQsendQueryParams, PQsendQueryPrepared and PQsendPrepare and replace
> the respective code chunks with do_sql_command in postgres_fdw.c.

I've skipped this for now.
I feel it sounds cool, but not easy.
It should be added by another patch because it's not only related to TRUNCATE.

> 6) A white space error when the patch is applied.
> contrib/postgres_fdw/postgres_fdw.c:2913: trailing whitespace.

I've checked the patch and clean spaces.
But I can't confirmed this message by attaching(patch -p1 < ...) my v8 patch.
If this still occurs, please tell me how you attach the patch.

> 7) I may be missing something here. Why do we need a hash table at
> all? We could just do it with a linked list right? Is there a specific
> reason to use a hash table? IIUC, the hash table entries will be lying
> around until the local session exists since we are not doing
> hash_destroy.

I've skipped this for now.

> 8) How about having something like this?
> + <command>TRUNCATE</command> can be used for foreign tables if the
> foreign data wrapper supports, for instance, see <xref
> linkend="postgres-fdw"/>.

Sounds good. I've added.

9)
> + <command>TRUNCATE</command> for each foreign server being involved
>
> + in one <command>TRUNCATE</command> command (note that invocations
> The 'being' in above sentence can be omitted.

I've fixed this.

10)
> + the context where the foreign-tables are truncated. It is a list of integers and same length with
> There should be a verb between 'and' and same :
> It is a list of integers and has same length with

I've fixed this.

11)
> + * Information related to truncation of foreign tables. This is used for
> + * the elements in a hash table that uses the server OID as lookup key,
> The 'uses' is for 'This' (the struct), so 'that' should be 'and':
> the elements in a hash table and uses
> Alternatively:
> the elements in a hash table. It uses

I've fixed this.

12)
> + relids_extra = lappend_int(relids_extra, (recurse ? TRUNCATE_REL_CONTEXT__NORMAL : TRUNCATE_REL_CONTEXT__ONLY));
> I am curious: isn't one underscore enough in the identifier (before NORMAL and ONLY) ?
> I suggest naming them TRUNCATE_REL_CONTEXT_NORMAL and TRUNCATE_REL_CONTEXT_ONLY

> + relids_extra = lappend_int(relids_extra, TRUNCATE_REL_CONTEXT__CASCADED);
> I wonder if TRUNCATE_REL_CONTEXT_CASCADING is better than TRUNCATE_REL_CONTEXT__CASCADED. Note the removal of the extra underscore.
> In English, we say: truncation cascading to foreign table.
> w.r.t. Bharath's question on using hash table, I think the reason is that the search would be more efficient:

I've changed these labels shown below:
TRUNCATE_REL_CONTEXT__NORMAL -> TRUNCATE_REL_CONTEXT_NORMAL
TRUNCATE_REL_CONTEXT__ONLY -> TRUNCATE_REL_CONTEXT_ONLY
TRUNCATE_REL_CONTEXT__CASCADED -> TRUNCATE_REL_CONTEXT_CASCADING

14)
> + ft_info = hash_search(ft_htab, &server_oid, HASH_ENTER, &found);
> and
> + while ((ft_info = hash_seq_search(&seq)) != NULL)
> + * Now go through the hash table, and process each entry associated to the
> + * servers involved in the TRUNCATE.
> associated to -> associated with

I've fixed this.

14) Should the hash table be released at the end of ExecuteTruncateGuts() ?

I've skipped this for now.

2021年4月4日(日) 14:13 Kohei KaiGai <kaigai(at)heterodb(dot)com>:
>
> 2021年4月4日(日) 13:07 Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>:
> >
> > On Sat, Apr 3, 2021 at 8:31 PM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
> > > w.r.t. Bharath's question on using hash table, I think the reason is that the search would be more efficient:
> >
> > Generally, sequential search would be slower if there are many entries
> > in a list. Here, the use case is to store all the foreign table ids
> > associated with each foreign server and I'm not sure how many foreign
> > tables will be provided in a single truncate command that belong to
> > different foreign servers. I strongly feel the count will be less and
> > using a list would be easier than to have a hash table. Others may
> > have better opinions.
> >
> https://www.postgresql.org/message-id/20200115081126.GK2243@paquier.xyz
>
> It was originally implemented using a simple list, then modified according to
> the comment by Michael.
> I think it is just a matter of preference.
>
> > > Should the hash table be released at the end of ExecuteTruncateGuts() ?
> >
> > If we go with a hash table and think that the frequency of "TRUNCATE"
> > commands on foreign tables is heavy in a local session, then it does
> > make sense to not destroy the hash, otherwise destroy the hash.
> >
> In most cases, TRUNCATE is not a command frequently executed.
> So, exactly, it is just a matter of preference.
>
> Best regards,
> --
> HeteroDB, Inc / The PG-Strom Project
> KaiGai Kohei <kaigai(at)heterodb(dot)com>

Attachment Content-Type Size
pgsql14-truncate-on-foreign-table.v9.patch application/octet-stream 153.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2021-04-04 07:16:30 Re: [PATCH] Implement motd for PostgreSQL
Previous Message Joel Jacobson 2021-04-04 06:23:56 Re: [PATCH] Implement motd for PostgreSQL