Re: Option to dump foreign data in pg_dump

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Luis Carril <luis(dot)carril(at)swarm64(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Option to dump foreign data in pg_dump
Date: 2019-11-09 20:38:55
Message-ID: BC1853E7-BFCD-4C71-86D1-E1037E64BBD9@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers



On 20 Sep 2019, at 17:20, Luis Carril <luis(dot)carril(at)swarm64(dot)com> wrote:

I took a look at this patch again today for a review of the latest version.

While I still think it's a potential footgun due to read-only FDW's, I can see

usecases for having it so I'm mildly +1 on adding it.

The patch applies to master with a little bit of fuzz and builds without warnings.

Regarding the functionality, it's a bit unfortunate that it works differently

for --inserts dumps and COPY dumps.  As an example, suppose we have a file_fdw

table in CSV format with 10 rows where row 10 is malformed.  The COPY dump will

include 9 rows and exit on an error, the --inserts dump won't include any rows

before the error.  Since the dump fails with an error, one can argue that it

doesn't matter too much, but it's still not good to have such different

behaviors based on program internals.  (the example is of course not terribly

realistic but can be extrapolated from.) Maybe I'm the only one concerned though.



*I suggest the option to be just –foreign-data because if we make it –include-foreign-data its expected to have –exclude-foreign-data option too.

Several pg_dump options have no counterpart, e.g --enable-row-security does not have a disable (which is the default). Also calling it --foreign-data would sound similar to the --table,  by default all tables are dumped, but with --table only the selected tables are dumped. While without --include-foreign-data all data is excluded, and only with the option some foreign data would be included.

I agree that --include-foreign-data conveys the meaning of the option better,

+1 for keeping this.



*please add test case

I added tests cases for the invalid inputs. I'll try to make a test case for the actual dump of foreign data, but that requires more setup, because a functional fdw is needed there.

This is where it becomes a bit messy IMO.  Given that there has been a lot of

effort spent on adding test coverage for pg_dump, I think it would be a shame

to add such niche functionality without testing more than the program options.

You are however right that in order to test, it requires a fully functional

FDW.

I took the liberty to add a testcase to the pg_dump TAP tests which includes a

dummy FDW that always return 10 predetermined rows (in order to keep tests

stable).  There is so far just a happy-path test, since I don't want to spend

time until it's deemed of interest (it is adding a lot of code for a small

test), but it at least illustrates how this patch could be tested.  The

attached patch builds on top of yours.




The below check may not be required:

+ if (strcmp(optarg, "") == 0)

+ {

+ pg_log_error("empty string is not a valid pattern in --include-foreign-data");

+ exit_nicely(1);

+ }


We need to conserve this check to avoid that the use of '--include-foreign-data=', which would match all foreign servers. And in previous messages it was established that that behavior is too coarse.

I still believe thats the desired functionality.

Also, a few small nitpicks on the patch:

This should probably be PATTERN instead of SERVER, to match the rest of the

help output:

+   printf(_("  --include-foreign-data=SERVER\n"

+            "                               include data of foreign tables with the named\n"

+            "                               foreign servers in dump\n"));

It would be good to add a comment explaining the rationale for adding

RELKIND_FOREIGN_TABLE to this block, to assist readers:

-   if (tdinfo->filtercond)

+   if (tdinfo->filtercond || tbinfo->relkind == RELKIND_FOREIGN_TABLE)

cheers ./daniel


In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2019-11-09 21:13:13 Re: base backup client as auxiliary backend process
Previous Message Tom Lane 2019-11-09 20:34:46 Re: proposal: minscale, rtrim, btrim functions for numeric