Re: proposal: possibility to read dumped table's name from file

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: possibility to read dumped table's name from file
Date: 2021-09-13 13:01:56
Message-ID: B854A710-E1DE-4C6B-A411-E246A95DCDF9@yesql.se
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On 28 Jul 2021, at 09:28, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> út 13. 7. 2021 v 1:16 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us <mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us>> napsal:

> Hence I suggest
>
> include table PATTERN
> exclude table PATTERN
>
> which ends up being the above but with words not [+-].

One issue with this syntax is that the include keyword can be quite misleading
as it's semantic interpretion of "include table t" can be different from
"--table=t". The former is less clear about the fact that it means "exclude
all other tables than " then the latter. It can be solved with documentation,
but I think that needs be to be made clearer.

> Here is an updated implementation of filter's file, that implements syntax proposed by you.

While it's not the format I would prefer, it does allow for most (all?) use
cases expressed in this thread with ample armtwisting applied so let's go ahead
from this point and see if we can agree on it (or a version of it).

A few notes on the patch after a first pass over it:

+(include|exclude)[table|schema|foreign_data|data] <replaceable class="parameter">objectname</replaceable>
Lacks whitespace between keyword and object type. Also, since these are
mandatory parameters, shouldn't they be within '{..}' ?

+ /* skip initial white spaces */
+ while (isblank(*ptr))
+ ptr += 1;
We don't trust isblank() as of 3fd5faed5 due to portability concerns, this
should probably use a version of the pg_isblank() we already have (and possibly
move that to src/common/string.c as there now are more consumers).

+static bool
+isblank_line(const char *line)
This could be replaced with a single call to strspn() as we already do for
parsing the TOC file.

+ /* when first char is hash, ignore whole line */
+ if (*str == '#')
+ continue;
I think we should strip leading whitespace before this to allow commentlines to
start with whitespace, it's easy enough and will make life easier for users.

+ pg_log_error("invalid format of filter file \"%s\": %s",
+ filename,
+ message);
+
+ fprintf(stderr, "%d: %s\n", lineno, line);
Can't we just include the lineno in the error logging and skip dumping the
offending line? Fast-forwarding the pointer to print the offending part is
less useful than a context marker, and in some cases suboptimal. With this
coding, if a pattern is omitted for example the below error message is given:

pg_dump: error: invalid format of filter file "filter.txt": missing object name
1:

The errormessage and the linenumber in the file should be enough for the user
to figure out what to fix.

+ if (keyword && is_keyword(keyword, size, "table"))
+ objecttype = 't';
Should this use an enum, or at least a struct translation the literal keyword
to the internal representation? Magic constants without explicit connection to
their token counterparts can easily be cause of bugs.

If I create a table called "a\nb" and try to dump it I get an error in parsing
the file. Isn't this supposed to work?
$ cat filter.txt
include table "a
b"
$ ./bin/pg_dump --filter=filter.txt
pg_dump: error: invalid format of filter file "filter.txt": unexpected chars after object name
2:

Did you consider implementing this in Bison to abstract some of the messier
parsing logic?

--
Daniel Gustafsson https://vmware.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2021-09-13 13:04:28 Re: [PATCH] test/ssl: rework the sslfiles Makefile target
Previous Message Fabrízio de Royes Mello 2021-09-13 12:48:45 Is SPI + Procedures (with COMMIT) inside a bgworker broken?