Re: proposal: possibility to read dumped table's name from file

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: possibility to read dumped table's name from file
Date: 2021-09-17 11:51:34
Message-ID: CAFj8pRAUchyBYqCPMV+iXMyae532vWb9tSWEGWfSv8OuEmeYzQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

pá 17. 9. 2021 v 13:42 odesílatel Daniel Gustafsson <daniel(at)yesql(dot)se>
napsal:

> > On 15 Sep 2021, at 19:31, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
> > po 13. 9. 2021 v 15:01 odesílatel Daniel Gustafsson <daniel(at)yesql(dot)se
> <mailto:daniel(at)yesql(dot)se>> napsal:
>
> > One issue with this syntax is that the include keyword can be quite
> misleading
> > as it's semantic interpretion of "include table t" can be different from
> > "--table=t". The former is less clear about the fact that it means
> "exclude
> > all other tables than " then the latter. It can be solved with
> documentation,
> > but I think that needs be to be made clearer.
> >
> > I invite any documentation enhancing and fixing
>
> Sure, that can be collabored on. This gist is though that IMO the
> keywords in
> the filter file aren't as clear on the sideeffects as the command line
> params,
> even though they are equal in functionality.
>
> > + pg_log_error("invalid format of filter file \"%s\": %s",
> > + filename,
> > + message);
> > +
> > + fprintf(stderr, "%d: %s\n", lineno, line);
> > Can't we just include the lineno in the error logging and skip dumping
> the
> > offending line? Fast-forwarding the pointer to print the offending part
> is
> > less useful than a context marker, and in some cases suboptimal. With
> this
> > coding, if a pattern is omitted for example the below error message is
> given:
> >
> > pg_dump: error: invalid format of filter file "filter.txt": missing
> object name
> > 1:
> >
> > The errormessage and the linenumber in the file should be enough for the
> user
> > to figure out what to fix.
> >
> > I did it like you proposed, but still, I think the content can be useful.
>
> Not when there is no content in the error message, printing an empty
> string for
> a line number which isn't a blank line doesn't seem terribly helpful. If
> we
> know the error context is empty, printing a tailored error hint seems more
> useful for the user.
>
> > More times you read dynamically generated files, or you read data from
> stdin, and in complex environments it can be hard regenerate new content
> for debugging.
>
> That seems odd given that the arguments for this format has been that it's
> likely to be handwritten.
>
> > If I create a table called "a\nb" and try to dump it I get an error in
> parsing
> > the file. Isn't this supposed to work?
> > $ cat filter.txt
> > include table "a
> > b"
> > $ ./bin/pg_dump --filter=filter.txt
> > pg_dump: error: invalid format of filter file "filter.txt":
> unexpected chars after object name
> > 2:
> >
> > probably there was some issue, because it should work. I tested a new
> version and this is tested in new regress tests. Please, check
>
> That seems to work, but I am unable to write a filter statement which can
> handle this relname:
>
> CREATE TABLE "a""
> ""b" (a integer);
>
> Are you able to craft one for that?
>

I am not able to dump this directly in pg_dump. Is it possible?

> > Did you consider implementing this in Bison to abstract some of the
> messier
> > parsing logic?
> >
> > Initially not, but now, when I am thinking about it, I don't think so
> Bison helps. The syntax of the filter file is nicely linear. Now, the code
> of the parser is a little bit larger than minimalistic, but it is due to
> nicer error's messages. The raw implementation in Bison raised just "syntax
> error" and positions. I did code refactoring, and now the scanning, parsing
> and processing are divided into separated routines. Parsing related code
> has 90 lines. In this case, I don't think using a parser grammar file can
> carry any benefit. grammar is more readable, sure, but we need to include
> bison, we need to handle errors, and if we want to raise more helpful
> errors than just "syntax error", then the code will be longer.
>
> I'm not so concerned by code size, but rather parsing of quotations etc and
> being able to reason about it's correctness. IMHO that's easier done by
> reading a defined grammar than parsing a handwritten parser.
>
> Will do a closer review on the patch shortly.
>
> --
> Daniel Gustafsson https://vmware.com/
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2021-09-17 11:56:46 Re: proposal: possibility to read dumped table's name from file
Previous Message Pavel Stehule 2021-09-17 11:42:55 Re: proposal: possibility to read dumped table's name from file