Re: proposal: possibility to read dumped table's name from file

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>
Subject: Re: proposal: possibility to read dumped table's name from file
Date: 2020-11-19 19:51:18
Message-ID: CAFj8pRBuEOCTGR8VwhQi3_dNg6v=g5zBB4kmACRsnAFiWJKWVA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 17, 2020 at 22:53, Justin Pryzby <pryzby(at)telsasoft(dot)com>
wrote:

> On Wed, Nov 11, 2020 at 06:49:43AM +0100, Pavel Stehule wrote:
> > >> Perhaps this feature could co-exist with a full blown configuration for
> > >> pg_dump, but even then there's certainly issues with what's proposed-
> > >> how would you handle explicitly asking for a table which is named
> > >> " mytable" to be included or excluded? Or a table which has a newline
> > >> in it? Using a standardized format which supports the full range of
> > >> what we do in a table name, explicitly and clearly, would address these
> > >> issues and also give us the flexibility to extend the options which
> > >> could be used through the configuration file beyond just the filters in
> > >> the future.
>
> I think it's a reasonable question - why would a new configuration file option
> include support for only a handful of existing arguments but not the rest.
>

I don't see a serious technical problem here: extending the parser is not
hard work. What I am missing is a use case. The "--filter" option solves one
problem, the limited length of the command line. That is a clean use case,
and the supported options are exactly those that can be repeated on the
command line. Nothing less, nothing more. The format is designed just for
this purpose.
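
To make the intended use concrete, here is a minimal sketch (in Python, not
part of the patch) of how a script could generate filter lines for a long
list of tables and pipe them to pg_dump. The "+t" prefix is taken from the
example later in this mail; the "-t" prefix for exclusion and the helper
names are assumptions made only for illustration.

```python
def quote_ident(name: str) -> str:
    """Always double-quote an identifier, doubling embedded double quotes.

    Unlike the SQL function quote_ident(), this sketch quotes
    unconditionally, which keeps the generated lines uniform.
    """
    return '"' + name.replace('"', '""') + '"'


def filter_lines(include=(), exclude=()) -> str:
    """Build the text of a filter file, one object per line.

    "+t" marks a table to include (as in the example in this thread);
    "-t" for exclusion is an assumption of this sketch.
    """
    lines = [f"+t {quote_ident(n)}" for n in include]
    lines += [f"-t {quote_ident(n)}" for n in exclude]
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # The output could be piped to: pg_dump --filter=/dev/stdin
    print(filter_lines(include=["bad Name", "orders"], exclude=["tmp"]), end="")
```

Because the list is read from a file or a pipe, its length is no longer
limited by the maximum command line size.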

If we wanted to implement an alternative to configuration via the command
line and environment variables, the use case should be defined first. Once
it is defined, we can talk about the implementation and a suitable format.
There are a lot of interesting formats, but I don't see why such an
alternative configuration would be helpful for pg_dump. Using external
libraries for richer formats means a new dependency, the need to solve
portability issues, and maybe other problems, so there should be a good use
case for it. Passing a list of tables to dump doesn't need a rich format.

I cannot imagine using a config file that mixes generated object names with
other options. Maybe a configuration that is short enough to be written by
hand could be usable. But when I think about calling pg_dump from bash
scripts, it is much more practical to use the usual command line options and
pass the list of objects through a pipe. I really don't see the use case for
a dedicated pg_dump config file, and if there is one, it is very different
from the use case for the "--filter" option.

> > > This is the correct argument - I will check a possibility to use strange
> > > names, but there is the same possibility and functionality like we allow
> > > from the command line. So you can use double quoted names. I'll check it.
> >
> > I checked
> > echo "+t \"bad Name\"" | /usr/local/pgsql/master/bin/pg_dump --filter=/dev/stdin
> > It is working without any problem
>
> I think it couldn't possibly work with newlines, since you call
> pg_get_line().
> I realize that entering a newline into the shell would also be a PITA, but that
> could be one *more* reason to support a config file - to allow terrible table
> names to be in a file and avoid writing dash tee quote something enter else
> quote in a pg_dump command, or shell script.
>

The new patch works with names that contain newlines:

[pavel(at)localhost postgresql.master]$ psql -At -X -c "select '+t ' ||
quote_ident(table_name) from information_schema.tables where table_name
like 'foo%'"| /usr/local/pgsql/master/bin/pg_dump --filter=/dev/stdin
--
-- PostgreSQL database dump
--

-- Dumped from database version 14devel
-- Dumped by pg_dump version 14devel

--
-- Name: foo boo; Type: TABLE; Schema: public; Owner: pavel
--

CREATE TABLE public."foo
boo" (
a integer
);

ALTER TABLE public."foo
boo" OWNER TO pavel;

--
-- Data for Name: foo boo; Type: TABLE DATA; Schema: public; Owner: pavel
--

COPY public."foo
boo" (a) FROM stdin;
\.

--
-- PostgreSQL database dump complete
--
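
For illustration only, the following Python sketch shows one way to read
such "logical lines", where a double-quoted name may span several physical
lines and a doubled double quote stands for a literal one. It is not the C
code from the patch; the function name, the tuple layout, and the error
messages are assumptions of this sketch.

```python
def parse_filter(text: str):
    """Parse filter text into (include, objtype, name) tuples.

    Each entry is "<+|-><type> <name>". A name in double quotes may
    contain newlines; "" inside quotes means one literal double quote.
    """
    entries = []
    i, n = 0, len(text)
    while i < n:
        # Skip whitespace between entries, including empty lines.
        while i < n and text[i].isspace():
            i += 1
        if i >= n:
            break
        sign = text[i]
        if sign not in "+-":
            raise ValueError(f"entry must start with + or - (offset {i})")
        if i + 1 >= n or text[i + 1].isspace():
            raise ValueError(f"missing object type (offset {i})")
        objtype = text[i + 1]
        i += 2
        # Skip blanks (but not newlines) between the type and the name.
        while i < n and text[i] in " \t":
            i += 1
        if i < n and text[i] == '"':
            i += 1
            buf = []
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted name")
                c = text[i]
                if c == '"':
                    if i + 1 < n and text[i + 1] == '"':
                        buf.append('"')  # doubled quote -> literal quote
                        i += 2
                        continue
                    i += 1  # closing quote
                    break
                buf.append(c)  # newlines are kept as part of the name
                i += 1
            name = "".join(buf)
            # Only whitespace may follow the closing quote on this line.
            while i < n and text[i] != "\n":
                if not text[i].isspace():
                    raise ValueError(f"unexpected text after name (offset {i})")
                i += 1
        else:
            start = i
            while i < n and text[i] != "\n":
                i += 1
            name = text[start:i].strip()
            if not name:
                raise ValueError("missing object name")
        entries.append((sign == "+", objtype, name))
    return entries
```

With the input from the example above, parse_filter('+t "foo\nboo"\n')
yields a single entry whose name contains the newline.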

> I fooled with argument parsing to handle reading from a file in the quickest
> way. As written, this fails to handle multiple config files, and special table
> names, which need to support arbitrary, logical lines, with quotes surrounding
> newlines or other special chars. As written, the --config file is parsed
> *after* all other arguments, so it could override previous args (like
> --no-blobs --no-blogs, --file, --format, --compress, --lock-wait), which I
> guess is bad, so the config file should be processed *during* argument
> parsing.
> Unfortunately, I think that suggests duplicating parsing of all/most the
> argument parsing for config file support - I'd be happy if someone suggested a
> better way.
>
> BTW, in your most recent patch:
> s/empty rows/empty lines/
> unbalanced parens: "invalid option type (use [+-]"
>

Both should be fixed now. Thank you for checking.

Regards

Pavel

> @cfbot: I renamed the patch so please ignore it.
>
> --
> Justin
>

Attachment Content-Type Size
pg_dump-filter-option-20201119.patch text/x-patch 13.7 KB
