Re: proposal: possibility to read dumped table's name from file

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Surafel Temesgen <surafel3000(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: proposal: possibility to read dumped table's name from file
Date: 2021-07-14 10:08:01
Message-ID: e33b7c84-40f2-8a3b-3197-2323b66dca83@enterprisedb.com
Lists: pgsql-hackers

On 7/14/21 2:18 AM, Stephen Frost wrote:
> Greetings,
>
> * Alvaro Herrera (alvherre(at)2ndquadrant(dot)com) wrote:
>> On 2021-Jul-13, Stephen Frost wrote:
>>> The simplest possible format isn't going to work with all the different
>>> pg_dump options and it still isn't going to be 'simple' since it needs
>>> to work with the flexibility that we have in what we support for object
>>> names,
>>
>> That's fine. If people want a mechanism that allows changing the other
>> pg_dump options that are not related to object filtering, they can
>> implement a configuration file for that.
>
> It's been said multiple times that people *do* want that and that they
> want it to all be part of this one file, and specifically that they
> don't want to end up with a file structure that actively works against
> allowing other options to be added to it.
>

I have no problem believing some people want to be able to specify
pg_dump parameters in a file, similarly to IMPDP/EXPDP parameter files
etc. That seems useful, but I doubt they considered the case with many
filter rules ... which is what "my people" want.

Not sure how keeping the filter rules in a separate file (which I assume
is what you mean by "file structure"), with a format tailored for filter
rules, works *actively* against adding options to the "main" config.

I'm not buying the argument that keeping some of the stuff in a separate
file is an issue - plenty of established tools do that, the concept of
"including" a config is not a radical new thing, and I don't expect we'd
have many options supported by a file.

In any case, I think user input is important, but ultimately it's up to
us to reconcile the conflicting requirements coming from various users
and come up with a reasonable compromise design.

>>> I don't know that the options that I suggested previously would
>>> definitely work or not but they at least would allow other projects like
>>> pgAdmin to leverage existing code for parsing and generating these
>>> config files.
>>
>> Keep in mind that this patch is not intended to help pgAdmin
>> specifically. It would be great if pgAdmin uses the functionality
>> implemented here, but if they decide not to, that's not terrible. They
>> have survived decades without a pg_dump configuration file; they still
>> can.
>
> The adding of a config file for pg_dump should specifically be looking
> at pgAdmin as the exact use-case for having such a capability.
>
>> There are several votes in this thread for pg_dump to gain functionality
>> to filter objects based on a simple specification -- particularly one
>> that can be written using shell pipelines. This patch gives it.
>
> And several votes for having a config file that supports, or at least
> can support in the future, the various options which pg_dump supports-
> and active voices against having a new file format that doesn't allow
> for that.
>

IMHO the whole "problem" here stems from the question of whether there
should be a single universal pg_dump config file, containing everything
including the filter rules. I'm of the opinion it's better to keep the
filter rules separate, mainly because:

1) simplicity - Options (key/value) and filter rules (with more internal
structure) seem quite different, and mixing them in the same file will
just make the format more complex.

2) flexibility - Keeping the filter rules in a separate file makes it
easier to reuse the same set of rules with different pg_dump configs,
specified in (much smaller) config files.

So in principle, the "main" config could use e.g. TOML or whatever we
find most suitable for this type of key/value config file (or we could
just use the same format as for postgresql.conf et al). And the filter
rules could use something as simple as CSV (yes, I know it's not great,
but there are plenty of parsers, it handles multi-line strings etc.).
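
To make the split a bit more concrete, here is a rough sketch in Python.
It is purely illustrative: the three-column rule layout, the
include/exclude keywords, the option names and the filter_file key are
all assumptions made up for this example, not something the patch
implements. It only shows that an off-the-shelf CSV parser copes with
quoted and multi-line object names, and that two small key/value configs
can point at the same rules file.

```python
import csv
import io

# Hypothetical filter rules: action, object type, name pattern.
# The second and third rows use CSV quoting for a name containing a
# comma and double quotes, and for a name spanning two lines.
FILTER_RULES_CSV = '''include,table,public.orders
exclude,table,"public.""Odd, Name"""
include,table,"public.multi
line"
'''

# Two hypothetical "main" configs (simplified key = value lines) that
# reuse the same rules file through an assumed filter_file key.
NIGHTLY_CONF = "# full nightly dump\nformat = custom\nfilter_file = rules.csv\n"
SCHEMA_CONF = "schema_only = on\nfilter_file = rules.csv\n"


def parse_filter_rules(text):
    """Return a list of (action, object_type, pattern) tuples."""
    rules = []
    # newline="" leaves line endings alone so the csv module can handle
    # newlines embedded in quoted fields itself.
    for row in csv.reader(io.StringIO(text, newline="")):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got: {row!r}")
        action, object_type, pattern = row
        if action not in ("include", "exclude"):
            raise ValueError(f"unknown action: {action!r}")
        rules.append((action, object_type, pattern))
    return rules


def parse_options(text):
    """Parse simplified 'key = value' lines, skipping blanks and # comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key = value, got: {line!r}")
        options[key.strip()] = value.strip()
    return options


rules = parse_filter_rules(FILTER_RULES_CSV)
nightly = parse_options(NIGHTLY_CONF)
schema_only = parse_options(SCHEMA_CONF)
```

The options parser is deliberately much simpler than what
postgresql.conf accepts (no quoting, no trailing comments); the point is
only that the two files have different shapes and can be parsed
independently.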

>>> I'm not completely against inventing something new, but I'd really
>>> prefer that we at least try to make something existing work first
>>> before inventing something new that everyone is going to have to deal
>>> with.
>>
>> That was discussed upthread and led nowhere.
>
> You're right- no one followed up on that. Instead, one group continues
> to push for 'simple' and to just accept what's been proposed, while
> another group counters that we should be looking at the broader design
> question and work towards a solution which will work for us down the
> road, and not just right now.
>

I have a fairly thick skin, but I have to admit I rather dislike how this
paints the people arguing for simplicity.

IMO simplicity is a perfectly legitimate (and desirable) design feature,
and simpler solutions often fare better in the long run. Yes, we need to
look at the broader design, no doubt about that.

> One thing remains clear- there's no consensus here.
>

True.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
