Re: pg14 psql broke \d datname.nspname.relname

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: "Hamlin, Garick L" <ghamlin(at)isc(dot)upenn(dot)edu>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg14 psql broke \d datname.nspname.relname
Date: 2021-11-04 16:08:31
Message-ID: BFF8D9DD-4FBC-4E0F-9123-A88A57ECE211@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Nov 4, 2021, at 6:37 AM, Hamlin, Garick L <ghamlin(at)isc(dot)upenn(dot)edu> wrote:
>
>> If we pass the dots through to the POSIX regular expression, we can
>> only do that either for the table name or the schema name, not both -
>> either the first or last dot must mark the boundary between the two.
>> That means that you can't use all the same regexy things for one as
>> you can for the other, which is a strange system. I knew that your
>> patch made it do that, and I committed it that way because I didn't
>> think it really mattered, and also because the whole system is already
>> pretty strange, so what's one more bit of strangeness?
>
> Rather than trying to guess at the meaning of each '.' based on the total
> string. I wonder, if we could for v15 require '.' to be spelled in longer way
> if it needs to be treated as part of the regex.

We're trying to fix an edge case, not change how the basic case works. Most users are accustomed to using patterns from within psql like:

\dt myschema.mytable

Whatever patch we accept must not break these totally normal and currently working cases.

> Perhaps requiring something like '(.)' be used rather than a bare '.'
> might be good enough and documenting otherwise it's really a separator?
> I suppose we could also invent a non-standard class as a stand in like
> '[::any::]', but that seems kinda weird.

If I understand you, that would require the above example to be written as:

\dt myschema(.)mytable

which nobody expects to have to do, and which would be a very significant breaking change in v15. I can't see anything like that being accepted.

> I think it might be possible to give better error messages long term
> if we knew what '.' should mean without looking at the whole thing.

You quote a portion of an email from Robert. After that email, there were several more, and a new patch. The commit message of the new patch explains what it does. I wonder if you'd review that message, quoted here, or even better, review the entire patch. Does this seem like an ok fix to you?

Subject: [PATCH v2] Reject patterns with too many parts or wrong db

Object name patterns used by pg_dump and psql potentially contain
multiple parts (dotted names), and nothing prevents users from
specifying a name with too many parts, nor specifying a
database-qualified name for a database other than the currently
connected database. Prior to PostgreSQL version 14, pg_dump,
pg_dumpall and psql quietly discarded extra parts of the name on the
left. For example, `pg_dump -t` only expected a possibly schema
qualified table name, not a database name, and the following command

pg_dump -t production.marketing.customers

quietly ignored the "production" database name with neither warning
nor error. Commit 2c8726c4b0a496608919d1f78a5abc8c9b6e0868 changed
the behavior of name parsing. Where names contain more than the
maximum expected number of dots, the extra dots on the right were
interpreted as part of the name, such that the above example was
interpreted as schema=production, relation=marketing.customers.
This turns out to be highly unintuitive to users.

We've had reports that users sometimes copy-and-paste database- and
schema-qualified relation names from the logs.
https://www.postgresql.org/message-id/20211013165426.GD27491%40telsasoft.com

There is no support for cross database references, but allowing a
database qualified pattern when the database portion matches the
current database, as in the above report, seems more friendly than
rejecting it, so do that. We don't allow the database portion
itself to be a pattern, because if it matched more than one database
(including the current one), there would be confusion about which
database(s) were processed.

Consistent with how we allow db.schemapat.relpat in pg_dump and psql,
also allow db.schemapat for specifying schemas, as:

\dn mydb.myschema

in psql and

pg_dump --schema=mydb.myschema

Fix the pre-v14 behavior of ignoring leading portions of patterns
containing too many dotted names, and the v14.0 misfeature of
combining trailing portions of such patterns, and instead reject
such patterns in all cases by raising an error.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-11-04 16:30:00 Re: WIP: expression evaluation improvements
Previous Message Jeff Davis 2021-11-04 16:03:52 Re: Predefined role pg_maintenance for VACUUM, ANALYZE, CHECKPOINT.