Re: identity columns

From: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: identity columns
Date: 2016-09-12 23:59:39
Message-ID: CAKOSWNnnqK-_jP2DMtLHtNhk0DikjCo8qxKQcofR3Lb5ApFubw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/12/16, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> Thank you for this extensive testing. I will work on getting the bugs
> fixed. Just a couple of comments on some of your points:
>
> On 9/9/16 11:45 PM, Vitaly Burovoy wrote:
>> It compiles and passes "make check" tests, but fails with "make
>> check-world" at:
>> test foreign_data ... FAILED
>
> I do not see that. You can you show the diffs?

I can't reproduce it, it is my fault, may be I did not clean build dir.

>> 1. The standard requires "... ALTER COLUMN ... SET GENERATED { ALWAYS
>> | BY DEFAULT }" (9075-2:2011 subcl 11.20), but the patch implements it
>> as "... ALTER COLUMN ... ADD GENERATED { ALWAYS | BY DEFAULT } AS
>> IDENTITY"
>
> SET and ADD are two different things. The SET command just changes the
> parameters of the underlying sequence.

Well... As for me ADD is used when you can add more than one property
of the same kind to a relation (e.g. column or constraint), but SET is
used when you change something and it replaces previous state (e.g.
NOT NULL, DEFAULT, STORAGE, SCHEMA, TABLESPACE etc.)

You can't set ADD more than one IDENTITY to a column, so it should be "SET".

> This can be implemented later and doesn't seem so important now.

Hmm. Now you're passing params to CreateSeqStmt because they are the same.
Is it hard to pass them to AlterSeqStmt (if there is no SET GENERATED")?

> The ADD command is not in the standard, but I needed it for pg_dump, mainly.
> I will need to document this.

Firstly, why to introduce new grammar which differs from the standard
instead of follow the standard?
Secondly, I see no troubles to follow the standard:
src/bin/pg_dump/pg_dump.c:
- "ALTER COLUMN %s ADD GENERATED ",
+ "ALTER COLUMN %s SET GENERATED ",

src/backend/parser/gram.y:
- /* ALTER TABLE <name> ALTER [COLUMN] <colname> ADD GENERATED ... AS
IDENTITY ... */
- | ALTER opt_column ColId ADD_P GENERATED generated_when AS
IDENTITY_P OptParenthesizedSeqOptList
- c->options = $9;
+ /* ALTER TABLE <name> ALTER [COLUMN] <colname> SET GENERATED ... */
+ | ALTER opt_column ColId SET GENERATED generated_when OptSeqOptList
- c->options = $7;

I guess "ALTER opt_column ColId SET OptSeqOptList" is easy to be
implemented, after some research "ALTER opt_column ColId RESTART [WITH
...]" also can be added.

(and reflected in the docs)

>> 14. It would be fine if psql has support of new clauses.
>
> What do you mean by that? Tab completion?

Yes, I'm about it. Or tab completion is usually developed later?

>> 16. I think it is a good idea to not raise exceptions for "SET
>> GENERATED/DROP IDENTITY" if a column has the same type of identity/not
>> an identity. To be consistent with "SET/DROP NOT NULL".
>
> These behaviors are per SQL standard.

Can you point to concrete rule(s) in the standard?

I could not find it in ISO/IEC 9075-2 subclauses 11.20 "<alter
identity column specification>" and 11.21 "<drop identity property
clause>".
Only subclause 4.15.11 "Identity columns" says "The columns of a base
table BT can optionally include not more than one identity column."
(which you don't follow).

For instance, subclause 11.42 <drop character set statement>, General
Rules p.1 says explicitly about exception.
Or (for columns): 11.4 <column definition>, General Rules p.3: "The
<column name> in the <column definition> SHALL NOT be equivalent to
the <column name> of any other column of T."

> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Several additional thoughts:
1. I think it is wise to add ability to set name of a sequence (as
PG's extension of the standard) to SET GENERATED or GENERATED in a
relation definition (something like CONSTRAINTs names), without it it
is hard to fix conflicts with other sequences (e.g. from serial pseudo
type) and manual changes of the sequence ("alter seq rename", "alter
seq set schema" etc.).
2. Is it useful to rename sequence linked with identity constraint
when table is renamed (similar way when sequence moves to another
schema following the linked table)?
3. You're setting OWNER to a sequence, but what about USAGE privilege
to roles have INSERT/UPDATE privileges to the table? For security
reasons table is owned by a role different from roles which are using
the table (to prevent changing its definition).

--
Best regards,
Vitaly Burovoy

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-09-12 23:59:41 Re: PATCH: Exclude additional directories in pg_basebackup
Previous Message Andres Freund 2016-09-12 23:58:06 Re: Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling)