Re: identity columns

From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: identity columns
Date: 2017-03-15 17:07:03
Message-ID: ce47104c-026a-e3bb-cdfe-68b2052016d6@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Vitaly, will you be able to review this again?

On 2/28/17 21:23, Peter Eisentraut wrote:
> New patch that fixes everything. ;-) Besides hopefully addressing all
> your issues, this version also no longer requires separate privileges on
> the internal sequence, which was the last outstanding functional issue
> that I am aware of. This version also no longer stores a default entry
> in pg_attrdef. Instead, the rewriter makes up the default expression on
> the fly. This makes some things much simpler.
>
> On 1/4/17 19:34, Vitaly Burovoy wrote:
>> 1. The fact COPY ignores GENERATED ALWAYS constraint (treats as
>> GENERATED BY DEFAULT) should be mentioned as well as rules.
>
> fixed (documentation added)
>
> What do you mean for rules?
>
>> 2. Usually error message for identical columns (with LIKE clause)
>> looks like this:
>> test=# CREATE TABLE def(i serial, n1 int NOT NULL, n2 int);
>> CREATE TABLE
>> test=# CREATE TABLE test(i serial, LIKE def INCLUDING ALL);
>> ERROR: column "i" specified more than once
>>
>> but for generated columns it is disorienting enough:
>> test=# CREATE TABLE idnt(i int GENERATED BY DEFAULT AS IDENTITY);
>> CREATE TABLE
>> test=# CREATE TABLE test(i int GENERATED BY DEFAULT AS IDENTITY, LIKE
>> idnt INCLUDING ALL);
>> ERROR: relation "test_i_seq" already exists
>
> Yeah, this is kind of hard to fix though, because the sequence names
> are decided during parse analysis, when we don't see that the same
> sequence name is also used by another new column. We could maybe
> patch around it in an ugly way, but it doesn't seem worth it for this
> marginal case.
>
>> 3. Strange error (not about absence of a column; but see pp.5 and 8):
>> test=# ALTER TABLE idnt ALTER COLUMN o ADD GENERATED ALWAYS AS IDENTITY;
>> ERROR: identity column type must be smallint, integer, or bigint
>
> What's wrong with that?
>
>> 4. Altering type leads to an error:
>> test=# ALTER TABLE idnt ALTER COLUMN i TYPE bigint;
>> ERROR: cannot alter data type of identity column "i"
>>
>> it is understandable for non-integers. But why do you forbid changing
>> to the supported types?
>
> fixed (is now allowed)
>
>> 5. Attempt to make a column be identity fails:
>> test=# ALTER TABLE idnt ALTER COLUMN n1 SET GENERATED BY DEFAULT;
>> ERROR: column "n1" of relation "idnt" is not an identity column
>>
>> As I understand from the Spec, chapter 11.20 General Rule 2)b) says
>> about the final state of a column without mentioning of the initial
>> state.
>> Therefore even if the column has the initial state "not generated",
>> after the command it changes the state to either "GENERATED ALWAYS" or
>> "GENERATED BY DEFAULT".
>
> 11.12 <alter column definition> states that "If <alter identity column
> specification> or <drop identity property clause> is specified, then C
> shall be an identity column." So this clause is only to alter an
> existing identity column, not make a new one.
>
>> 6. The docs mention a syntax:
>> ALTER [ COLUMN ] <replaceable
>> class="PARAMETER">column_name</replaceable> { SET GENERATED { ALWAYS |
>> BY DEFAULT } | SET <replaceable>sequence_option</replaceable> | RESET
>> } [...]
>>
>> but the command fails:
>> test=# ALTER TABLE idnt ALTER COLUMN i RESET;
>> ERROR: syntax error at or near ";"
>> LINE 1: ALTER TABLE idnt ALTER COLUMN i RESET;
>
> This works for me. Check again please.
>
>> 7. (the same line of the doc)
>> usually ellipsis appears in inside parenthesis with clauses can be
>> repeated, in other words it should be written as:
>> ALTER <skipped> column_name</replaceable> ( { SET GENERATED { ALWAYS |
>> BY DEFAULT } | <skipped> } [...] )
>
> But there are no parentheses in that syntax. I think the syntax
> synopsis as written is correct.
>
>> 8. To avoid unnecessary syntax "ALTER COLUMN ... ADD GENERATED ..."
>> and use existing syntax "ALTER COLUMN ... SET GENERATED ..." you can
>> just make the "SET" word be optional as a PG's extension. I.e. instead
>> of:
>> test=# ALTER TABLE idnt ALTER i SET GENERATED ALWAYS SET INCREMENT BY 4;
>>
>> you can write:
>> test=# ALTER TABLE idnt ALTER i SET GENERATED ALWAYS INCREMENT BY 4;
>> test=# ALTER TABLE idnt ALTER n1 SET GENERATED ALWAYS INCREMENT BY 4;
>> -- which sets identity constraint - see p.5
>>
>> which is very similar to your extended syntax:
>> test=# ALTER TABLE idnt ALTER n1 ADD GENERATED ALWAYS AS IDENTITY
>> (INCREMENT BY 4);
>
> See under 5 that that is not correct.
>
>> 9. The command fails:
>> test=# ALTER TABLE idnt ALTER n2 ADD GENERATED ALWAYS AS IDENTITY;
>> ERROR: column "n2" of relation "idnt" must be declared NOT NULL
>> before identity can be added
>>
>> whereas the Spec does not contains a requirement for a column to be a
>> NOT NULLABLE.
>> You can implicitly set a column as NOT NULL (as the "serial" macros
>> does), but not require it later.
>
> The spec requires that an identity column is NOT NULL.
>
>> Moreover you can get a NULLABLE identity column by:
>> test=# ALTER TABLE idnt ALTER COLUMN i DROP NOT NULL;
>> ALTER TABLE
>> test=# \d idnt
>> Table "public.idnt"
>> Column | Type | Collation | Nullable | Default
>> --------+---------+-----------+----------+------------------------------
>> i | integer | | | generated always as identity
>> n1 | integer | | not null |
>> n2 | integer | | |
>
> Fixed by disallowing that command (similar to dropping NOT NULL from a
> primary key, for example).
>
>> 10. Inherited tables are not allowed at all
>
> fixed
>
>
>> 11. The last CF added partitioned tables which are similar to
>> inherited, but slightly different.
>
> fixed
>
>> 12. You've introduced a new parameter for a sequence declaration
>> ("SEQUENCE NAME") which is not mentioned in the docs and not
>> supported:
>> a2=# CREATE SEQUENCE s SEQUENCE NAME s;
>> ERROR: option "sequence_name" not recognized
>>
>> I think the error should look as:
>> a2=# CREATE SEQUENCE s SEQUENCE__ NAME s;
>> ERROR: syntax error at or near "SEQUENCE__"
>> LINE 1: CREATE SEQUENCE s SEQUENCE__ NAME s;
>> ^
>
> Fixed by improving message.
>
>> 13. Also strange error message:
>> test=# CREATE SCHEMA sch;
>> test=# ALTER TABLE idnt ALTER COLUMN n1 ADD GENERATED BY DEFAULT AS
>> IDENTITY (SEQUENCE NAME sch.s START 1);
>> ERROR: relation "sch.idnt" does not exist
>>
>> But if a table sch.idnt exists, it leads to a silent inconsistency:
>> test=# CREATE TABLE sch.idnt(n1 int);
>> CREATE TABLE
>> test=# ALTER TABLE idnt ALTER COLUMN n1 ADD GENERATED BY DEFAULT AS
>> IDENTITY (SEQUENCE NAME sch.s START 1);
>> ALTER TABLE
>> test=# DROP TABLE sch.idnt;
>> ERROR: could not find tuple for attrdef 0
>
> I can't reproduce this. Can you give me a complete command sequence
> from scratch?
>
>> 14. Wrong hint message:
>> test=# DROP SEQUENCE idnt_i_seq;
>> ERROR: cannot drop sequence idnt_i_seq because default for table idnt
>> column i requires it
>> HINT: You can drop default for table idnt column i instead.
>>
>> but according to DDL there is no "default" which can be dropped:
>> test=# ALTER TABLE idnt ALTER COLUMN i DROP DEFAULT;
>> ERROR: column "i" of relation "idnt" is an identity column
>
> fixed (added errhint)
>
>> 15. And finally. The command:
>> test=# CREATE TABLE t(i int GENERATED BY DEFAULT AS IDENTITY (SEQUENCE NAME s));
>>
>> leads to a core dump.
>> It happens when no sequence parameter (like "START") is set.
>
> fixed
>
>
>
>

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-03-15 17:20:12 Re: logical replication access control patches
Previous Message Robert Haas 2017-03-15 17:01:30 Re: [POC] hash partitioning