Re: identity columns

From: Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: identity columns
Date: 2017-01-05 00:34:15
Message-ID: CAKOSWNmsci7jZzpoqWAH_QcVjQzwOMf04o91Jf5WVTkHbDMy2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, Peter,

I apologize for a silence since the last CF.
I've tested your last patch and have several nitpickings:

1. The fact COPY ignores GENERATED ALWAYS constraint (treats as
GENERATED BY DEFAULT) should be mentioned as well as rules.

2. Usually error message for identical columns (with LIKE clause)
looks like this:
test=# CREATE TABLE def(i serial, n1 int NOT NULL, n2 int);
CREATE TABLE
test=# CREATE TABLE test(i serial, LIKE def INCLUDING ALL);
ERROR: column "i" specified more than once

but for generated columns it is disorienting enough:
test=# CREATE TABLE idnt(i int GENERATED BY DEFAULT AS IDENTITY);
CREATE TABLE
test=# CREATE TABLE test(i int GENERATED BY DEFAULT AS IDENTITY, LIKE
idnt INCLUDING ALL);
ERROR: relation "test_i_seq" already exists

3. Strange error (not about absence of a column; but see pp.5 and 8):
test=# ALTER TABLE idnt ALTER COLUMN o ADD GENERATED ALWAYS AS IDENTITY;
ERROR: identity column type must be smallint, integer, or bigint

4. Altering type leads to an error:
test=# ALTER TABLE idnt ALTER COLUMN i TYPE bigint;
ERROR: cannot alter data type of identity column "i"

it is understandable for non-integers. But why do you forbid changing
to the supported types?

5. Attempt to make a column be identity fails:
test=# ALTER TABLE idnt ALTER COLUMN n1 SET GENERATED BY DEFAULT;
ERROR: column "n1" of relation "idnt" is not an identity column

As I understand from the Spec, chapter 11.20 General Rule 2)b) says
about the final state of a column without mentioning of the initial
state.
Therefore even if the column has the initial state "not generated",
after the command it changes the state to either "GENERATED ALWAYS" or
"GENERATED BY DEFAULT".

6. The docs mention a syntax:
ALTER [ COLUMN ] <replaceable
class="PARAMETER">column_name</replaceable> { SET GENERATED { ALWAYS |
BY DEFAULT } | SET <replaceable>sequence_option</replaceable> | RESET
} [...]

but the command fails:
test=# ALTER TABLE idnt ALTER COLUMN i RESET;
ERROR: syntax error at or near ";"
LINE 1: ALTER TABLE idnt ALTER COLUMN i RESET;

7. (the same line of the doc)
usually ellipsis appears in inside parenthesis with clauses can be
repeated, in other words it should be written as:
ALTER <skipped> column_name</replaceable> ( { SET GENERATED { ALWAYS |
BY DEFAULT } | <skipped> } [...] )

8. To avoid unnecessary syntax "ALTER COLUMN ... ADD GENERATED ..."
and use existing syntax "ALTER COLUMN ... SET GENERATED ..." you can
just make the "SET" word be optional as a PG's extension. I.e. instead
of:
test=# ALTER TABLE idnt ALTER i SET GENERATED ALWAYS SET INCREMENT BY 4;

you can write:
test=# ALTER TABLE idnt ALTER i SET GENERATED ALWAYS INCREMENT BY 4;
test=# ALTER TABLE idnt ALTER n1 SET GENERATED ALWAYS INCREMENT BY 4;
-- which sets identity constraint - see p.5

which is very similar to your extended syntax:
test=# ALTER TABLE idnt ALTER n1 ADD GENERATED ALWAYS AS IDENTITY
(INCREMENT BY 4);

9. The command fails:
test=# ALTER TABLE idnt ALTER n2 ADD GENERATED ALWAYS AS IDENTITY;
ERROR: column "n2" of relation "idnt" must be declared NOT NULL
before identity can be added

whereas the Spec does not contains a requirement for a column to be a
NOT NULLABLE.
You can implicitly set a column as NOT NULL (as the "serial" macros
does), but not require it later.
Moreover you can get a NULLABLE identity column by:
test=# ALTER TABLE idnt ALTER COLUMN i DROP NOT NULL;
ALTER TABLE
test=# \d idnt
Table "public.idnt"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+------------------------------
i | integer | | | generated always as identity
n1 | integer | | not null |
n2 | integer | | |

10. Inherited tables are not allowed at all (even with explicit
identity columns):
test=# CREATE TABLE inh() INHERITS (idnt);
ERROR: cannot inherit from table with identity column
test=# CREATE TABLE inh(i int GENERATED BY DEFAULT AS IDENTITY) INHERITS (idnt);
ERROR: cannot inherit from table with identity column

What's the point of forbidding tables inherited from one with identity column?
It makes identity columns be useless for big tables.
Just declare identity constraint as non-inherited (as well as some
other constraints (which identity is) - PK, FK, UNIQUE)...
Also see p.11

11. The last CF added partitioned tables which are similar to
inherited, but slightly different.
Slightly changed example from [1] (identity column added):
test=# CREATE TABLE measurement (
test(# i int GENERATED BY DEFAULT AS IDENTITY,
test(# logdate date not null,
test(# peaktemp int,
test(# unitsales int
test(# ) PARTITION BY RANGE (logdate);
CREATE TABLE
test=# CREATE TABLE measurement_y2016m07
test-# PARTITION OF measurement (
test(# unitsales DEFAULT 0
test(# ) FOR VALUES FROM ('2016-07-01') TO ('2016-08-01');
ERROR: cannot inherit from table with identity column
test=# INSERT INTO measurement(logdate) VALUES('2016-07-10');
ERROR: no partition of relation "measurement" found for row
DETAIL: Failing row contains (1, 2016-07-10, null, null).

12. You've introduced a new parameter for a sequence declaration
("SEQUENCE NAME") which is not mentioned in the docs and not
supported:
a2=# CREATE SEQUENCE s SEQUENCE NAME s;
ERROR: option "sequence_name" not recognized

I think the error should look as:
a2=# CREATE SEQUENCE s SEQUENCE__ NAME s;
ERROR: syntax error at or near "SEQUENCE__"
LINE 1: CREATE SEQUENCE s SEQUENCE__ NAME s;
^

13. Also strange error message:
test=# CREATE SCHEMA sch;
test=# ALTER TABLE idnt ALTER COLUMN n1 ADD GENERATED BY DEFAULT AS
IDENTITY (SEQUENCE NAME sch.s START 1);
ERROR: relation "sch.idnt" does not exist

But if a table sch.idnt exists, it leads to a silent inconsistency:
test=# CREATE TABLE sch.idnt(n1 int);
CREATE TABLE
test=# ALTER TABLE idnt ALTER COLUMN n1 ADD GENERATED BY DEFAULT AS
IDENTITY (SEQUENCE NAME sch.s START 1);
ALTER TABLE
test=# DROP TABLE sch.idnt;
ERROR: could not find tuple for attrdef 0

Also dump/restore fails for it.

Note that by default sequences have different error message:
test=# CREATE SEQUENCE sch.s1 OWNED BY idnt.n1;
ERROR: sequence must be in same schema as table it is linked to

14. Wrong hint message:
test=# DROP SEQUENCE idnt_i_seq;
ERROR: cannot drop sequence idnt_i_seq because default for table idnt
column i requires it
HINT: You can drop default for table idnt column i instead.

but according to DDL there is no "default" which can be dropped:
test=# ALTER TABLE idnt ALTER COLUMN i DROP DEFAULT;
ERROR: column "i" of relation "idnt" is an identity column

15. And finally. The command:
test=# CREATE TABLE t(i int GENERATED BY DEFAULT AS IDENTITY (SEQUENCE NAME s));

leads to a core dump.
It happens when no sequence parameter (like "START") is set.

[1]https://www.postgresql.org/docs/devel/static/sql-createtable.html

--
Best regards,
Vitaly Burovoy

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2017-01-05 00:35:11 Re: proposal: session server side variables
Previous Message Tatsuo Ishii 2017-01-05 00:25:57 Questionable tag usage