Re: [HACKERS] Behavior of GENERATED columns per SQL2003

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Zoltan Boszormenyi <zb(at)cybertec(dot)at>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Behavior of GENERATED columns per SQL2003
Date: 2007-05-15 02:36:14
Message-ID: 200705150236.l4F2aEX09902@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches


URL added to TODO item.

---------------------------------------------------------------------------

Zoltan Boszormenyi wrote:
> Forwarded to -patches because of the attachment.
>
> -------- Eredeti ?zenet --------
> T?rgy: Re: [HACKERS] Behavior of GENERATED columns per SQL2003
> D?tum: Tue, 08 May 2007 12:38:32 +0200
> Felad?: Zoltan Boszormenyi <zb(at)cybertec(dot)at>
> C?mzett: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> CC: pgsql-hackers(at)postgreSQL(dot)org
> Hivatkoz?sok: <18812(dot)1178572575(at)sss(dot)pgh(dot)pa(dot)us>
>
>
>
> Tom Lane ?rta:
> > I've been studying the SQL spec in a bit more detail and I'm suddenly
> > thinking that we've got the behavior all wrong in the current
> > GENERATED/IDENTITY patch. In particular, it looks to me like we've
> > been implementing GENERATED ALWAYS AS (expr) according to the rules
> > that the spec in fact lays down only for GENERATED ALWAYS AS IDENTITY.
> > You'd think the two constructs would be pretty closely related but
> > the behaviors specified by the spec are light-years apart. If you
> > look closely, a "generated column" in the meaning of section 4.14.8
> > is one that has GENERATED ALWAYS AS (expr), and identity columns are
> > *not* in this class.
> >
>
> True.
>
> > It looks to me like the behavior the spec intends for a generated column
> > is actually that it can be implemented as a "virtual column" occupying
> > no space on disk and instead computed on-the-fly when retrieved.
> >
>
> I think the keyword here is 'can be'. But having it stored gives a nice
> speedup in SELECTs vs. no speedup if you generate it on the fly.
>
> > Identity columns can have their values overridden by the
> > user (it's a little harder if GENERATED ALWAYS, but still possible),
> > and they don't change during an UPDATE unless specifically forced to.
> >
>
> Yes, I implemented it this way.
>
> > In contrast, generated columns cannot be overridden by
> > assignment, and are recomputed from their base columns during updates.
> >
>
> I see, I incorrectly made OVERRIDING SYSTEM VALUE
> to have an effect on generated columns. Now I don't need
> to pass around the list of the modified fields from
> rewriteTargetList(), I simply have to blindly update all of them
> both in INSERT and UPDATE. Fixed. And now I don't have to
> invent something to discover what fields were modified by
> BEFORE triggers.
>
> > This realization also explains the following, otherwise rather strange,
> > facts:
> >
> > * There is no GENERATED BY DEFAULT AS (expr) in the spec.
> >
>
> Yes, and because it already exists and called DEFAULT.
>
> > * GENERATED expressions are specifically disallowed from containing
> > subselects, calling functions that access any SQL-data, or being
> > nondeterministic; hence their values depend solely on the regular
> > columns in the same row.
> >
>
> The sanity checks for the DEFAULT expression already
> handle subselect in PostgreSQL:
>
> db=# create table t1 (id float generated always as identity, t text,
> g text generated always as (case when t is null then '' else t end ||
> (select max(id) from t1));
> NOTICE: CREATE TABLE will create implicit sequence "t1_id_seq" for
> serial column "t1.id"
> ERROR: cannot use subquery in default expression
>
> But how do you check a function in general?
> Especially when it's not written in plpgsql?
> E.g. a C function can use SPI and SELECTs.
>
> Also, SQL:2003 doesn't allow any functions for the DEFAULT clause
> besides timestamp functions, e.g. NOW().
> But PostgreSQL already allows non-IMMUTABLE functions used in
> DEFAULT clauses. Would you want to restrict it?
>
> > * While identity columns are updated (if needed) before execution of
> > BEFORE triggers, generated columns are updated after BEFORE triggers;
> > hence a BEFORE trigger can override the value in one case and not the
> > other. (The current patch gets this wrong, btw.)
> >
>
> Where do you see that? Which version were you looking at?
> Identity columns are generated in rewriteTargetList(), way before any
> triggers.
> Generated column are computed in ExecInsert() and ExecUpdate(),
> certainly after applying BEFORE triggers in both cases and
> before CheckConstraint(). There was one bug in the UPDATE case, though,
> as UPDATE loops if it couldn't do its job in one go because of
> serialization.
> I fixed this.
>
> > * Generated columns are forcibly updated when their base columns change
> > as a result of FK constraints (such as ON UPDATE CASCADE).
> >
>
> Isn't ExecUpdate() called on the referring table's row during such FK event?
> It seems yes to me:
>
> db=# create table t1 (id serial generated always as identity primary key,
> t text generated always as (id || '_1'));
> NOTICE: CREATE TABLE will create implicit sequence "t1_id_seq" for
> serial column "t1.id"
> NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t1_pkey"
> for table "t1"
> CREATE TABLE
> db=# create table t2 (id serial generated always as identity primary key,
> id_t1 integer not null references t1(id) on update cascade,
> g text generated always as (id_t1 || '_1'));
> NOTICE: CREATE TABLE will create implicit sequence "t2_id_seq" for
> serial column "t2.id"
> NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "t2_pkey"
> for table "t2"
> CREATE TABLE
> db=# insert into t1 values (default, 'a1');
> INSERT 0 1
> db=# insert into t1 values (default, 'b2');
> INSERT 0 1
> db=# insert into t1 values (default, 'c3');
> INSERT 0 1
> db=# insert into t1 values (default, 'd4');
> INSERT 0 1
> db=# select * from t1;
> id | t
> ----+-----
> 1 | 1_1
> 2 | 2_1
> 3 | 3_1
> 4 | 4_1
> (4 rows)
>
> db=# insert into t2 values (default, 2);
> INSERT 0 1
> db=# select * from t2;
> id | id_t1 | g
> ----+-------+-----
> 1 | 2 | 2_1
> (1 row)
>
> db=# update t1 set id = 77 where id = 2;
> UPDATE 1
> db=# select * from t1;
> id | t
> ----+------
> 1 | 1_1
> 3 | 3_1
> 4 | 4_1
> 77 | 77_1 (<--- This was buggy, the generated value didn't change here,
> as an oversight during the rewrite to
> make the generation behaviour
> a property of the DEFAULT clause. Now
> fixed.)
> (4 rows)
>
> zozo=# select * from t2;
> id | id_t1 | g
> ----+-------+------
> 1 | 77 | 77_1 (<---- It worked this way before, too.)
> (1 row)
>
> Anyway, a new patch is necessary it seems,
> so it should apply cleanly to new CVS and have the
> above bugs fixed. Attached.
>
> > It looks to me like a BEFORE trigger is actually the only place that can
> > (transiently) see values of a generated column that are different from
> > the result of applying the generation expression on the rest of the row.
> > It's unclear whether that's intentional or an oversight.
> >
>
> I thought it was intentional. BEFORE triggers can change the base columns
> before storing them, but the generated columns should be consistent
> with their generation expressions and their base columns after executing
> the BEFORE triggers. If the triggers should see generated values
> then another recomputation is necessary to refresh the generated
> columns AND to make them unchangable by the triggers.
> It seems to be a waste of cycles to me. It should be documented, though.
>
> > Is anyone familiar with a database that implements SQL-spec generated
> > columns? Do they actually store the columns?
> >
>
> David Fuhry answered this question with the advertised intention.
>
> > regards, tom lane
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: Have you checked our extensive FAQ?
> >
> > http://www.postgresql.org/docs/faq
> >
> >
>
> Best regards,
> Zolt?n B?sz?rm?nyi
>
> --
> ----------------------------------
> Zolt?n B?sz?rm?nyi
> Cybertec Geschwinde & Sch?nig GmbH
> http://www.postgresql.at/
>
>
>
>
> --
> ----------------------------------
> Zolt?n B?sz?rm?nyi
> Cybertec Geschwinde & Sch?nig GmbH
> http://www.postgresql.at/
>

[ application/x-tar is not supported, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Browse pgsql-patches by date

  From Date Subject
Next Message Alvaro Herrera 2007-05-15 02:44:51 Re: [PATCHES] OS/X startup scripts
Previous Message Mark Kirkwood 2007-05-15 01:54:56 Re: [PATCHES] OS/X startup scripts