Re: Fast AT ADD COLUMN with DEFAULTs

From: Pantelis Theodosiou <ypercube(at)gmail(dot)com>
To: Serge Rielau <serge(at)rielau(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Fast AT ADD COLUMN with DEFAULTs
Date: 2016-10-05 23:17:04
Message-ID: CAE3TBxwZkVMQnuwXEAEJp0dQ08UqboKb_TtheX57Vb+uOW-U8g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 6, 2016 at 12:05 AM, Serge Rielau <serge(at)rielau(dot)com> wrote:

>
> via Newton Mail
> <https://cloudmagic.com/k/d/mailapp?ct=dx&cv=9.0.74&pv=10.11.6&source=email_footer_2>
>
> On Wed, Oct 5, 2016 at 3:58 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2016-10-05 15:44:56 -0700, Jeff Janes wrote:
> >>> No, "a second “exist default"" was mentioned, i.e. it is an additional
> >>> column in a system table (pg_attribute) as default column values of
> >>> the "pre-alter" era. It solves changing of the default expression of
> >>> the same column later.
>
> > Don't think that actually solves the issue. The default might be unset
> > for a while, for example. Essentially you'd need to be able to associate
> > arbitrary number of default values with an arbitrary set of rows.
>
> I think it does work, as long as the "exists default" is immutable.
> (For safety, personally, I'd restrict it to be a verbatim constant.)
> The point is that you apply that when you are reading a row that has
> so few columns that it must predate the original ALTER TABLE ADD COLUMN.
> Subsequent ADD/DROP/SET DEFAULT affect the "active default" and hence
> insertions that happen after them, but they don't affect the
> interpretation of old rows. And of course all rows inserted after the
> ADD COLUMN contain explicit values of the column, so their meaning is
> unaffected in any case.
>
>
> You do need two defaults associated with a column to make this work.
> The "exists default" never changes after the column is added. But
> in principle, the "exists default" just replaces the NULL value that
> we implicitly insert now in such cases.
>
> Explained so much better than I could do it :-)
>
> I want to point out as a minor “extension” that there is no need for the
> default to be immutable. It is merely required that the default is evaluate
> at time of ADD COLUMN
> and then we remember the actual value for the exist default, rather than
> the parsed expression as we do for the “current” default.
>
>
> Need a better name for the concept, since evidently this name isn't
> conveying the idea.
>
> By all means. Got anything in mind?
>
>
>
For comparison, SQL Server's implementation. They have a similar feature
(in their Enterprise only edition).
From https://msdn.microsoft.com/en-us/library/ms190273.aspx :

Adding NOT NULL Columns as an Online Operation

Starting with SQL Server 2012 Enterprise Edition, adding a NOT NULL column
with a default value is an online operation when the default value is
a *runtime
constant*. This means that the operation is completed almost
instantaneously regardless of the number of rows in the table. This is
because the existing rows in the table are not updated during the
operation; instead, the default value is stored only in the metadata of the
table and the value is looked up as needed in queries that access these
rows. This behavior is automatic; no additional syntax is required to
implement the online operation beyond the ADD COLUMN syntax. A runtime
constant is an expression that produces the same value at runtime for each
row in the table regardless of its determinism. For example, the constant
expression "My temporary data", or the system function GETUTCDATETIME() are
runtime constants. In contrast, the functions NEWID() or NEWSEQUENTIALID()
are not runtime constants because a unique value is produced for each row
in the table. Adding a NOT NULL column with a default value that is not a
runtime constant is always performed offline and an exclusive (SCH-M) lock
is acquired for the duration of the operation.

While the existing rows reference the value stored in metadata, the default
value is stored on the row for any new rows that are inserted and do not
specify another value for the column. The default value stored in metadata
is moved to an existing row when the row is updated (even if the actual
column is not specified in the UPDATE statement), or if the table or
clustered index is rebuilt.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Vitaly Burovoy 2016-10-05 23:19:33 Re: Fast AT ADD COLUMN with DEFAULTs
Previous Message Serge Rielau 2016-10-05 23:05:22 Re: Fast AT ADD COLUMN with DEFAULTs