Quick Links

Re: Fast AT ADD COLUMN with DEFAULTs

From:	Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Serge Rielau <serge(at)rielau(dot)com>
Cc:	Vitaly Burovoy <vitaly(dot)burovoy(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Fast AT ADD COLUMN with DEFAULTs
Date:	2016-10-10 03:24:27
Message-ID:	15cce493-37e1-88ad-e02a-8ed9bff22a34@BlueTreble.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 10/6/16 11:01 AM, Tom Lane wrote:
> Something based on missing_value/absent_value could work for me too.
> If we name it something involving "default", that definitely increases
> the possibility for confusion with the regular user-settable default.
>
> Also worth thinking about here is that the regular default expression
> affects what will be put into future inserted rows, whereas this thing
> affects the interpretation of past rows. So it's really quite a different
> animal. That's kind of leading me away from calling it creation_default.

There's actually another use case here that's potentially extremely
valuable for warehousing and other "big data": compact representation of
a default value.

The idea here is that if you have a specific value for a field that
makes up a very large portion of your data, you'd really like to be able
to represent that value *in each row* with something like a bit (such as
we currently do for NULLs).

What I'd expect to see in the real world (once users figure this hack
out) would be:

CREATE TABLE ...(
...
-- skip field_a so we can handle all it's common values
);

INSERT INTO
...
SELECT
...
WHERE field_a IS NOT DISTINCT FROM 'really common value'
;

ALTER TABLE
ADD field_a ... NOT NULL DEFAULT 'really common value'
;

-- load rest of the data

That would have the effect of storing all those really common values
with a single bit.

What we'd ultimately want is some kind of catalog versioning so that we
knew what was in place when each tuple was created; that would allow for
changing these things over time without forcing a full rewrite.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

In response to

Re: Fast AT ADD COLUMN with DEFAULTs at 2016-10-06 16:01:12 from Tom Lane

Responses

Re: Fast AT ADD COLUMN with DEFAULTs at 2016-10-10 04:02:12 from Corey Huinker

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jim Nasby	2016-10-10 03:26:07	Re: proposal: psql \setfileref
Previous Message	Jim Nasby	2016-10-10 03:00:34	Re: pg_stat_statements and non default search_path