Re: Lazy constraints / defaults

From: Michał Zaborowski <michal(dot)zaborowski(at)gmail(dot)com>
To: "Martijn van Oosterhout" <kleptog(at)svana(dot)org>
Cc: "Dawid Kuroczko" <qnex42(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Lazy constraints / defaults
Date: 2008-03-10 12:31:20
Message-ID: e2289d9e0803100531g1dc3d99fob93bd8a4b55b09d8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,
Let me try again...

Here is simple example.
To do:
alter table users add column aaaa integer not null default 0;
Table is big, updated, referenced etc (big - means that alter lock the
table long enought
to kill the system). Note that it is not my design - I have to do
alter the table... but

Solution:
1. alter table users add column aaaa integer; -- short lock
2. alter table users alter column aaaa set default 0;
3. update users set aaaa = 0 where users.id between a and b; --
preparing for constraint - in small chunks
4. update users set aaaa = 0 where aaaa is null;
5. alter table users alter column aaaa set not null;

Works, but I hate it.
I would like to do:
alter table users add column aaaa integer not null default 0;
- with something like "concurrently" or "no check" - and let PG to do
the job. In that case I expect
PG to update meta data, and for updated rows set default - in other
case they can not satisfy check.
It would be great that step 3 has been done, but I understand it can
be a problem. I see that breaking
operation integrity is needed. I have a script with some parameters
that do it almost automatically.
What I want to point is that PG becomes more and more popular. People
use it for bigger and bigger
databases. In that case typical alter can be a PITA. If something can
be done by DB, I would like it
to be done in this way - as safer and faster way. In this particular
case - I expect DB to take care about
new and updated data. Correcting older rows is nice to have. That
parameter can be stored to inform
everybody - that some data may not satisfy check or null can be found
instead of default.

Look at commit_delay / commit_siblings. System is faster, but if
something go wrong - something (else)
will be lost. It is DBA decision what to choose. If DB keeps all
information in pg_class, pg_attribute
everyone can get how the changes ware made.

--
Regards,
Michał Zaborowski (TeXXaS)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-03-10 12:46:34 Re: Maximum statistics target
Previous Message Guillaume Smet 2008-03-10 12:28:59 Re: Maximum statistics target