Re: Dynamic data model, locks and performance

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Pierre Thibault <pierre(dot)thibault1(at)gmail(dot)com>, GENERAL <pgsql-general(at)postgresql(dot)org>
Subject: Re: Dynamic data model, locks and performance
Date: 2010-07-29 15:22:09
Message-ID: 4C519CA1.4070006@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 29/07/10 22:36, Pierre Thibault wrote:

> Why so? This is something expected by a database used in a constant
> integration environment. Maybe I did not expressed myself very well. Users
> are not changing their models all the time. They create new models which
> mean create new tables and from time to time they add new fields to existing
> tables just like developers do when they update a schema for new application
> functionalities.

Ah, OK. I read your initial mail as implying much more frequent changes,
especially combined with "millions" of tables.

> In my last job, I was working with constant database integration. We were
> created DDL scripts to add new fields and tables live on a SQLSever database
> in production. Most scripts were executed during to night to reduce the
> impact on the db. In practice, this may means that a running query will have
> to wait maybe half a second to get a lock because of such update. Usually,
> not really more than that. Can I expect similar performance with
> Postgressql?

With a few caveats, yes.

The main one: For columns you want to be NOT NULL, you should add new
columns as nullable. Then UPDATE the new column to hold any desired
default, before issuing an

ALTER TABLE ... ALTER COLUMN ... SET NOT NULL.

That's because an ALTER TABLE to add a nullable column doesn't have to
rewrite the table. An ALTER TABLE to add a NOT NULL column has to
immediately rewrite the table to add the default to every record. This
is slow, and during this operation ALTER TABLE holds an exclusive lock.

By contrast, if you ALTER TABLE to add a nullable column (brief
exclusive lock), UPDATE (long much lower-order lock that doesn't
conflict with SELECT, INSERT, or UPDATE to unaffected rows) and then
finally ALTER TABLE again to add the constraint (a further brief lock)
you have greatly reduced lock times.

>> Really, I'd be researching dynamic schema databases, object stores,
>> key/value set stores, etc. Such things are outside my personal
>> experience, though, and I can't really advise you much on technologies.
>> Beware of the latest No-SQL fad platforms, though; you might land up
>> being better off with something older and more stable even if it's less
>> fashionable than CouchDB or whatever is currently "in" today.
>>
> Maybe, but, as I said, using a SQL database with the one table by class
> hierarchy strategy seems to be the way to go for me. I'll take a lot a these
> options too.

Just beware of huge table counts. People have reported issues on the
list with truly huge numbers of tables. It's not something that turns up
in most regular relational designs, and there are a few things in Pg
(like, AFAIK, autovacuum's scanning of tables to vacuum) that scale
linearly with table counts.

I'm sure it's workable, it just might not be ideal.

--
Craig Ringer

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joshua D. Drake 2010-07-29 15:47:53 Re: Which CMS/Ecommerce/Shopping cart ?
Previous Message Merlin Moncure 2010-07-29 14:06:58 Re: Danger of idiomatic plpgsql loop for merging data