Skip site navigation (1) Skip section navigation (2)

Re: ??: Postgresql update op is very very slow

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Holger Hoffstaette <holger(at)wizards(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: ??: Postgresql update op is very very slow
Date: 2008-06-26 14:53:21
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance
Holger Hoffstaette wrote:
> Hi -
> I have been following this thread and find some of the recommendations
> really surprising. I understand that MVCC necessarily creates overhead,
> in-place updates would not be safe against crashes etc. but have a hard
> time believing that this is such a huge problem for RDBMS in 2008. How do
> large databases treat mass updates? AFAIK both DB2 and Oracle use MVCC
> (maybe a different kind?) as well, but I cannot believe that large updates
> still pose such big problems.
> Are there no options (algorithms) for adaptively choosing different
> update strategies that do not incur the full MVCC overhead?

My opinion:

Any system that provides cheap UPDATE operations is either not ACID 
compliant, or is not designed for highly concurrent access, possibly 
both. By ACID compliant I mean that there both the OLD and NEW need to 
take space on the hard disk in order to guarantee that if a failure 
occurs in the middle of the transaction, one can select only the OLD 
versions for future transactions, or if it fails after the end fo the 
transaction, one can select only the NEW versions for future 
transactions. If both must be on disk, it follows that updates are 
expensive. Even with Oracle rollback segments - the rollback segments 
need to be written. Perhaps they will be more sequential, and able to be 
written more efficiently, but the data still needs to be written. The 
other option is to make sure that only one person is doing updates at a 
time, and in this case it becomes possible (although not necessarily 
safe unless one implements the ACID compliant behaviour described in the 
previous point) for one operation to complete before the next begins.

The HOT changes introduced recently into PostgreSQL should reduce the 
cost of updates in many cases (but not all - I imagine that updating ALL 
rows is still expensive).

There is a third system I can think of, but I think it's more 
theoretical than practical. That is, remember the list of changes to 
each row/column and "replay" them on query. The database isn't ever 
stored in a built state, but is only kept as pointers that allow any 
part of the table to be re-built on access. The UPDATE statement could 
be recorded cheaply, but queries against the UPDATE statement might be 
very expensive. :-)


Mark Mielke <mark(at)mielke(dot)cc>

In response to

pgsql-performance by date

Next:From: Tom LaneDate: 2008-06-26 15:04:21
Subject: Re: 答复: [PERFORM] Postgresql update op is very very slow
Previous:From: Andrew SullivanDate: 2008-06-26 13:55:01
Subject: Re: ??: Postgresql update op is very very slow

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group