Re: ??: Postgresql update op is very very slow

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Holger Hoffstaette <holger(at)wizards(dot)de>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: ??: Postgresql update op is very very slow
Date: 2008-06-26 14:53:21
Message-ID: 4863AD61.3070909@mark.mielke.cc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Holger Hoffstaette wrote:
> Hi -
>
> I have been following this thread and find some of the recommendations
> really surprising. I understand that MVCC necessarily creates overhead,
> in-place updates would not be safe against crashes etc. but have a hard
> time believing that this is such a huge problem for RDBMS in 2008. How do
> large databases treat mass updates? AFAIK both DB2 and Oracle use MVCC
> (maybe a different kind?) as well, but I cannot believe that large updates
> still pose such big problems.
> Are there no options (algorithms) for adaptively choosing different
> update strategies that do not incur the full MVCC overhead?
>

My opinion:

Any system that provides cheap UPDATE operations is either not ACID
compliant, or is not designed for highly concurrent access, possibly
both. By ACID compliant I mean that there both the OLD and NEW need to
take space on the hard disk in order to guarantee that if a failure
occurs in the middle of the transaction, one can select only the OLD
versions for future transactions, or if it fails after the end fo the
transaction, one can select only the NEW versions for future
transactions. If both must be on disk, it follows that updates are
expensive. Even with Oracle rollback segments - the rollback segments
need to be written. Perhaps they will be more sequential, and able to be
written more efficiently, but the data still needs to be written. The
other option is to make sure that only one person is doing updates at a
time, and in this case it becomes possible (although not necessarily
safe unless one implements the ACID compliant behaviour described in the
previous point) for one operation to complete before the next begins.

The HOT changes introduced recently into PostgreSQL should reduce the
cost of updates in many cases (but not all - I imagine that updating ALL
rows is still expensive).

There is a third system I can think of, but I think it's more
theoretical than practical. That is, remember the list of changes to
each row/column and "replay" them on query. The database isn't ever
stored in a built state, but is only kept as pointers that allow any
part of the table to be re-built on access. The UPDATE statement could
be recorded cheaply, but queries against the UPDATE statement might be
very expensive. :-)

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2008-06-26 15:04:21 Re: 答复: [PERFORM] Postgresql update op is very very slow
Previous Message Andrew Sullivan 2008-06-26 13:55:01 Re: ??: Postgresql update op is very very slow