Re: MVCC performance issue

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Kyriacos Kyriacou <kyriacosk(at)prime-tel(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: MVCC performance issue
Date: 2010-11-13 05:53:27
Message-ID: 4CDE27D7.7070706@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 11/12/2010 02:25 AM, Kyriacos Kyriacou wrote:

> The
> result is to have huge fragmentation on table space, unnecessary updates
> in all affected indexes, unnecessary costly I/O operations, poor
> performance on SELECT that retrieves big record sets (i.e. reports etc)
> and slower updates.

Yep. It's all about trade-offs. For some workloads the in-table MVCC
storage setup works pretty darn poorly, but for most it seems to work
quite well.

There are various other methods of implementing relational storage with
ACID properties. You can exclude all other transactions while making a
change to a table, ensuring that nobody else can see "old" or "new" rows
so there's no need to keep them around. You can use an out-of-line redo
log (a-la Oracle). Many other methods exist, too.

They all have advantages and disadvantages for different workloads. It's
far from trivial to mix multiple schemes within a single database, so
mixing and matching schemes for different parts of your DB isn't
generally practical.

> 1) When a raw UPDATE is performed, store all "new raw versions" either
> in separate temporary table space
> or in a reserved space at the end of each table (can be allocated
> dynamically) etc

OK, so you want a redo log a-la Oracle?

> 2) Any SELECT queries within the same session will be again accessing
> the new version of the row
> 3) Any SELECT queries from other users will still be accessing the old
> version

... and incurring horrible random I/O penalties if the redo log doesn't
fit in RAM. Again, a-la Oracle.

Even read-only transactions have to hit the undo log if there's an
update in progress, because rows they need may have been moved out to
the undo log as they're updated in the main table storage.

[snip description]

> I understand that my suggestion seems to be too simplified and also that
> there are many implementation details and difficulties that I am not
> aware.

It sounds like you're describing Oracle-style MVCC, using redo logs.

http://blogs.sybase.com/database/2009/04/mvcc-dispelling-some-oracle-fudunderstanding-the-cost/

http://en.wikipedia.org/wiki/Multiversion_concurrency_control

Oracle's MVCC approach has its own costs. Like Pg's, those costs
increase with update/delete frequency. Instead of table bloat, Oracle
suffers from redo log growth (or redo log size management issues).
Instead of increased table scan costs from dead rows, Oracle suffers
from random I/O costs as it looks up the out-of-line redo log for old
rows. Instead of long-running writer transactions causing table bloat,
Oracle can have problems with long-running reader transactions aborting
when the redo log runs out of space.

Personally, I don't know enough to know which is "better". I suspect
they're just different, with different trade-offs. If redo logs allow
you to do without write-ahead logging, that'd be interesting - but
then, the WAL is useful for all sorts of replication options, and the
use of linear WALs means that write ordering in the tables doesn't need
to be as strict, which has performance advantages.

--
Craig Ringer

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Rich 2010-11-13 07:05:45 Re: MVCC performance issue
Previous Message Cédric Villemain 2010-11-13 05:44:25 Re: anti-join chosen even when slower than old plan