Skip site navigation (1) Skip section navigation (2)

Re: MVCC performance issue

From: Thom Brown <thom(at)linux(dot)com>
To: Kyriacos Kyriacou <kyriacosk(at)prime-tel(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: MVCC performance issue
Date: 2010-11-12 13:54:57
Message-ID: AANLkTinhnXGo4SxdHN1b5twy2rguV+1An-XAeDtbnJFi@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
On 12 November 2010 13:47, Kyriacos Kyriacou <kyriacosk(at)prime-tel(dot)com>wrote:

> This is my first post in this mailing list and I would like to raise an
> issue that in my opinion is causing performance issues of PostgreSQL
> especially in a transaction processing environment. In my company we are
> using PostgreSQL for the last 8 year for our in-house developed billing
> system (telecom). The last few months we started considering moving to
> another RDBMS just because of this issue.
>
> After all these years, I believe that the biggest improvement that could
> be done and will boost overall performance especially for enterprise
> application will be to improve Multiversion Concurrency Control (MVCC)
> mechanism. In theory this seems to be improving performance for SELECT
> queries but on tables with very intensive and frequent updates, even
> that is not fully true because of the fragmentation of data caused by
> MVCC. I saw cases were a SELECT COUNT(*) on an empty (!!!) table (used
> as a buffer) took more than 40min to return a result! VACUUM is not a
> solution in my opinion even though after the introduction of autovacuum
> daemon situation got much better.
>
> PROBLEM DECRIPTION
> ------------------
> By definition of MVCC, when an UPDATE is performed, PostgreSQL creates a
> new copy of the row in a new location. Any SELECT queries within the
> same session are accessing the new version of the raw and all other
> queries from other users are still accessing the old version. When
> transaction is COMMIT PostgreSQL makes the a new version of the row as
> the "active" row and expires the old row that remains "dead" and then is
> up to VACUUM procedure to recover the "dead" rows space and make it
> available to the database engine. In case that transaction is ROLLBACK
> then space reserved for the new version of the row is released. The
> result is to have huge fragmentation on table space, unnecessary updates
> in all affected indexes, unnecessary costly I/O operations, poor
> performance on SELECT that retrieves big record sets (i.e. reports etc)
> and slower updates. As an example, consider updating the "live" balance
> of a customer for each phone call where the entire customer record has
> to be duplicated again and again upon each call just for modifying a
> numeric value!
>
> SUGGESTION
> --------------
> 1) When a raw UPDATE is performed, store all "new raw versions" either
> in separate temporary table space
>   or in a reserved space at the end of each table (can be allocated
> dynamically) etc
> 2) Any SELECT queries within the same session will be again accessing
> the new version of the row
> 3) Any SELECT queries from other users will still be accessing the old
> version
> 4) When UPDATE transaction is ROLLBACK just release the space used in
> new temporary location
> 5) When UPDATE transaction is COMMIT then try to LOCK the old version
> and overwrite it at the same physical location (NO FRAGMENTATION).
> 6) Similar mechanism can be applied on INSERTS and DELETES
> 7) In case that transaction was COMMIT, the temporary location can be
> either released or archived/cleaned on a pre-scheduled basis. This will
> possibly allow the introduction of a TRANSACTION LOG backup mechanism as
> a next step.
> 8) After that VACUUM will have to deal only with deletions!!!
>
>
> I understand that my suggestion seems to be too simplified and also that
> there are many implementation details and difficulties that I am not
> aware.
>
> I strongly believe that the outcome of the discussion regarding this
> issue will be helpful.
>
>
Which version of PostgreSQL are you basing this on?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

In response to

Responses

pgsql-performance by date

Next:From: Jon NelsonDate: 2010-11-12 15:33:21
Subject: Re: postmaster consuming /lots/ of memory with hash aggregate. why?
Previous:From: Kenneth MarshallDate: 2010-11-12 13:52:35
Subject: Re: MVCC performance issue

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group