Re: MVCC performance issue

From: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To: Kyriacos Kyriacou <kyriacosk(at)prime-tel(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: MVCC performance issue
Date: 2010-11-12 17:39:55
Message-ID: AANLkTim_RdNHfrMoHk-KchU=54CZXUcvpYCRDo480Y_+@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

OK, in general you have to pay for MVCC one way or another. Many
databases make you pay as you go, so to speak, by storing all the MVCC
info in a log to be applied at some future date. Other databases you
can pay later, by storing all the MVCC in the table itself. Both have
similar costs, but one can punish you harshly if you let the MVCC data
stored in the database get out of hand.

8.3 and above are much more aggresive about autovacuuming, and on
bigger hardware you can make it VERY aggressive and keep the bloat out
while keeping up good throughput. On some servers I set up 4 or 6 or
8 autovacuum threads to keep up. If you were on another db you
might be adding more drives to make some other part faster.

For batch processing storing all MVCC data in the data store can be
problematic, but for more normal work where you're changing <1% of a
table all the time it can be very fast.

Some other databases will just run out of space to store transactions
and roll back everything you've done. PostgreSQL will gladly let you
shoot yourself in the foot with bloating the data store by running
successive whole table updates without vacuuming in between.

Bottom line, if your hardware can't keep up, it can't keep up. If
vacuum capsizes your IO and still can't keep up then you need more
disks and / or better storage subsystems. A 32 disk array with single
controller goes for ~$7 to $10k, and you can sustain some pretty
amazing thgouhput on that kind of IO subsystem.

If you're doing batch processing you can get a lot return by just
making sure you vacuum after each mass update. Especially if you are
on a single use machine with no cost delays for vacuum, running a
vacuum on a freshly worked table should be pretty fast.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2010-11-12 17:48:39 Re: MVCC performance issue
Previous Message Scott Marlowe 2010-11-12 17:27:12 Re: MVCC performance issue