Re: vacuum, performance, and MVCC

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Csaba Nagy <nagy(at)ecircle-ag(dot)com>
Cc: Mark Woodward <pgsql(at)mohawksoft(dot)com>, PFC <lists(at)peufeu(dot)com>, Chris Browne <cbbrowne(at)acm(dot)org>, postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-23 13:56:27
Message-ID: 20060623135627.GE8900@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jun 23, 2006 at 03:30:45PM +0200, Csaba Nagy wrote:
> > It sounds like you have a "big" problem and you need a "big" solution.
>
> Well, Postgres does a decent job as it is. The problem is under peek
> load, sometimes it gets bogged down and the usual things like vacuum
> will not help immediately. I think a few more features like the dead
> space map for quick vacuum and even something like the original post's
> proposition would make postgres fly under heavy load too...

I know there have a been a number of suggestions in the past to deal
with this thing. Some I don't remember being mentioned in this thread
are:

- Once a tuple has been determined to be invisible to everyone,
truncate it to just the header. This would probably work wonders for
frequently updated wide tables. However, this required keeping track of
the oldest active xact, I'm not sure how that works at the moment.

- Have the bgwriter do cleanup work before writing out a block. It
could probably do the truncation bit above, but totally removing old
tuples requires cleaning out the indexes too, which AIUI is the hard
part of vacuuming.

One totally whacked out idea I just thought of: Instead of just
truncating tuples when they're invisible, mark them "dying" and make
the data section store an array of CTIDs pointing to the index tuples
pointing to it. Lookups that find the tuple via an index could store
the CTID of the index tuple before continuing. If the bgwriter sees it
has a full set, it can efficiently remove the tuple straight away.

There are ofcourse drawbacks to this approach, you'd probably need
something like the half-dirty pages to avoid a large increase in write
load. If it's even beneficial at all given concurrency issues.

Still, VACUUM has gotten faster in CVS so the issues are slowly being
addressed...

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-06-23 13:58:42 Re: [CORE] GPL Source and Copyright Questions
Previous Message John DeSoi 2006-06-23 13:56:04 Re: [CORE] GPL Source and Copyright Questions