Re: really lazy vacuums?

From: Jesper Krogh <jesper(at)krogh(dot)cc>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jim Nasby <jim(at)nasby(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: really lazy vacuums?
Date: 2011-03-17 08:17:39
Message-ID: 4D81C3A3.30801@krogh.cc
Lists: pgsql-hackers

Robert Haas wrote:
> Right. Really-lazy vacuum could freeze tuples. Unlike regular
> vacuum, it can also sensibly be done incrementally. One thing I was
> thinking about is counting the number of times that we fetched a tuple
> that was older than RecentGlobalXmin and had a committed xmin and an
> invalid xmax, but where the page was not PD_ALL_VISIBLE. If that's
> happening a lot, it probably means that some vacuuming would speed
> things up, by getting those PD_ALL_VISIBLE bits set. Perhaps you
> could work out some formula where you do a variable amount of
> super-lazy vacuuming depending on the number of such tuple fetches.
> The trick would be to avoid overdoing it (so that you swamp the I/O
> system) or underdoing it (so that the system never converges). It
> would be really nice (for this and for other things) if we had some
> way of measuring the I/O saturation of the system, so that we could
> automatically adjust the aggressiveness of background processes
> accordingly.
>
> Note also that if and when we get index-only scans, making sure the
> PD_ALL_VISIBLE bits (and thus the visibility map bits) actually get
> set is going to be a lot more important.
>

Is it obvious that the visibility map bits should track complete
pages and not individual tuples? If the visibility map tracks at
page level, the benefit would fall for "slim tables" where you squeeze
200 tuples into each page, and an update rate of 1% would lower the
likelihood of a page being all-visible even more. (It may be that for
slim tables index-only scans are not as beneficial as for wide tables.)
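
As a rough illustration of the slim-table concern, here is a small sketch
that estimates the fraction of pages that stay all-visible. It assumes
(this is not from any measurement) that updates hit tuples independently
and uniformly, so a page stays all-visible only if none of its tuples was
touched. The function name and the 10-tuples-per-page "wide" case are
made up for the comparison.

```python
def all_visible_fraction(tuples_per_page: int, update_rate: float) -> float:
    """Probability that no tuple on a page was updated.

    Assumption: each tuple is updated independently with probability
    update_rate, so the page survives with (1 - update_rate) ** n.
    """
    return (1.0 - update_rate) ** tuples_per_page


if __name__ == "__main__":
    # 1% update rate, slim table (200 tuples/page) vs. a hypothetical
    # wide table (10 tuples/page).
    for label, n in (("slim, 200 tuples/page", 200), ("wide, 10 tuples/page", 10)):
        print(f"{label}: {all_visible_fraction(n, 0.01):.3f}")
    # slim, 200 tuples/page: 0.134
    # wide, 10 tuples/page: 0.904
```

Under that assumption only about 13% of the slim-table pages would still
have their bit set, against about 90% for the wide table. Real update
patterns are usually clustered, so the actual numbers would differ.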

In connection with the vacuuming discussion: I don't know if it
is there already, but how about "opportunistic vacuuming"? Say
you have a page that is being written out because of changes to one
of its tuples. Will the other tuples on the page get vacuumed while
it is being written out anyway?
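
To make the question concrete, here is a toy model of the idea. It is
only a sketch: HeapTuple, Page and flush are hypothetical names, not
PostgreSQL's buffer manager, and the model ignores index entries that
point at the removed tuples, which a real implementation would have to
deal with.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HeapTuple:
    data: str
    dead: bool = False  # True once no transaction can see this version


@dataclass
class Page:
    tuples: List[HeapTuple] = field(default_factory=list)
    dirty: bool = False


def flush(page: Page) -> int:
    """Write out a dirty page, dropping dead tuples first.

    Returns the number of tuples removed. A clean page is left alone,
    so the cleanup never causes a write that was not happening anyway.
    """
    if not page.dirty:
        return 0
    before = len(page.tuples)
    page.tuples = [t for t in page.tuples if not t.dead]
    page.dirty = False  # stands in for the actual write to disk
    return before - len(page.tuples)
```

The point of the sketch is only that the cleanup piggybacks on a write
that is already scheduled, so it costs no additional I/O.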

It actually doesn't have to hook into the write-out process directly to
benefit I/O usage. If it could just get the opportunity to do it before
the page gets evicted from the OS cache, it would save a
second read of that page, but it seems way harder to do something
sane around that assumption.

Really lazy vacuums would "only" benefit "really static tables", where
vacuuming is not that big a problem in the first place?

--
Jesper - Demonstrating a total lack of insight, I would assume.
