Re: Vacuum thoughts

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum thoughts
Date: 2003-10-27 19:20:35
Message-ID: 3F9D7003.1030400@Yahoo.com
Lists: pgsql-hackers

To add some medium-hard data to the discussion, I hacked PG 7.3.4 a
little. The system I am talking about below runs an artificial
application that closely resembles the behaviour of a TPC-C benchmark
implementation. Without vacuuming the database, it can only just
sustain a factor 5 scaled database running with 50 simulated terminals.
To free some bandwidth, the system is configured with scaling 4 and
runs with 40 simulated terminals. In this configuration it can satisfy
the response-time requirements for 100% of all transactions when not
vacuuming ... no surprise.

The test driver measures in 10-minute intervals and reports the
percentage of transactions in each interval that meet the response-time
requirements.

If the database is now vacuumed concurrently, the response time for
transactions changes dramatically. A 10-minute interval hit by vacuum
drops from 100% to below 90%; I've seen it as low as 75%. The system
load reported by the Linux 2.4 kernel jumps from under 1.0 to anywhere
between 5 and 8.

So far, that is exactly what most DBAs are complaining about: a system
that otherwise runs smoothly gets bogged down by any vacuum.

Now I changed the cache policy. While a backend is running vacuum, a
global flag is set. If this flag is set and a block is not found in the
cache but must be read, its buffer is marked BM_READ_BY_VACUUM. When
the global flag is set, AddBufferToFreelist() inserts buffers so marked
at the head of the freelist instead of adding them to the tail. In
either case, the buffer's BM_READ_BY_VACUUM flag is cleared.
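
To illustrate that policy, here is a minimal, self-contained C sketch.
This is not the actual patch against 7.3.4; the names vacuum_running,
read_by_vacuum and the toy list layout are simplified stand-ins for the
global flag, BM_READ_BY_VACUUM and the shared freelist:

#include <stdbool.h>
#include <stdio.h>

typedef struct BufDesc
{
    int  buf_id;
    bool read_by_vacuum;        /* stand-in for BM_READ_BY_VACUUM */
    struct BufDesc *next, *prev;
} BufDesc;

static BufDesc freelist;        /* dummy head of a circular freelist */
static bool vacuum_running;     /* stand-in for the global vacuum flag */

static void
freelist_init(void)
{
    freelist.next = freelist.prev = &freelist;
}

static void
add_buffer_to_freelist(BufDesc *bf)
{
    if (vacuum_running && bf->read_by_vacuum)
    {
        /* head insertion: this buffer is the first to be reused */
        bf->next = freelist.next;
        bf->prev = &freelist;
        freelist.next->prev = bf;
        freelist.next = bf;
    }
    else
    {
        /* tail insertion: the normal LRU behaviour */
        bf->prev = freelist.prev;
        bf->next = &freelist;
        freelist.prev->next = bf;
        freelist.prev = bf;
    }
    bf->read_by_vacuum = false; /* cleared in either case */
}

int
main(void)
{
    BufDesc a = {1}, b = {2}, c = {3};

    freelist_init();
    vacuum_running = true;
    b.read_by_vacuum = true;    /* b was read in on behalf of vacuum */

    add_buffer_to_freelist(&a);
    add_buffer_to_freelist(&b);
    add_buffer_to_freelist(&c);

    /* prints 2 1 3: the vacuum-read buffer gets recycled first */
    for (BufDesc *p = freelist.next; p != &freelist; p = p->next)
        printf("buf %d\n", p->buf_id);
    return 0;
}

The real change of course has to manipulate the freelist under the
buffer manager's lock; the sketch only shows the list policy itself.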

The effect of this simple hack is somewhat surprising. Not only does
the system keep satisfying the time limits for 97% or more of all
transactions while the system load stays well below 2.0 (I've seen 1.6
only once), but, very surprisingly, VACUUM also finishes about 20%
faster.

I'm not a friend of jumping to conclusions; OTOH, I have to try to make
some sense of it. So I would like the following to be taken with a
reasonable grain of salt.

I think that the common theory, that vacuum is similar to a sequential
scan, simply does not hold true for any table that is actually updated
randomly. What happens instead is that vacuum not only evicts the whole
buffer cache by forcing in all blocks of said table and its indexes, it
also dirties a substantial fraction of those buffers and leaves the
dirty pages to be cleaned up by all the other backends.

The changes described above cause vacuum to work with as few shared
buffers as possible for the data not already found in the cache. This
avoids imposing unnecessary additional write overhead on regular
backends, and causes the vacuum process to stay within a few virtual
memory pages instead of running all over the place. I don't know how
much the latter affects the efficiency of the MMU; it might not be
significant here.

It is quite possible that there is some other side effect in the buffer
cache that affects the behaviour of many backends each doing a few
writes, compared to one backend doing them in bulk.

However, the test indicates that there is some low-hanging fruit in the
cache algorithm, and that it's not just a few little raspberries.

Jan

Tom Lane wrote:

> Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in> writes:
>> I was thinking about it. How about vacuuming a page when it has been
>> pushed out of the postgresql buffer cache? It is in memory, so not
>> much I/O is involved.
>
> You keep ignoring the problem of removing index entries. To vacuum an
> individual page, you need to be willing to read in (and update) all
> index pages that reference the tuples-to-be-deleted. This is hardly
> tenable when the reason for pushing the page out of buffer cache was so
> that you could read in something else instead --- you don't have spare
> buffer slots, and you don't want to do all that I/O (and the associated
> WAL log entries) before you can read in the page you originally wanted.
>
> The latter point is really the crux of the problem. The point of having
> the VACUUM process is to keep maintenance work out of the critical path
> of foreground queries. Anything that moves even part of that
> maintenance work into the critical path is going to be a net loss.
>
> regards, tom lane

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck(at)Yahoo(dot)com #
