Re: Vacuum thoughts

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Vacuum thoughts
Date: 2003-10-27 20:31:39
Message-ID: 3F9D80AB.7070103@Yahoo.com
Lists: pgsql-hackers

Tom Lane wrote:

> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>> What happens instead is that vacuum not only evicts the whole buffer
>> cache by forcing all blocks of said table and its indexes in, it also
>> dirties a substantial amount of that and leaves the dirt to be cleaned
>> up by all the other backends.
>
> [ thinks about that... ] Yeah, I believe you're right, because (plain)
> vacuum just does WriteBuffer() for any page that it modifies, which only
> marks the page dirty in buffer cache. It never does anything to force
> those pages to be written out to the kernel. So, if you have a large
> buffer cache, a lot of write work will be left over to be picked up by
> other backends.
>
> I think that pre-WAL the system used to handle this stuff differently,
> in a way that made it more likely that VACUUM would issue its own
> writes. But optimizations intended to improve the behavior for
> non-VACUUM cases have made this not so good for VACUUM.
>
> I like your idea of penalizing VACUUM-read blocks when they go back into
> the freelist. This seems only a partial solution though, since it
> doesn't directly ensure that VACUUM rather than some other process will
> issue the write kernel call for the dirtied page. Maybe we should
> resurrect a version of WriteBuffer() that forces an immediate kernel
> write, and use that in VACUUM.
>
> Also, we probably need something similar for seqscan-read blocks, but
> with an intermediate priority (can we insert them to the middle of the
> freelist?)

Well, "partial solution" isn't quite what I would call it, and it surely
needs integration with sequential scans. I really do expect the whole
hack to fall apart if some concurrent seqscans are going on, since it does
not really penalize the VACUUM-read blocks so much as the next caller of
GetFreeBuffer(). In my test case that just happens to be VACUUM most of
the time. I described it only to demonstrate that the potential exists.

Since the whole point of the buffer cache is to avoid the really bad
thing, I/O, I don't think that the trivial doubly linked list that
implements it today is adequate.

I can't imagine it completely yet, but what I see vaguely is a cache
policy that puts a block into the freelist depending on where it was
coming from (cache, seqscan, indexscan, vacuum) and what it is (heap,
toast, index). That plus the possibility for vacuum to cause it to be
written to the kernel immediately might do it.
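
To make the placement idea a bit more concrete, here is a rough sketch in C.
It is not PostgreSQL code: every name in it (AccessSource, BlockKind,
FreeList, retention_score, freelist_release, freelist_get, must_write_now)
is hypothetical, the array-backed list stands in for the real freelist, and
the scores are arbitrary assumptions picked only so that vacuum-read blocks
land at the head, seqscan-read blocks near the middle, and cache hits at the
tail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags: where the block was read from and what it holds. */
typedef enum { SRC_CACHE_HIT, SRC_INDEXSCAN, SRC_SEQSCAN, SRC_VACUUM } AccessSource;
typedef enum { KIND_HEAP, KIND_TOAST, KIND_INDEX } BlockKind;

#define NBUFFERS  8
#define MAX_SCORE 4

/* Toy freelist: ids[0] is the head, i.e. the next buffer to be reused. */
typedef struct {
    int ids[NBUFFERS];
    int len;
} FreeList;

/* Higher score = keep longer. The numbers are assumptions, not measurements. */
int retention_score(AccessSource src, BlockKind kind)
{
    int score;

    switch (src) {
        case SRC_VACUUM:    score = 0; break;          /* reuse first */
        case SRC_SEQSCAN:   score = 2; break;          /* middle of the list */
        case SRC_INDEXSCAN: score = 3; break;
        default:            score = MAX_SCORE; break;  /* cache hit: tail */
    }
    /* Assumed bias: index blocks are kept slightly longer than heap/toast. */
    if (kind == KIND_INDEX && score < MAX_SCORE)
        score++;
    return score;
}

/* Put a released buffer at a position proportional to its score.
 * Returns the position used (0 = head). */
int freelist_release(FreeList *fl, int buf_id, AccessSource src, BlockKind kind)
{
    int pos, i;

    assert(fl != NULL && fl->len < NBUFFERS);
    pos = fl->len * retention_score(src, kind) / MAX_SCORE;
    for (i = fl->len; i > pos; i--)    /* shift the tail right by one slot */
        fl->ids[i] = fl->ids[i - 1];
    fl->ids[pos] = buf_id;
    fl->len++;
    return pos;
}

/* Take the buffer at the head of the list, or -1 if the list is empty. */
int freelist_get(FreeList *fl)
{
    int buf_id, i;

    assert(fl != NULL);
    if (fl->len == 0)
        return -1;
    buf_id = fl->ids[0];
    for (i = 1; i < fl->len; i++)      /* shift the remaining entries left */
        fl->ids[i - 1] = fl->ids[i];
    fl->len--;
    return buf_id;
}

/* Only vacuum is asked to push its own dirty pages to the kernel at once. */
int must_write_now(AccessSource src, int dirty)
{
    return dirty && src == SRC_VACUUM;
}
```

With two cache-hit buffers already on the list, releasing a seqscan-read
heap block puts it between them, and a vacuum-read heap block released
afterwards goes to the head and is the first one handed out again. A real
policy would need a proper list structure and locking, which this leaves out.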

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
