Re: Experimental ARC implementation

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Zeugswetter Andreas SB SD <ZeugswetterA(at)spardat(dot)at>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Experimental ARC implementation
Date: 2003-11-07 15:45:50
Message-ID: 3FABBE2E.5060902@Yahoo.com
Lists: pgsql-hackers

Bruce Momjian wrote:

> Jan Wieck wrote:
>> If the system is write-bound, the checkpointer will find that many dirty
>> blocks that he has no time to nap and will burst them out as fast as
>> possible anyway. Well, at least that's the theory.
>>
>> PostgreSQL with the non-overwriting storage concept can never have
>> hot-written pages for a long time anyway, can it? They fill up and cool
>> down until vacuum.
>
> Another idea on removing sync() --- if we are going to use fsync() on
> each file during checkpoint (open, fsync, close), seems we could keep a
> hash of written block dbid/relfilenode pairs and cycle through that on
> checkpoint. We could keep the hash in shared memory, and dump it to a
> backing store when it gets full, or just have it exist in buffer writer
> process memory (so it can grow) and have backends that do their own
> buffer writes all open a single file in append mode and write the pairs
> to the file, or something like that, and the checkpoint process can read
> from there.
>
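
For readers following along, the bookkeeping Bruce describes can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the PostgreSQL tree: DirtyFileSet, dirty_set_add, dirty_set_checkpoint, fake_sync and the fixed SET_SLOTS capacity are all hypothetical names. The set lives in plain process memory, and a real callback would do the open, fsync, close on the file behind each pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SET_SLOTS 1024          /* hypothetical capacity, must be a power of two */

/* One written file, identified by the dbid/relfilenode pair. */
typedef struct { uint32_t dbid; uint32_t relfilenode; bool used; } FileTag;

/* Open-addressing hash set of files written since the last checkpoint. */
typedef struct { FileTag slots[SET_SLOTS]; int count; } DirtyFileSet;

static void dirty_set_init(DirtyFileSet *s) { memset(s, 0, sizeof *s); }

/*
 * Remember that a block of (dbid, relfilenode) was written.
 * Returns 1 if the pair was added, 0 if it was already present,
 * -1 if the set is full (the point where one would spill to a backing store).
 */
static int dirty_set_add(DirtyFileSet *s, uint32_t dbid, uint32_t relfilenode)
{
    uint32_t h = (dbid * 2654435761u) ^ (relfilenode * 40503u);

    /* One slot always stays empty, so this linear probe terminates. */
    for (uint32_t i = h & (SET_SLOTS - 1); ; i = (i + 1) & (SET_SLOTS - 1)) {
        FileTag *t = &s->slots[i];

        if (!t->used) {
            if (s->count >= SET_SLOTS - 1)
                return -1;
            t->dbid = dbid;
            t->relfilenode = relfilenode;
            t->used = true;
            s->count++;
            return 1;
        }
        if (t->dbid == dbid && t->relfilenode == relfilenode)
            return 0;
    }
}

typedef int (*sync_fn)(uint32_t dbid, uint32_t relfilenode);

/* Stand-in callback; a real one would open, fsync and close the file. */
static int fake_sync(uint32_t dbid, uint32_t relfilenode)
{
    (void) dbid;
    (void) relfilenode;
    return 0;
}

/*
 * At checkpoint time: call do_sync once per remembered file, then empty
 * the set. Returns the number of files for which do_sync reported success.
 */
static int dirty_set_checkpoint(DirtyFileSet *s, sync_fn do_sync)
{
    int synced = 0;

    for (int i = 0; i < SET_SLOTS; i++)
        if (s->slots[i].used &&
            do_sync(s->slots[i].dbid, s->slots[i].relfilenode) == 0)
            synced++;
    dirty_set_init(s);
    return synced;
}
```

A backend would call dirty_set_add() after each buffer write, and the checkpoint process would call dirty_set_checkpoint() with its sync callback. The shared-memory and append-file variants from the quoted text are not shown.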

I am not really aiming at removing sync() altogether. We already know
that open, fsync, close does not guarantee that you flush dirty OS buffers
for which another process has so far only done open, write. And you
really don't want to remove all the vfd logic, or to fsync on every write
done by a backend.

What frequent fdatasync/fsync calls during a constantly ongoing checkpoint
will do is significantly shrink the physical write storm that happens
at the sync(), which is causing huge problems right now.
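
A minimal sketch of that idea, assuming a single file and a caller-chosen batch size: write the dirty blocks and force them out every few writes, so that little is left for the final sync(). The function name write_blocks_throttled and the sync_every parameter are made up for this illustration. fsync() is used to keep the sketch portable, and fdatasync() could be substituted where it is available.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* PostgreSQL's default page size */

/*
 * Append nblocks copies of block to fd, forcing the data to disk after
 * every sync_every writes instead of leaving it all to one big sync().
 * Returns the number of fsync() calls issued, or -1 on any I/O error.
 */
static int
write_blocks_throttled(int fd, const char *block, int nblocks, int sync_every)
{
    int syncs = 0;

    assert(sync_every > 0);

    for (int i = 0; i < nblocks; i++) {
        if (write(fd, block, BLOCK_SIZE) != BLOCK_SIZE)
            return -1;

        /* Flush a small batch now; a checkpointer could nap after this. */
        if ((i + 1) % sync_every == 0) {
            if (fsync(fd) != 0)
                return -1;
            syncs++;
        }
    }

    /* Flush whatever is left over from the last partial batch. */
    if (nblocks % sync_every != 0) {
        if (fsync(fd) != 0)
            return -1;
        syncs++;
    }
    return syncs;
}
```

With 10 blocks and sync_every set to 4, this issues three small flushes (after blocks 4 and 8, plus one for the remainder) rather than one large one at the end.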

The reason people blame vacuum so much is that not only does it
replace the buffer cache with useless garbage, it also leaves that
garbage to be flushed by other backends or the checkpointer, and it
rapidly fills WAL, causing exactly the checkpoint we don't have the IO
bandwidth for right now! They only see that vacuum is running, and that if
they kill it the system returns to a healthy state after a while ... easy
enough, but only half the story.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck(at)Yahoo(dot)com #
