Re: swapcache-style cache?

From: Jan Lentfer <Jan(dot)Lentfer(at)web(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: swapcache-style cache?
Date: 2012-02-27 20:24:11
Message-ID: 4F4BE66B.3080001@web.de
Lists: pgsql-hackers

On 23.02.2012 21:57, Greg Smith wrote:
> On 02/22/2012 05:31 PM, james wrote:
>> Has anyone considered managing a system like the DragonFLY swapcache for
>> a DBMS like PostgreSQL?
>>
>> ie where the admin can assign drives with good random read behaviour
>> (but perhaps also-ran random write) such as SSDs to provide a cache for
>> blocks that were dirtied, with async write that hopefully writes them
>> out before they are forcibly discarded.
>
> We know that battery-backed write caches are extremely effective for
> PostgreSQL writes. I see most of these tiered storage ideas as acting
> like a big one of those, which seems to hold in things like SAN storage
> that have adopted this sort of technique already. A SSD is quite large
> relative to a typical BBWC.
[...]

> -Ultimately all this data needs to make it out to real disk. The funny
> thing about caches is that no matter how big they are, you can easily
> fill them up if doing something faster than the underlying storage can
> handle.

[...]

> I don't think the idea of a swapcache is without merit; there's surely
> some applications that will benefit from it. It's got a lot of potential
> as a way to absorb short-term bursts of write activity. And there are
> some applications that could benefit from having a second tier of read
> cache, not as fast as RAM but larger and faster than real disk seeks. In
> all of those potential win cases, though, I don't see why the OS
> couldn't just manage the whole thing for us.

First off, thanks very much for mentioning DragonFly's swapcache on
this mailing list, which takes the burden of self-advertising this
feature off me/us :)

But swapcache is clearly not meant or designed to speed up any write
activity by caching writes and delaying the write to the "target
storage" to a later point in time. Swapcache does not affect writes in
any way, actually.
Swapcache does its writing when a clean VM page hits the inactive VM
page queue. VM pages related to filesystem writes are dirty, the write
occurs normally, then they become clean. But they still have to cycle
into the VM page inactive queue before swapcache will touch them (write
them out to swap).
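
To make that ordering concrete, here is a toy Python model of the page
lifecycle described above. It is only a sketch, not DragonFly code: the
ToyVM class, its method names, and the cached_on_ssd flag are all made up
for illustration, and the real kernel logic is far more involved.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Page:
    block: int
    dirty: bool = False
    cached_on_ssd: bool = False  # hypothetical flag: a copy exists in swap


class ToyVM:
    """Toy model of the ordering only; all names here are illustrative."""

    def __init__(self):
        self.active = {}         # block number -> Page
        self.inactive = deque()  # pages that cycled out of active use
        self.disk_writes = []    # blocks written to the normal target storage
        self.swap_writes = []    # blocks swapcache copied to the SSD swap

    def fs_write(self, block):
        # A filesystem write dirties the page; swapcache is not involved.
        page = self.active.setdefault(block, Page(block))
        page.dirty = True

    def flush(self):
        # The write goes to the target storage as usual; the page is clean after.
        for page in self.active.values():
            if page.dirty:
                self.disk_writes.append(page.block)
                page.dirty = False

    def deactivate(self, block):
        # The page cycles from the active set into the inactive queue.
        self.inactive.append(self.active.pop(block))

    def swapcache_scan(self):
        # Swapcache only considers clean pages sitting on the inactive queue.
        for page in self.inactive:
            if not page.dirty and not page.cached_on_ssd:
                self.swap_writes.append(page.block)
                page.cached_on_ssd = True


vm = ToyVM()
vm.fs_write(7)
vm.swapcache_scan()   # nothing happens: the page is dirty and still active
vm.flush()            # normal write to disk, page becomes clean
vm.swapcache_scan()   # still nothing: the clean page is not yet inactive
vm.deactivate(7)
vm.swapcache_scan()   # only now is the page copied to swap
```

In this model the disk write always comes first and is never delayed; the
copy to swap happens later and only serves subsequent reads.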

So, basically, it is designed to speed up metadata reads and, if
configured to do so, data reads.

So it can take some read load off the disk subsystem and free the
disk subsystem for more write activity, but that would be just a side
effect, not a design goal.

And yes, it does seriously affect pgsql performance on read loads.

See BSD Mag 5/2011
http://bsdmag.org/magazine/1691-embedded-bsd-freebsd-alix

and
http://www.shiningsilence.com/dbsdlog/2011/04/12/7586.html

Jan
