Re: Direct I/O

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, Christoph Berg <myon(at)debian(dot)org>, mikael(dot)kjellstrom(at)gmail(dot)com, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Direct I/O
Date: 2023-04-19 14:11:32
Message-ID: CA+TgmoY+xxqF0TUdEhkGgMqA=OKUjZDHmzLtpyJjFea-VBf9Ug@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 18, 2023 at 3:35 PM Greg Stark <stark(at)mit(dot)edu> wrote:
> Well.... I'm more optimistic... That may not always be impossible.
> We've already added the ability to add more shared memory after
> startup. We could implement the ability to add or remove shared buffer
> segments after startup. And it wouldn't be crazy to imagine a kernel
> interface that lets us judge whether the kernel memory pressure makes
> it reasonable for us to take more shared buffers or makes it necessary
> to release shared memory to the kernel.

On this point specifically, one fairly large problem that we have
currently is that our buffer replacement algorithm is terrible. In
workloads I've examined, either almost all buffers end up with a usage
count of 5 or almost all buffers end up with a usage count of 0 or 1.
Either way, we lose all or nearly all information about which buffers
are actually hot, so there is a real chance that we evict some
extremely hot buffer. This is quite bad for performance as it is, and
it would be a lot worse if recovering from a bad eviction decision
routinely required rereading from disk instead of only rereading from
the OS buffer cache.
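
To make the saturation effect concrete, here is a minimal Python sketch
of a clock-sweep cache whose usage count is capped at 5. It is a toy
model under stated assumptions, not PostgreSQL's actual code: the names
ClockSweepCache, access, and run are made up for illustration, a newly
loaded page is assumed to start with a usage count of 1, and pinning,
locking, and dirty pages are ignored.

```python
MAX_USAGE_COUNT = 5  # cap described in the text above


class ClockSweepCache:
    """Toy clock-sweep cache; hypothetical names, single-threaded."""

    def __init__(self, nbuffers):
        self.pages = [None] * nbuffers   # page held by each slot
        self.usage = [0] * nbuffers      # usage count of each slot
        self.slot_of = {}                # page -> slot lookup
        self.hand = 0                    # position of the clock hand

    def access(self, page):
        """Touch a page; return True on a hit, False on a miss."""
        slot = self.slot_of.get(page)
        if slot is not None:
            # Hit: bump the usage count, saturating at the cap.
            self.usage[slot] = min(self.usage[slot] + 1, MAX_USAGE_COUNT)
            return True
        slot = self._find_victim()
        if self.pages[slot] is not None:
            del self.slot_of[self.pages[slot]]
        self.pages[slot] = page
        self.slot_of[page] = slot
        self.usage[slot] = 1             # assumption: new pages start at 1
        return False

    def _find_victim(self):
        """Advance the hand, decrementing counts, until one is free or 0."""
        while True:
            slot = self.hand
            self.hand = (self.hand + 1) % len(self.pages)
            if self.pages[slot] is None or self.usage[slot] == 0:
                return slot
            self.usage[slot] -= 1


def run(nbuffers, trace):
    """Replay a sequence of page numbers and return the resulting cache."""
    cache = ClockSweepCache(nbuffers)
    for page in trace:
        cache.access(page)
    return cache


# Working set fits in the cache: every count saturates at the cap.
print(run(8, list(range(8)) * 10).usage)  # [5, 5, 5, 5, 5, 5, 5, 5]
# One long scan over 100 distinct pages: every count is 0 or 1.
print(run(8, range(100)).usage)           # [1, 1, 1, 1, 0, 0, 0, 0]
```

In both runs the counts no longer distinguish one buffer from another,
which is the loss of information described above. Real workloads are
messier than these two extremes, so treat this only as an illustration
of the mechanism.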

I've sometimes wondered whether our current algorithm is just a more
expensive version of random eviction. I suspect that's a bit too
pessimistic, but I don't really know.
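
One rough way to probe that question is to replay the same access trace
against both policies and compare hit ratios. The sketch below does
that for a toy clock sweep and for uniformly random eviction. All of it
is hypothetical: the function names, the 80/20 synthetic trace, and the
buffer counts are assumptions picked for illustration, and the result
on a made-up trace says nothing definite about real workloads.

```python
import random

MAX_USAGE_COUNT = 5  # usage-count cap of the toy clock sweep


def hit_ratio(trace, nbuffers, policy, seed=0):
    """Replay trace with policy 'clock' or 'random'; return hits/accesses."""
    rng = random.Random(seed)
    pages = [None] * nbuffers
    usage = [0] * nbuffers
    slot_of = {}
    hand = 0
    hits = 0
    for page in trace:
        slot = slot_of.get(page)
        if slot is not None:
            hits += 1
            usage[slot] = min(usage[slot] + 1, MAX_USAGE_COUNT)
            continue
        if policy == "random":
            if len(slot_of) < nbuffers:
                slot = len(slot_of)      # slots fill in order until full
            else:
                slot = rng.randrange(nbuffers)
        else:
            # Clock sweep: decrement counts until an empty or 0 slot appears.
            while True:
                slot = hand
                hand = (hand + 1) % nbuffers
                if pages[slot] is None or usage[slot] == 0:
                    break
                usage[slot] -= 1
        if pages[slot] is not None:
            del slot_of[pages[slot]]
        pages[slot] = page
        slot_of[page] = slot
        usage[slot] = 1
    return hits / len(trace)


def skewed_trace(length, hot_pages=20, cold_pages=1000, seed=1):
    """Synthetic trace: about 80% of accesses go to a small hot set."""
    rng = random.Random(seed)
    return [
        rng.randrange(hot_pages) if rng.random() < 0.8
        else hot_pages + rng.randrange(cold_pages)
        for _ in range(length)
    ]


trace = skewed_trace(50_000)
for policy in ("clock", "random"):
    print(policy, round(hit_ratio(trace, 64, policy), 3))
```

Varying the skew, the buffer count, and the trace length shows how far
apart the two policies land in this model. A trace captured from a real
system would be a much better input than the synthetic one used here.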

I'm not saying that it isn't possible to fix this. I bet it is, and I
hope someone does. I'm just making the point that even if we knew the
amount of kernel memory pressure and even if we also had the ability
to add and remove shared_buffers at will, it probably wouldn't help
much as things stand today, because we're not in a good position to
judge how large the cache would need to be in order to be useful, or
what we ought to be storing in it.

--
Robert Haas
EDB: http://www.enterprisedb.com
