Re: WAL prefetch

From: Andres Freund <andres(at)anarazel(dot)de>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <seanc(at)joyent(dot)com>
Subject: Re: WAL prefetch
Date: 2018-06-19 15:50:51
Message-ID: 20180619155051.xo5vxtns44otbrs2@alap3.anarazel.de
Lists: pgsql-hackers

On 2018-06-19 12:08:27 +0300, Konstantin Knizhnik wrote:
> I do not think that prefetching in shared buffers requires much more effort
> and makes the patch more invasive...
> It even somehow simplifies it, because there is no need to maintain our own
> cache of prefetched pages...

> But it will definitely have much more impact on Postgres performance:
> contention for buffer locks, throwing away pages accessed by read-only
> queries,...

These arguments seem bogus to me. Otherwise the startup process is going
to do that work.

> Also there are two points which makes prefetching into shared buffers more
> complex:
> 1. Need to spawn multiple workers to make prefetch in parallel and somehow
> distribute work between them.

I'm not even convinced that's true. It doesn't seem insane to have a
queue of, say, 128 requests that are done with posix_fadvise WILLNEED,
where the oldest request is read into shared buffers by the
prefetcher, and then discarded from the page cache with DONTNEED. I
think we're going to want a queue that's sorted in the prefetch process
anyway, because there's a high likelihood that we'll otherwise issue
prefetch requests for the same pages over and over again.

That gets rid of most of the disadvantages: We have backpressure
(because the read into shared buffers will block if not yet ready),
we'll prevent double buffering, we'll prevent the startup process from
doing the victim buffer search.
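
A minimal sketch of the queue described above, in Python for brevity. It is
not taken from any patch: PrefetchQueue, os_advise and the callback
parameters are hypothetical names, and the copy into shared buffers is
reduced to returning the bytes. It also de-duplicates with an ordered map
keyed by (fd, block) rather than a sorted structure, which is a swapped-in
simplification.

```python
import os
from collections import OrderedDict


def os_advise(fd, offset, length, hint):
    """Forward a hint to posix_fadvise where the platform provides it."""
    if hasattr(os, "posix_fadvise"):
        advice = (os.POSIX_FADV_WILLNEED if hint == "WILLNEED"
                  else os.POSIX_FADV_DONTNEED)
        os.posix_fadvise(fd, offset, length, advice)


class PrefetchQueue:
    """Bounded, de-duplicated queue of (fd, block_no) prefetch requests."""

    def __init__(self, capacity=128, block_size=8192, advise=None, read=None):
        self.capacity = capacity
        self.block_size = block_size
        # Both callbacks are injectable so the sketch can run without real I/O.
        self.advise = advise or os_advise
        self.read = read or os.pread
        self.pending = OrderedDict()  # keys are (fd, block_no), oldest first

    def request(self, fd, block_no):
        """Queue one block. Returns the completed oldest entry if the queue
        was full, otherwise None."""
        key = (fd, block_no)
        if key in self.pending:
            return None  # already queued: do not issue the same hint again
        completed = None
        if len(self.pending) >= self.capacity:
            completed = self.complete_oldest()
        self.advise(fd, block_no * self.block_size, self.block_size, "WILLNEED")
        self.pending[key] = None
        return completed

    def complete_oldest(self):
        """Read the oldest queued block, then drop it from the page cache."""
        if not self.pending:
            return None
        (fd, block_no), _ = self.pending.popitem(last=False)
        offset = block_no * self.block_size
        # The read blocks until the data has arrived: this is the backpressure.
        data = self.read(fd, self.block_size, offset)
        # Real code would place `data` in a shared buffer at this point.
        self.advise(fd, offset, self.block_size, "DONTNEED")
        return fd, block_no, data
```

With the default callbacks, request(fd, n) issues a WILLNEED hint for block n
of an open file descriptor. Once 128 requests are pending, each new request
first reads the oldest block and issues DONTNEED for it.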

> Concerning WAL prefetch I still have a serious doubt if it is needed at all:
> if checkpoint interval is less than size of free memory on the system, then
> redo process should not read much.

I'm confused. Didn't you propose this? FWIW, there's a significant
number of installations where people have observed this problem in
practice.

> And if checkpoint interval is much larger than OS cache (are there cases
> when it is really needed?)

Yes, there are. The percentage of FPWs can cause serious problems, as do
repeated writeouts by the checkpointer.

> then quite small patch (as it seems to me now) forcing full page write
> when distance between page LSN and current WAL insertion point exceeds
> some threshold should eliminate random reads also in this case.

I'm pretty sure that that'll hurt a significant number of installations
that set the timeout high just so they can avoid FPWs.
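
For reference, the rule in the quoted proposal amounts to a single
comparison of WAL positions. The sketch below only illustrates that
comparison; needs_forced_fpw and its parameters are hypothetical names, and
LSNs are treated as plain byte offsets into the WAL stream.

```python
def needs_forced_fpw(page_lsn: int, insert_lsn: int, threshold_bytes: int) -> bool:
    """True when the page's LSN trails the current WAL insert position by
    more than the threshold, i.e. when the proposal would force a full
    page write for the next change to that page."""
    return insert_lsn - page_lsn > threshold_bytes


# Hypothetical numbers: a 1 GiB threshold and a page last touched 3 GiB ago.
GIB = 1024 ** 3
forced = needs_forced_fpw(page_lsn=5 * GIB, insert_lsn=8 * GIB, threshold_bytes=GIB)
```

Any page for which this returns true gets a full page image in addition to
the usual first-modification-after-checkpoint ones, which is where the extra
FPW volume would come from.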

Greetings,

Andres Freund
