Re: O_DIRECT for relations and SLRUs (Prototype)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: O_DIRECT for relations and SLRUs (Prototype)
Date: 2019-01-16 16:16:51
Message-ID: CA+TgmoaF4gN3LmOSK+sa31G+7psS4-2HTbwN6eEbMXM5QkjPSg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 12, 2019 at 4:36 PM Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> 1. We need a new "bgreader" process to do read-ahead. I think you'd
> want a way to tell it with explicit hints (for example, perhaps
> sequential scans would advertise that they're reading sequentially so
> that it starts to slurp future blocks into the buffer pool, and
> streaming replicas might look ahead in the WAL and tell it what's
> coming). In theory this might be better than the heuristics OSes use
> to guess our access pattern and pre-fetch into the page cache, since
> we have better information (and of course we're skipping a buffer
> layer).

Right, like if we're reading the end of relation file 16384, we can
prefetch the beginning of 16384.1, but the OS won't know to do that.
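
A minimal sketch of that segment-boundary case, assuming the default build settings (8 kB blocks and 1 GB segment files, so 131072 blocks per segment). The helper names segment_file_name and next_block_crosses_segment are hypothetical and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed defaults: 8 kB blocks and 1 GB segments => 131072 blocks each. */
#define BLOCKS_PER_SEGMENT 131072u

/*
 * Hypothetical helper: write the name of the segment file that holds
 * block 'blkno' of relation file 'relfilenode' into 'buf'.
 * The first segment has no suffix; later ones are "<relfilenode>.<segno>".
 */
static void
segment_file_name(char *buf, size_t len, unsigned relfilenode, uint32_t blkno)
{
    unsigned segno = blkno / BLOCKS_PER_SEGMENT;

    assert(buf != NULL && len > 0);
    if (segno == 0)
        snprintf(buf, len, "%u", relfilenode);
    else
        snprintf(buf, len, "%u.%u", relfilenode, segno);
}

/*
 * Hypothetical helper: returns 1 if the block following 'blkno' lives in
 * the next segment file, which is the case where only the database knows
 * that a sequential read continues in a different file.
 */
static int
next_block_crosses_segment(uint32_t blkno)
{
    return (blkno + 1) % BLOCKS_PER_SEGMENT == 0;
}
```

With these assumptions, block 131071 is the last block of 16384, and a reader that reaches it could ask for block 131072, which maps to 16384.1.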

> 2. We need a new kind of bgwriter/syncer that aggressively creates
> clean pages so that foreground processes rarely have to evict (since
> that is now super slow), but also efficiently finds ranges of dirty
> blocks that it can write in big sequential chunks.

Yeah.
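
As a rough illustration of the "big sequential chunks" part, here is a sketch that coalesces a sorted list of dirty block numbers into contiguous runs, each of which could be issued as one large write. The WriteRun type and coalesce_dirty_blocks function are hypothetical, and the sketch assumes the input is already sorted and free of duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: a run of 'nblocks' consecutive blocks from 'start'. */
typedef struct
{
    uint32_t start;
    uint32_t nblocks;
} WriteRun;

/*
 * Coalesce sorted, duplicate-free block numbers into contiguous runs.
 * Returns the number of runs stored in 'out' (never more than max_runs).
 */
static size_t
coalesce_dirty_blocks(const uint32_t *blocks, size_t n,
                      WriteRun *out, size_t max_runs)
{
    size_t nruns = 0;

    for (size_t i = 0; i < n; i++)
    {
        /* Extend the current run if this block directly follows it. */
        if (nruns > 0 &&
            out[nruns - 1].start + out[nruns - 1].nblocks == blocks[i])
        {
            out[nruns - 1].nblocks++;
            continue;
        }

        /* Otherwise start a new run, unless the output array is full. */
        if (nruns == max_runs)
            break;
        out[nruns].start = blocks[i];
        out[nruns].nblocks = 1;
        nruns++;
    }
    return nruns;
}
```

For example, dirty blocks 3, 4, 5, 9, 10 and 20 collapse into three writes of 3, 2 and 1 blocks instead of six single-block writes.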

> 3. We probably want SLRUs to use the main buffer pool, instead of
> their own mini-pools, so they can benefit from the above.

Right. I think this is important, and it makes me think that maybe
Michael's patch won't help us much in the end. I believe that the
number of pages that are needed for clog data, at least, can vary
significantly depending on workload and machine size, so there's not
one number there that is going to work for everybody, and the
algorithms the SLRU code uses for page management have O(n) stuff in
them, so they don't scale well to large numbers of SLRU buffers
anyway. I think we should try to unify the SLRU stuff with
shared_buffers, and then have a test patch like Michael's (not for
commit) which we can use to see the impact of that, and then try to
reduce that impact with the stuff you mention under #1 and #2.
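
To show the kind of O(n) page management in question, here is a toy model in which both page lookup and victim selection scan every slot. It is only a sketch and not the actual SLRU code; ToyPool, toy_lookup, toy_select_victim and TOY_NSLOTS are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NSLOTS 8                    /* hypothetical pool size */

/* Hypothetical mini-pool: parallel arrays indexed by slot number. */
typedef struct
{
    int64_t  page_number[TOY_NSLOTS];   /* -1 marks an empty slot */
    uint64_t last_used[TOY_NSLOTS];     /* larger means more recent */
} ToyPool;

/* Find the slot holding 'pageno' with a linear scan: O(n) in slots. */
static int
toy_lookup(const ToyPool *pool, int64_t pageno)
{
    for (int i = 0; i < TOY_NSLOTS; i++)
        if (pool->page_number[i] == pageno)
            return i;
    return -1;
}

/*
 * Pick a slot to reuse: the first empty slot if the scan reaches one,
 * otherwise the least recently used slot. Also a linear scan.
 */
static int
toy_select_victim(const ToyPool *pool)
{
    int victim = 0;

    for (int i = 0; i < TOY_NSLOTS; i++)
    {
        if (pool->page_number[i] == -1)
            return i;
        if (pool->last_used[i] < pool->last_used[victim])
            victim = i;
    }
    return victim;
}
```

Both functions touch every slot in the worst case, so the cost of each page access grows with the pool size. A hash-table lookup of the sort used for the main buffer pool does not have that property.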

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
