Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)

From: Greg Stark <stark(at)mit(dot)edu>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Mel Gorman <mgorman(at)suse(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Trond Myklebust <trondmy(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Jonathan Corbet <corbet(at)lwn(dot)net>, Dave Chinner <david(at)fromorbit(dot)com>, Joshua Drake <jd(at)commandprompt(dot)com>, James Bottomley <James(dot)Bottomley(at)hansenpartnership(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
Date: 2014-01-18 00:18:49
Message-ID: CAM-w4HPDmGQWgqsxu1Sm3ph_3U3m7wP0Zx9ksuHUqFKa0ePPgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 17, 2014 at 9:14 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> The scheme that'd allow us is the following:
> When postgres reads a data page, it will continue to first look up the
> page in its shared buffers, if it's not there, it will perform a page
> cache backed read, but instruct that read to immediately remove from the
> page cache afterwards (new API or, posix_fadvise() or whatever). As long
> as it's in shared_buffers, postgres will not need to issue new reads, so
> there's no no benefit keeping it in the page cache.
> If the page is dirtied, it will be written out normally telling the
> kernel to forget about the caching the page (using 3) or possibly direct
> io).
> When a page in postgres's buffers (which wouldn't be set to very large
> values) isn't needed anymore and *not* dirty, it will seed the kernel
> page cache with the current data.

This seems like mostly an exact reimplementation of DIRECT_IO
semantics using posix_fadvise.

--
greg

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-01-18 00:28:19 Re: Add %z support to elog/ereport?
Previous Message Tom Lane 2014-01-18 00:01:30 Re: [PATCH] Negative Transition Aggregate Functions (WIP)