Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Mel Gorman <mgorman(at)suse(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, Hannu Krosing <hannu(at)2ndquadrant(dot)com>, Trond Myklebust <trondmy(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Jonathan Corbet <corbet(at)lwn(dot)net>, Dave Chinner <david(at)fromorbit(dot)com>, Joshua Drake <jd(at)commandprompt(dot)com>, James Bottomley <James(dot)Bottomley(at)hansenpartnership(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "lsf-pc(at)lists(dot)linux-foundation(dot)org" <lsf-pc(at)lists(dot)linux-foundation(dot)org>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
Date: 2014-01-18 00:35:21
Message-ID: 20140118003521.GD22416@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2014-01-17 16:18:49 -0800, Greg Stark wrote:
> On Fri, Jan 17, 2014 at 9:14 AM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > The scheme that'd allow us is the following:
> > When postgres reads a data page, it will continue to first look up the
> > page in its shared buffers, if it's not there, it will perform a page
> > cache backed read, but instruct that read to immediately remove from the
> > page cache afterwards (new API or, posix_fadvise() or whatever). As long
> > as it's in shared_buffers, postgres will not need to issue new reads, so
> > there's no no benefit keeping it in the page cache.
> > If the page is dirtied, it will be written out normally telling the
> > kernel to forget about the caching the page (using 3) or possibly direct
> > io).
> > When a page in postgres's buffers (which wouldn't be set to very large
> > values) isn't needed anymore and *not* dirty, it will seed the kernel
> > page cache with the current data.
>
> This seems like mostly an exact reimplementation of DIRECT_IO
> semantics using posix_fadvise.

Not at all. The significant bits are a) the kernel uses the pagecache
for writeback of dirty data and more importantly b) uses the pagecache
to cache data not in postgres's s_b.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-01-18 00:42:35 Re: Add %z support to elog/ereport?
Previous Message Andres Freund 2014-01-18 00:33:52 Re: Add %z support to elog/ereport?