Re: posix_fadvsise in base backups

From: Andres Freund <andres(at)anarazel(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: posix_fadvsise in base backups
Date: 2011-09-24 16:26:10
Message-ID: 201109241826.11155.andres@anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Saturday, September 24, 2011 05:16:48 PM Magnus Hagander wrote:
> On Sat, Sep 24, 2011 at 17:14, Andres Freund <andres(at)anarazel(dot)de> wrote:
> > On Saturday, September 24, 2011 05:08:17 PM Magnus Hagander wrote:
> >> Attached patch adds a simple call to posix_fadvise with
> >> POSIX_FADV_DONTNEED on all the files being read when doing a base
> >> backup, to help the kernel not to trash the filesystem cache.
> >> Seems like a simple enough fix - in fact, I don't remember why I took
> >> it out of the original patch :O
> >> Any reason not to put this in? Is it even safe enough to put into 9.1
> >> (probably not, but maybe?)
> > Won't that possibly throw a formerly fully cached database out of the
> > cache?
> I was assuming the kernel was smart enough to read this as "*this*
> process is not going to be using this file anymore", not "nobody in
> the whole machine is going to use this file anymore". And the process
> running the base backup is certainly not going to read it again.
> But that's a good point - do you know if that is the case, or does it
> mandate more testing?
I am pretty but not totally sure that the kernel does not track each process
that uses a page. For one doing so would probably prohibitively expensive. For
another I am pretty (but not ...) sure that I restructured an application not
to fadvise(DONTNEED) memory that is also used in other processes.

Currently I can only think of to workarounds, both os specific:
- Use O_DIRECT for reading the base backup. Will be slow in fully cached
situations, but should work ok enough in all others. Need to be carefull about
the usual O_DIRECT pitfalls (pagesize, alignment etcetera).
- use mmap/mincore() to gather whether data is in cache and restore that state
afterwards.

Too bad that POSIX_FADV_NOREUSE is not really implemented.

Andres

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-09-24 16:42:54 Re: unite recovery.conf and postgresql.conf
Previous Message Kerem Kat 2011-09-24 16:24:16 Re: Adding CORRESPONDING to Set Operations