Re: posix_fadvise() and pg_receivexlog

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Mitsumasa KONDO <kondo(dot)mitsumasa(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: posix_fadvise() and pg_receivexlog
Date: 2014-08-07 08:02:31
Message-ID: 53E33297.2020706@vmware.com
Lists: pgsql-hackers

On 08/07/2014 10:10 AM, Mitsumasa KONDO wrote:
> 2014-08-07 13:47 GMT+09:00 Fujii Masao <masao(dot)fujii(at)gmail(dot)com>:
>
>> On Thu, Aug 7, 2014 at 3:59 AM, Heikki Linnakangas
>> <hlinnakangas(at)vmware(dot)com> wrote:
>>> On 08/06/2014 08:39 PM, Fujii Masao wrote:
>>>> The WAL files that pg_receivexlog writes will not be re-read soon
>>>> basically, so we can advise the OS to release any cached pages when WAL
>>>> file is closed. I feel inclined to change pg_receivexlog that way. Thought?
>>>
>>>
>>> -1. The OS should be smart enough to not thrash the cache by files that
>>> are written sequentially and never read.
>>
> The OS's buffer strategy is optimized for the general case. Have you
> forgotten the OS hackers' discussion over the last half year?
>
>> Yep, the OS should be so smart, but I'm not sure if it actually is. Maybe
>> not, so I was thinking that posix_fadvise is called when the server closes
>> WAL file.
>
> That's right.

Well, I'd like to hear someone from the field complaining that
pg_receivexlog is thrashing the cache and thus reducing the performance
of some other process. Or at least a synthetic test case that
demonstrates that happening.

> By the way, does pg_receivexlog process have fsync() in every WAL commit?

It fsyncs each file once it has finished writing it, i.e. each WAL file is
fsync'd exactly once.
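
For illustration only, the close path being discussed could look roughly
like the sketch below. This is not pg_receivexlog's actual code:
close_finished_segment() and its drop_cache flag are made-up names, and the
posix_fadvise() call is the hint proposed upthread, not existing behaviour.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and drop_cache flag are assumptions for this
 * sketch): flush a completed WAL segment once, optionally hint to the OS
 * that its cached pages will not be needed again, then close it.
 * Returns 0 on success, -1 on failure.
 */
int
close_finished_segment(int fd, int drop_cache)
{
    /* One fsync per finished file. */
    if (fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }

#ifdef POSIX_FADV_DONTNEED
    if (drop_cache)
    {
        /*
         * Offset 0 with length 0 covers the whole file. The call is purely
         * advisory, so a failure here is ignored.
         */
        (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    }
#else
    (void) drop_cache;          /* no such hint on this platform */
#endif

    return close(fd);
}
```

The hint is issued after the fsync because dirty pages generally cannot be
dropped from the cache until they have been written out.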

> If yes, I think we need an option for no or fewer fsync() calls, for better
> performance. That is common in NoSQL stores.
> If no, we need an fsync() option for more reliability and data
> integrity.

Hmm. An fsync=off style option might make sense, although I doubt the
one fsync at end of file is causing a performance problem for anyone in
practice. Haven't heard any complaints, anyway.

An option to fsync after every commit record might make sense if you use
pg_receivexlog with synchronous replication. Doing that would require
parsing the WAL, though, to see where the commit records are. But then
again, the fsync's wouldn't need to correspond to commit records. We
could fsync just before we go to sleep to wait for more WAL to be received.
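
A minimal sketch of that last idea, assuming a plain file descriptor as the
data source instead of a replication connection. copy_with_lazy_fsync() and
the nsyncs counter are hypothetical names used only for this example, and
EINTR handling is left out for brevity.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical sketch: copy everything from src to dst, calling fsync() on
 * dst only when we are about to block waiting for more input (or at EOF).
 * *nsyncs counts the fsync calls made. Returns 0 on success, -1 on error.
 */
int
copy_with_lazy_fsync(int src, int dst, int *nsyncs)
{
    char buf[8192];
    int  dirty = 0;             /* unflushed data written to dst? */

    for (;;)
    {
        struct pollfd pfd = { .fd = src, .events = POLLIN };
        int ready = poll(&pfd, 1, 0);   /* zero timeout: just check */

        if (ready < 0)
            return -1;
        if (ready == 0)
        {
            /* Nothing to read right now: flush before going to sleep. */
            if (dirty)
            {
                if (fsync(dst) != 0)
                    return -1;
                dirty = 0;
                (*nsyncs)++;
            }
            if (poll(&pfd, 1, -1) < 0)  /* block until input or hangup */
                return -1;
        }

        ssize_t n = read(src, buf, sizeof(buf));
        if (n < 0)
            return -1;
        if (n == 0)
            break;              /* end of stream */

        /* Handle short writes. */
        ssize_t off = 0;
        while (off < n)
        {
            ssize_t w = write(dst, buf + off, (size_t) (n - off));
            if (w < 0)
                return -1;
            off += w;
        }
        dirty = 1;
    }

    /* Flush whatever is still pending at end of stream. */
    if (dirty)
    {
        if (fsync(dst) != 0)
            return -1;
        (*nsyncs)++;
    }
    return 0;
}
```

With this shape the number of fsyncs follows the arrival pattern of the
data rather than the WAL contents, so no WAL parsing is needed.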

- Heikki
