Re: O_DIRECT in freebsd

From: Manfred Spraul <manfred(at)colorfullife(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: O_DIRECT in freebsd
Date: 2003-10-30 17:20:21
Message-ID: 3FA14855.5020103@colorfullife.com
Lists: pgsql-hackers

Greg Stark wrote:

>Manfred Spraul <manfred(at)colorfullife(dot)com> writes:
>
>
>
>>One problem for WAL is that O_DIRECT would disable the write cache -
>>each operation would block until the data arrived on disk, and that might block
>>other backends that try to access WALWriteLock.
>>Perhaps a dedicated backend that does the writeback could fix that.
>>
>>
>
>aio seems a better fit.
>
>
>
>>Has anyone tried to use posix_fadvise for the wal logs?
>>http://www.opengroup.org/onlinepubs/007904975/functions/posix_fadvise.html
>>
>>Linux supports posix_fadvise, it seems to be part of xopen2k.
>>
>>
>
>Odd, I don't see it anywhere in the kernel. I don't know what syscall it's
>using to do this tweaking.
>
>
At least in 2.6 it lives in linux/mm/fadvise.c; the syscall is fadvise64 or fadvise64_64.

>This is the only option that seems useful for postgres for both the WAL and
>vacuum (though in other threads it seems the problems with vacuum lie
>elsewhere):
>
> POSIX_FADV_DONTNEED attempts to free cached pages associated with the
> specified region. This is useful, for example, while streaming large
> files. A program may periodically request the kernel to free cached
> data that has already been used, so that more useful cached pages are
> not discarded instead.
>
> Pages that have not yet been written out will be unaffected, so if the
> application wishes to guarantee that pages will be released, it should
> call fsync or fdatasync first.
>
>
I agree. It could be called either immediately after each flush syscall, or
just before closing a log file and switching to the next.
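
A rough sketch of the flush-then-drop sequence described above, assuming a
plain file descriptor for the log segment. The helper name
flush_and_drop_cache is hypothetical and is not an existing PostgreSQL
function; only fdatasync and posix_fadvise are real calls. The flush comes
first because, as the quoted text notes, pages that have not been written out
are unaffected by POSIX_FADV_DONTNEED.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): flush a log file to disk,
 * then hint to the kernel that its cached pages are no longer needed.
 * Returns 0 on success or an errno-style error number on failure.
 */
int flush_and_drop_cache(int fd)
{
    assert(fd >= 0);

    /* Dirty pages are not dropped by DONTNEED, so write them out first. */
    if (fdatasync(fd) != 0)
        return errno;

    /*
     * offset 0 with len 0 covers the whole file. posix_fadvise returns
     * the error number directly instead of setting errno.
     */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

Whether to call something like this after every flush or only once at segment
switch is exactly the open question above; the call is only advisory, so the
kernel is free to ignore it.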

>Perhaps POSIX_FADV_RANDOM and POSIX_FADV_SEQUENTIAL could be useful in a
>backend before starting a sequential scan or index scan, but I kind of doubt
>it.
>
>
IIRC the recommendation is ~20% of total memory for the postgres user space
buffers. That's quite a lot - it might be sufficient to protect that
cache from vacuum or sequential scans. AddBufferToFreeList already
contains a comment that this is the right place to try buffer
replacement strategies.

--
Manfred
