Re: Large tables (was: RAID 0 not as fast as

From: mark(at)mark(dot)mielke(dot)cc
To: Guy Thornley <guy(at)esphion(dot)com>
Cc: Markus Schaber <schabi(at)logix-tt(dot)com>, Luke Lonergan <llonergan(at)greenplum(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Large tables (was: RAID 0 not as fast as
Date: 2006-09-22 03:40:37
Message-ID: 20060922034037.GB2820@mark.mielke.cc
Lists: pgsql-performance

On Fri, Sep 22, 2006 at 02:52:09PM +1200, Guy Thornley wrote:
> > >> I thought that posix_fadvise() with POSIX_FADV_WILLNEED was exactly
> > >> meant for this purpose?
> > > This is a good idea - I wasn't aware that this was possible.
> > This possibility was the reason for me to propose it. :-)
> posix_fadvise() features in the TODO list already; I'm not sure if any work
> on it has been done for pg8.2.
>
> Anyway, I understand that POSIX_FADV_DONTNEED on a Linux 2.6 kernel allows
> pages to be discarded from memory earlier than usual. This is useful, since
> it means you can prevent your seqscan from nuking the OS cache.
>
> Of course you could argue the OS should be able to detect this, and prevent
> it occurring anyway. I don't know anything about Linux's behaviour in this
> area.

I recall either monitoring or participating in the discussion when this
call was added to Linux.

I don't believe the kernel can auto-detect that you do not need a page
any longer. It can only prioritize pages to keep when memory is fully
in use and a new page must be loaded. This is often some sort of LRU
scheme. If the page is truly useless, only the application can know.
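
To make the LRU point concrete, here is a toy simulation. It is only a
sketch: it is Python, not kernel code, the real Linux page cache is more
elaborate than a plain LRU list, and the names LRUCache and
hot_pages_surviving_scan are made up for illustration. It shows how a
one-pass scan larger than the cache pushes out pages that other users
still want.

```python
from collections import OrderedDict


class LRUCache:
    """Toy page cache that evicts the least recently used page when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # oldest entry first, newest last

    def touch(self, page):
        """Access a page; return True on a cache hit, False on a miss."""
        if page in self.pages:
            self.pages.move_to_end(page)  # mark as most recently used
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used
        self.pages[page] = True
        return False


def hot_pages_surviving_scan(capacity, hot, scan_length):
    """Load the hot pages, run a one-pass scan, count hot pages still cached."""
    cache = LRUCache(capacity)
    for page in hot:
        cache.touch(page)
    for block in range(scan_length):
        cache.touch(("scan", block))  # each scan page is touched exactly once
    return sum(1 for page in hot if page in cache.pages)
```

With a cache of 4 pages and 3 hot pages, a scan of 10 blocks leaves no
hot page cached, even though the scan will never look at its own pages
again. The cache has no way to know that; only the scanner does.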

I'm not convinced that PostgreSQL can know this. The case where it is
useful is a single process sequentially scanning a large file (much
larger than memory). As soon as more than one process is involved, or
the access is not a sequential scan, or the file is not large, this
call costs more than it gains. Just because I'm done with the page does
not mean that *you* are done with the page.
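
For reference, the one favourable case (a single reader making one pass
over a large file) might look roughly like the sketch below. This is an
illustration in Python, not PostgreSQL code; scan_and_drop and the 1 MiB
chunk size are assumptions of the sketch. os.posix_fadvise is the Python
wrapper around the C call and is only available on platforms that
provide it, hence the hasattr guard.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def scan_and_drop(path, chunk=CHUNK):
    """Read a file front to back, advising the kernel to drop each chunk.

    Returns the number of bytes read. Hypothetical helper for illustration.
    """
    can_advise = hasattr(os, "posix_fadvise")
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            data = os.read(fd, chunk)
            if not data:
                break  # end of file
            # ... process `data` here ...
            if can_advise:
                # Advice only: the kernel is free to ignore it. It says
                # "nobody needs these pages", which is only true if no
                # other process is reading the same range.
                os.posix_fadvise(fd, total, len(data),
                                 os.POSIX_FADV_DONTNEED)
            total += len(data)
    finally:
        os.close(fd)
    return total
```

Note that the advice applies to the file's pages, not to this process's
view of them, so a second backend scanning the same table would find the
pages gone and have to read them from disk again.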

I'd advise against using this call unless it can be shown that the page
will not be used in the future, or at least that the page is less useful
than all other pages currently in memory. That is what the call really
means: "There is no value in keeping this page in memory."

Perhaps certain PostgreSQL loads fit this pattern. None of my uses fit
this pattern, and I have trouble believing that a majority of PostgreSQL
loads fit this pattern.

Cheers,
mark

--
mark(at)mielke(dot)cc / markm(at)ncf(dot)ca / markm(at)nortel(dot)com
Neighbourhood Coder
Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
