Re: Large tables (was: RAID 0 not as fast as

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: mark(at)mark(dot)mielke(dot)cc, Guy Thornley <guy(at)esphion(dot)com>, Markus Schaber <schabi(at)logix-tt(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Large tables (was: RAID 0 not as fast as
Date: 2006-09-22 14:01:14
Message-ID: 20060922140113.GW28987@nasby.net
Lists: pgsql-performance

On Thu, Sep 21, 2006 at 08:46:41PM -0700, Luke Lonergan wrote:
> Mark,
>
> On 9/21/06 8:40 PM, "mark(at)mark(dot)mielke(dot)cc" <mark(at)mark(dot)mielke(dot)cc> wrote:
>
> > I'd advise against using this call unless it can be shown that the page
> > will not be used in the future, or at least, that the page is less useful
> > than all other pages currently in memory. This is what the call really means.
> > It means, "There is no value to keeping this page in memory".
>
> Yes, it's a bit subtle.
>
> I think the topic is similar to "cache bypass", used in cache capable vector
> processors (Cray, Convex, Multiflow, etc) in the 90's. When you are
> scanning through something larger than the cache, it should be marked
> "non-cacheable" and bypass caching altogether. This avoids a copy, and
> keeps the cache available for things that can benefit from it.
>
> WRT the PG buffer cache, the rule would have to be: "if the heap scan is
> going to be larger than "effective_cache_size", then issue the
> posix_fadvise(BLOCK_NOT_NEEDED) call". It doesn't sound very efficient to
> do this in block/extent increments though, and it would possibly mess with
> subsets of the block space that would be re-used for other queries.
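
For reference, the advice value POSIX actually defines for this is
POSIX_FADV_DONTNEED; BLOCK_NOT_NEEDED above reads as shorthand for it.
A minimal sketch of the per-chunk approach being discussed, assuming a
Unix system where Python's os.posix_fadvise is available. The function
name and the 1 MB chunk size are made up for illustration, and this is
not how PostgreSQL itself reads heap files:

```python
import os

def scan_without_caching(path, chunk_size=1 << 20):
    """Read a file sequentially and advise the kernel that each chunk
    will not be needed again. Returns the number of bytes read.

    Hypothetical helper: the name and chunk_size are illustrative only.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            data = os.read(fd, chunk_size)
            if not data:
                break
            total += len(data)
            # Advisory only: the kernel is free to ignore this hint.
            os.posix_fadvise(fd, offset, len(data), os.POSIX_FADV_DONTNEED)
            offset += len(data)
    finally:
        os.close(fd)
    return total
```

Note that the hint applies to the file's pages for everyone, not just
this reader, which is exactly the concern raised above about messing
with blocks that other queries would re-use.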

Another issue is that if you start two large seqscans on the same table
at about the same time, right now you should only be issuing one set of
reads for both requests, because the second scan will just pull the
blocks the first one read back out of cache. If we weren't caching, each
query would have to do its own physical reads (which would be horrid).

There's been talk of adding code that would have a seqscan detect if
another seqscan is happening on the table at the same time, and if it
is, to start its own scan at the block the other seqscan is currently
reading. That would probably ensure that we weren't reading from the
table in two different places, even if we weren't caching.
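
A rough sketch of that idea, as a toy model only: the ScanPositions
registry and sync_seqscan are hypothetical names, and nothing here
reflects actual PostgreSQL code. A new scan asks where the last scan on
the table was reading, starts there, and wraps around so it still
visits every block once:

```python
class ScanPositions:
    """Hypothetical shared registry: table name -> block most recently
    read by any sequential scan of that table."""

    def __init__(self):
        self._pos = {}

    def report(self, table, block):
        self._pos[table] = block

    def start_block(self, table):
        # With no scan recorded yet, start at the beginning.
        return self._pos.get(table, 0)


def sync_seqscan(table, nblocks, positions):
    """Yield every block number of the table exactly once, starting at
    the last reported scan position and wrapping around at the end."""
    if nblocks == 0:
        return
    start = positions.start_block(table) % nblocks
    for i in range(nblocks):
        block = (start + i) % nblocks
        positions.report(table, block)
        yield block
```

With a 10-block table, if one scan has read blocks 0 through 3 when a
second scan starts, the second one begins at block 3 and finishes with
blocks 0, 1, 2 after wrapping. The obvious cost is that rows no longer
come back in physical order.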
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
