Re: [WIP] cache estimates, cache access cost

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Greg Smith" <greg(at)2ndQuadrant(dot)com>
Cc: <cedric(dot)villemain(dot)debian(at)gmail(dot)com>,<robertmhaas(at)gmail(dot)com>, <stark(at)mit(dot)edu>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] cache estimates, cache access cost
Date: 2011-06-20 16:14:17
Message-ID: 4DFF2B89020000250003E9B4@gw.wicourts.gov
Lists: pgsql-hackers

Greg Smith <greg(at)2ndQuadrant(dot)com> wrote:
> On 06/19/2011 06:15 PM, Kevin Grittner wrote:
>> I think the point is that if, on a fresh system, the first access
>> to a table is something which uses a table scan -- like select
>> count(*) -- that all indexed access would then tend to be
>> suppressed for that table. After all, for each individual query,
>> selfishly looking at its own needs in isolation, it likely
>> *would* be faster to use the cached heap data.
>
> If those accesses can compete with other activity, such that the
> data really does stay in the cache rather than being evicted, then
> what's wrong with that?

The problem is that if somehow the index *does* find its way into
cache, the queries might all run an order of magnitude faster by
using it. The *first* query to bite the bullet and read through the
index wouldn't, of course, since it would have all that random disk
access. But it's not hard to imagine an application mix where this
feature could cause a surprising ten-fold performance drop after
someone does a table scan, a drop which could persist indefinitely.
I'm not risking that in production without a clear mechanism to
automatically recover from that sort of cache skew.
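
To make the failure mode concrete, here is a toy sketch of a planner
that costs each query in isolation against the current cache state.
Everything in it is an assumption for illustration: the page counts,
the cost units, and the names (run, seq_cost, index_cost) are made up
and are not PostgreSQL settings or measurements. It only assumes that
the heap is already cached by an earlier table scan and the index is
not.

```python
# Toy model of cache skew. All constants are illustrative assumptions,
# not PostgreSQL defaults or measured values.
HEAP_PAGES = 1000        # pages read by a full scan of the table
INDEX_PAGES = 100        # pages touched by one index-driven query
CACHED_PAGE_COST = 1     # cost unit for a page already in cache
DISK_PAGE_COST = 400     # cost unit for a random page read from disk


def seq_cost():
    # The heap is assumed to stay cached after the initial table scan.
    return HEAP_PAGES * CACHED_PAGE_COST


def index_cost(index_cached):
    per_page = CACHED_PAGE_COST if index_cached else DISK_PAGE_COST
    return INDEX_PAGES * per_page


def run(n_queries, force_first_index=False):
    """Return (total_cost, plans) for n_queries identical queries.

    Each query picks the cheaper plan given what is cached right now.
    force_first_index makes the first query pay for the cold index.
    """
    index_cached = False          # state right after a table scan
    total = 0
    plans = []
    for i in range(n_queries):
        s = seq_cost()
        x = index_cost(index_cached)
        use_index = x < s or (force_first_index and i == 0)
        total += x if use_index else s
        if use_index:
            index_cached = True   # reading the index pulls it into cache
        plans.append("index" if use_index else "seq")
    return total, plans


greedy_total, greedy_plans = run(100)
warmed_total, warmed_plans = run(100, force_first_index=True)
```

With these made-up numbers the greedy run never touches the index and
pays 1000 units per query (100000 in total), while the run where the
first query bites the bullet pays 40000 once and 100 per query after
that (49900 in total). Nothing in the greedy loop ever changes its
mind, which is the "persist indefinitely" part.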

-Kevin
