Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Benedikt Grundmann <bgrundmann(at)janestreet(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: Set effective_cache_size to greater of .conf value, shared_buffers
Date: 2013-01-10 18:44:59
Message-ID: CAMkU=1xRbDG81mpjOPMFinaJAjoigFHUvkN2HN+mAaaiLLLQgw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jan 9, 2013 at 12:38 AM, Benedikt Grundmann
<bgrundmann(at)janestreet(dot)com> wrote:

> For what it is worth even if it is a dedicated database box 75% might be way
> too high. I remember investigating bad performance on our biggest database
> server, that in the end turned out to be a too high setting of
> effective_cache_size. From reading the code back then my rationale for it
> being to high was that the code that makes use of the effective_cache_size
> tries very hard to account for what the current query would do to the cache
> but doesn't take into account how many queries (on separate datasets!) are
> currently begin executed (and competing for the same cache). On that box we
> often have 100+ active connections and many looking at different big
> datasets.

I think that most busy installations either run a lot of small queries
(for which effective_cache_size is irrelevant), or a few large
queries. Your case is probably somewhat rare, and so as far as
defaults go, it would be sacrificed for the common good. The docs do
anticipate the need to account for multiple concurrent queries to be
discounted in deciding how to set effective_cache_size, but perhaps
the wording could be improved.

Out of curiosity, what did your queries look like after you lowered
effective_cache_size? Were there a lot of sequential scans, or did it
just choose different indexes than it had before? If a lot of
sequential scans, were they mostly on just a few tables that each had
many sequential scans going on simultaneously, or was it 100+
different tables each with one sequential scan going on? (You said
different big datasets, but I don't know if these are in different
tables, or in common tables with a column to distinguish them.)

Cheers,

Jeff

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeff Davis 2013-01-10 19:04:03 Re: Enabling Checksums
Previous Message Simon Riggs 2013-01-10 18:05:23 Re: Save The Date: Cluster-Hackers meeting May 21st