Skip site navigation (1) Skip section navigation (2)

Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

From: Thomas F(dot)O'Connell <tfo(at)sitening(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: PgSQL - Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [PATCHES] [HACKERS] ARC Memory Usage analysis
Date: 2004-10-27 00:09:22
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-hackerspgsql-patchespgsql-performance

As a postgres DBA, I find your comments about how not to use 
effective_cache_size instructive, but I'm still not sure how I should 
arrive at a target value for it.

On most of the machines on which I admin postgres, I generally set 
shared_buffers to 10,000 (using what seems to have been the recent 
conventional wisdom of the lesser of 10,000 or 10% of RAM). I haven't 
really settled on an optimal value for effective_cache_size, and now 
I'm again confused as to how I might even benchmark it.

Here are the documents on which I've based my knowledge:

 From Bruce's document, I gather that effective_cache_size would assume 
that either shared buffers or unused RAM were valid sources of cached 
pages for the purposes of assessing plans.

As a result, I was intending to inflate the value of 
effective_cache_size to closer to the amount of unused RAM on some of 
the machines I admin (once I've verified that they all have a unified 
buffer cache). Is that correct?


Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320

On Oct 26, 2004, at 3:49 AM, Simon Riggs wrote:

> On Mon, 2004-10-25 at 16:34, Jan Wieck wrote:
>> The problem is, with a too small directory ARC cannot guesstimate what
>> might be in the kernel buffers. Nor can it guesstimate what recently 
>> was
>> in the kernel buffers and got pushed out from there. That results in a
>> way too small B1 list, and therefore we don't get B1 hits when in fact
>> the data was found in memory. B1 hits is what increases the T1target,
>> and since we are missing them with a too small directory size, our
>> implementation of ARC is propably using a T2 size larger than the
>> working set. That is not optimal.
> I think I have seen that the T1 list shrinks "too much", but need more
> tests...with some good test results
> The effectiveness of ARC relies upon the balance between the often
> conflicting requirements of "recency" and "frequency". It seems
> possible, even likely, that pgsql's version of ARC may need some subtle
> changes to rebalance it - if we are unlikely enough to find cases where
> it genuinely is out of balance. Many performance tests are required,
> together with a few ideas on extra parameters to include....hence my
> support of Jan's ideas.
> That's also why I called the B1+B2 hit ratio "turbulence" because it
> relates to how much oscillation is happening between T1 and T2. In
> physical systems, we expect the oscillations to be damped, but there is
> no guarantee that we have a nearly critically damped oscillator. (Note
> that the absence of turbulence doesn't imply that T1+T2 is optimally
> sized, just that is balanced).
> [...and all though the discussion has wandered away from my original
> patch...would anybody like to commit, or decline the patch?]
>> If we would replace the dynamic T1 buffers with a max_backends*2 area 
>> of
>> shared buffers, use a C value representing the effective cache size 
>> and
>> limit the T1target on the lower bound to effective cache size - shared
>> buffers, then we basically moved the T1 cache into the OS buffers.
> Limiting the minimum size of T1len to be 2* maxbackends sounds like an
> easy way to prevent overbalancing of T2, but I would like to follow up
> on ways to have T1 naturally stay larger. I'll do a patch with this 
> idea
> in, for testing. I'll call this "T1 minimum size" so we can discuss it.
> Any other patches are welcome...
> It could be that B1 is too small and so we could use a larger value of 
> C
> to keep track of more blocks. I think what is being suggested is two
> GUCs: shared_buffers (as is), plus another one, larger, which would
> allow us to track what is in shared_buffers and what is in OS cache.
> I have comments on "effective cache size" below....
> On Mon, 2004-10-25 at 17:03, Tom Lane wrote:
>> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
>>> This all only holds water, if the OS is allowed to swap out shared
>>> memory. And that was my initial question, how likely is it to find 
>>> this
>>> to be true these days?
>> I think it's more likely that not that the OS will consider shared
>> memory to be potentially swappable.  On some platforms there is a 
>> shmctl
>> call you can make to lock your shmem in memory, but (a) we don't use 
>> it
>> and (b) it may well require privileges we haven't got anyway.
> Are you saying we shouldn't, or we don't yet? I simply assumed that we
> did use that function - surely it must be at least an option? RHEL
> supports this at least....
> It may well be that we don't have those privileges, in which case we
> turn off the option. Often, we (or I?) will want to install a dedicated
> server, so we should have all the permissions we need, in which case...
>> This has always been one of the arguments against making 
>> shared_buffers
>> really large, of course --- if the buffers aren't all heavily used, 
>> and
>> the OS decides to swap them to disk, you are worse off than you would
>> have been with a smaller shared_buffers setting.
> Not really, just an argument against making them *too* large. Large
> *and* utilised is OK, so we need ways of judging optimal sizing.
>> However, I'm still really nervous about the idea of using
>> effective_cache_size to control the ARC algorithm.  That number is
>> usually entirely bogus.  Right now it is only a second-order influence
>> on certain planner estimates, and I am afraid to rely on it any more
>> heavily than that.
> ...ah yes, effective_cache_size.
> The manual describes effective_cache_size as if it had something to do
> with the OS, and some of this discussion has picked up on that.
> effective_cache_size is used in only two places in the code (both in 
> the
> planner), as an estimate for calculating the cost of a) nonsequential
> access and b) index access, mainly as a way of avoiding overestimates 
> of
> access costs for small tables.
> There is absolutely no implication in the code that 
> effective_cache_size
> measures anything in the OS; what it gives is an estimate of the number
> of blocks that will be available from *somewhere* in memory (i.e. in
> shared_buffers OR OS cache) for one particular table (the one currently
> being considered by the planner).
> Crucially, the "size" referred to is the size of the *estimate*, not 
> the
> size of the OS cache (nor the size of the OS cache + shared_buffers). 
> So
> setting effective_cache_size = total memory available or setting
> effective_cache_size = total memory - shared_buffers are both wildly
> irrelevant things to do, or any assumption that directly links memory
> size to that parameter. So talking about "effective_cache_size" as if 
> it
> were the OS cache isn't the right thing to do.
> ...It could be that we use a very high % of physical memory as
> shared_buffers - in which case the effective_cache_size would represent
> the contents of shared_buffers.
> Note also that the planner assumes that all tables are equally likely 
> to
> be in cache. Increasing effective_cache_size in postgresql.conf seems
> destined to give the wrong answer in planning unless you absolutely
> understand what it does.
> I will submit a patch to correct the description in the manual.
> Further comments:
> The two estimates appear to use effective_cache_size differently:
> a) assumes that a table of size effective_cache_size will be 50% in
> cache
> b) assumes that effective_cache_size blocks are available, so for a
> table of size == effective_cache_size, then it will be 100% available
> IMHO the GUC should be renamed "estimated_cached_blocks", with the old
> name deprecated to force people to re-read the manual description of
> what effective_cache_size means and then set accordingly.....all of 
> that
> in 8.0....
> -- 
> Best Regards, Simon Riggs
> ---------------------------(end of 
> broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to 
> majordomo(at)postgresql(dot)org)

In response to


pgsql-performance by date

Next:From: Josh BerkusDate: 2004-10-27 00:39:59
Subject: Re: [PERFORM] [PATCHES] ARC Memory Usage analysis
Previous:From: Curtis ZinzilietaDate: 2004-10-26 23:04:35
Subject: Re: can't handle large number of INSERT/UPDATEs

pgsql-hackers by date

Next:From: Josh BerkusDate: 2004-10-27 00:39:59
Subject: Re: [PERFORM] [PATCHES] ARC Memory Usage analysis
Previous:From: Josh BerkusDate: 2004-10-26 23:39:46
Subject: Re: DBT-3 Query 2 EXPLAIN ANALYZE differences

pgsql-patches by date

Next:From: Josh BerkusDate: 2004-10-27 00:39:59
Subject: Re: [PERFORM] [PATCHES] ARC Memory Usage analysis
Previous:From: Reini UrbanDate: 2004-10-26 22:07:47
Subject: Re: rmtree() failure on Windows

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group