Re: [PATCHES] [HACKERS] ARC Memory Usage analysis

From:	Thomas F(dot)O'Connell <tfo(at)sitening(dot)com>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	PgSQL - Performance <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: [PATCHES] [HACKERS] ARC Memory Usage analysis
Date:	2004-10-27 00:09:22
Message-ID:	74778BDA-27AC-11D9-B369-000D93AE0944@sitening.com
Lists:	pgsql-hackers pgsql-patches pgsql-performance

Simon,

As a postgres DBA, I find your comments about how not to use
effective_cache_size instructive, but I'm still not sure how I should
arrive at a target value for it.

On most of the machines on which I admin postgres, I generally set
shared_buffers to 10,000 (using what seems to have been the recent
conventional wisdom of the lesser of 10,000 or 10% of RAM). I haven't
really settled on an optimal value for effective_cache_size, and now
I'm again confused as to how I might even benchmark it.
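
The rule of thumb above is easy to get wrong because shared_buffers is counted in pages, not bytes. A minimal sketch of the arithmetic, assuming the default 8 kB block size; the function names are hypothetical and chosen only for illustration:

```python
PAGE_SIZE = 8192  # assumed default PostgreSQL block size, in bytes


def rule_of_thumb_shared_buffers(ram_bytes, cap_pages=10_000, ram_fraction=0.10):
    """Return shared_buffers in pages: the lesser of cap_pages or a fraction of RAM."""
    pages_from_ram = int(ram_bytes * ram_fraction) // PAGE_SIZE
    return min(cap_pages, pages_from_ram)


def pages_to_mib(pages):
    """Convert a page count to MiB, to sanity-check the setting."""
    return pages * PAGE_SIZE / (1024 * 1024)


if __name__ == "__main__":
    for ram_mib in (512, 2048):
        pages = rule_of_thumb_shared_buffers(ram_mib * 1024 ** 2)
        print(ram_mib, "MiB RAM ->", pages, "pages =", pages_to_mib(pages), "MiB")
```

With this rule the 10,000-page cap (about 78 MiB) takes over on any machine with more than roughly 800 MiB of RAM, which is why the setting ends up at 10,000 on most boxes.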

Here are the documents on which I've based my knowledge:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#effcache
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
http://www.ca.postgresql.org/docs/momjian/hw_performance/node8.html

From Bruce's document, I gather that effective_cache_size would assume
that either shared buffers or unused RAM were valid sources of cached
pages for the purposes of assessing plans.

As a result, I was intending to inflate the value of
effective_cache_size to closer to the amount of unused RAM on some of
the machines I admin (once I've verified that they all have a unified
buffer cache). Is that correct?

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Oct 26, 2004, at 3:49 AM, Simon Riggs wrote:

> On Mon, 2004-10-25 at 16:34, Jan Wieck wrote:
>> The problem is, with a too small directory ARC cannot guesstimate what
> >> might be in the kernel buffers. Nor can it guesstimate what recently was
> >> in the kernel buffers and got pushed out from there. That results in a
>> way too small B1 list, and therefore we don't get B1 hits when in fact
>> the data was found in memory. B1 hits is what increases the T1target,
>> and since we are missing them with a too small directory size, our
> >> implementation of ARC is probably using a T2 size larger than the
>> working set. That is not optimal.
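
For readers who do not have the T1/T2/B1/B2 vocabulary to hand, here is a minimal sketch of the ARC bookkeeping being discussed. It follows the generally published description of the algorithm, not the PostgreSQL buffer manager; the class name, attribute names, and the use of integer arithmetic for the adaptation step are assumptions made for illustration. `p` plays the role of "T1target" in the quoted text: a hit in the B1 ghost list raises it, a hit in B2 lowers it.

```python
from collections import OrderedDict


class ARCSketch:
    """Textbook-style ARC directory over keys only (no page contents)."""

    def __init__(self, c):
        self.c = c                # number of cached pages (|T1| + |T2| <= c)
        self.p = 0                # target size of T1 ("T1target" in the thread)
        self.t1 = OrderedDict()   # cached, seen once recently (recency)
        self.t2 = OrderedDict()   # cached, seen at least twice (frequency)
        self.b1 = OrderedDict()   # ghosts: keys recently evicted from T1
        self.b2 = OrderedDict()   # ghosts: keys recently evicted from T2
        self.ghost_hits = {"B1": 0, "B2": 0}

    def _replace(self, in_b2):
        """Evict one cached key into its ghost list, steered by the target p."""
        from_t1 = self.t1 and (len(self.t1) > self.p or
                               (in_b2 and len(self.t1) == self.p))
        if from_t1 or not self.t2:
            key, _ = self.t1.popitem(last=False)
            self.b1[key] = None
        else:
            key, _ = self.t2.popitem(last=False)
            self.b2[key] = None

    def access(self, key):
        """Record one page request; return True only on a real cache hit."""
        if key in self.t1:                      # second touch: promote to T2
            del self.t1[key]
            self.t2[key] = None
            return True
        if key in self.t2:                      # refresh position in T2
            self.t2.move_to_end(key)
            return True
        if key in self.b1:                      # B1 ghost hit: grow T1 target
            self.ghost_hits["B1"] += 1
            delta = max(len(self.b2) // len(self.b1), 1)
            self.p = min(self.c, self.p + delta)
            self._replace(in_b2=False)
            del self.b1[key]
            self.t2[key] = None
            return False
        if key in self.b2:                      # B2 ghost hit: shrink T1 target
            self.ghost_hits["B2"] += 1
            delta = max(len(self.b1) // len(self.b2), 1)
            self.p = max(0, self.p - delta)
            self._replace(in_b2=True)
            del self.b2[key]
            self.t2[key] = None
            return False
        # Complete miss: make room in the directory, then insert into T1.
        l1 = len(self.t1) + len(self.b1)
        total = l1 + len(self.t2) + len(self.b2)
        if l1 == self.c:
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(in_b2=False)
            else:
                self.t1.popitem(last=False)     # dropped without a ghost entry
        elif total >= self.c:
            if total == 2 * self.c:
                self.b2.popitem(last=False)
            self._replace(in_b2=False)
        self.t1[key] = None
        return False
```

Replaying the requests a, a, b, c, b against `ARCSketch(2)` yields one B1 ghost hit and moves `p` from 0 to 1. The point made in the quoted paragraph is visible in the structure: B1 can only report a hit for keys it still remembers, so a directory that is too small forgets them early and `p` never grows.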
>
> I think I have seen that the T1 list shrinks "too much", but need more
> tests...with some good test results
>
> The effectiveness of ARC relies upon the balance between the often
> conflicting requirements of "recency" and "frequency". It seems
> possible, even likely, that pgsql's version of ARC may need some subtle
> changes to rebalance it - if we are unlucky enough to find cases where
> it genuinely is out of balance. Many performance tests are required,
> together with a few ideas on extra parameters to include....hence my
> support of Jan's ideas.
>
> That's also why I called the B1+B2 hit ratio "turbulence" because it
> relates to how much oscillation is happening between T1 and T2. In
> physical systems, we expect the oscillations to be damped, but there is
> no guarantee that we have a nearly critically damped oscillator. (Note
> that the absence of turbulence doesn't imply that T1+T2 is optimally
> sized, just that it is balanced).
>
> [...and although the discussion has wandered away from my original
> patch...would anybody like to commit, or decline the patch?]
>
> >> If we would replace the dynamic T1 buffers with a max_backends*2 area of
> >> shared buffers, use a C value representing the effective cache size and
> >> limit the T1target on the lower bound to effective cache size - shared
> >> buffers, then we basically moved the T1 cache into the OS buffers.
>
> Limiting the minimum size of T1len to be 2* maxbackends sounds like an
> easy way to prevent overbalancing of T2, but I would like to follow up
> on ways to have T1 naturally stay larger. I'll do a patch with this idea
> in, for testing. I'll call this "T1 minimum size" so we can discuss it.
>
> Any other patches are welcome...
>
> It could be that B1 is too small and so we could use a larger value of C
> to keep track of more blocks. I think what is being suggested is two
> GUCs: shared_buffers (as is), plus another one, larger, which would
> allow us to track what is in shared_buffers and what is in OS cache.
>
> I have comments on "effective cache size" below....
>
> On Mon, 2004-10-25 at 17:03, Tom Lane wrote:
>> Jan Wieck <JanWieck(at)Yahoo(dot)com> writes:
> >>> This all only holds water, if the OS is allowed to swap out shared
> >>> memory. And that was my initial question, how likely is it to find this
> >>> to be true these days?
>>
> >> I think it's more likely than not that the OS will consider shared
> >> memory to be potentially swappable. On some platforms there is a shmctl
> >> call you can make to lock your shmem in memory, but (a) we don't use it
> >> and (b) it may well require privileges we haven't got anyway.
>
> Are you saying we shouldn't, or we don't yet? I simply assumed that we
> did use that function - surely it must be at least an option? RHEL
> supports this at least....
>
> It may well be that we don't have those privileges, in which case we
> turn off the option. Often, we (or I?) will want to install a dedicated
> server, so we should have all the permissions we need, in which case...
>
> >> This has always been one of the arguments against making shared_buffers
> >> really large, of course --- if the buffers aren't all heavily used, and
> >> the OS decides to swap them to disk, you are worse off than you would
> >> have been with a smaller shared_buffers setting.
>
> Not really, just an argument against making them *too* large. Large
> *and* utilised is OK, so we need ways of judging optimal sizing.
>
>> However, I'm still really nervous about the idea of using
>> effective_cache_size to control the ARC algorithm. That number is
>> usually entirely bogus. Right now it is only a second-order influence
>> on certain planner estimates, and I am afraid to rely on it any more
>> heavily than that.
>
> ...ah yes, effective_cache_size.
>
> The manual describes effective_cache_size as if it had something to do
> with the OS, and some of this discussion has picked up on that.
>
> effective_cache_size is used in only two places in the code (both in the
> planner), as an estimate for calculating the cost of a) nonsequential
> access and b) index access, mainly as a way of avoiding overestimates of
> access costs for small tables.
>
> There is absolutely no implication in the code that effective_cache_size
> measures anything in the OS; what it gives is an estimate of the number
> of blocks that will be available from *somewhere* in memory (i.e. in
> shared_buffers OR OS cache) for one particular table (the one currently
> being considered by the planner).
>
> Crucially, the "size" referred to is the size of the *estimate*, not the
> size of the OS cache (nor the size of the OS cache + shared_buffers). So
> setting effective_cache_size = total memory available or setting
> effective_cache_size = total memory - shared_buffers are both wildly
> irrelevant things to do, as is any assumption that directly links memory
> size to that parameter. So talking about "effective_cache_size" as if it
> were the OS cache isn't the right thing to do.
>
> ...It could be that we use a very high % of physical memory as
> shared_buffers - in which case the effective_cache_size would represent
> the contents of shared_buffers.
>
> Note also that the planner assumes that all tables are equally likely to
> be in cache. Increasing effective_cache_size in postgresql.conf seems
> destined to give the wrong answer in planning unless you absolutely
> understand what it does.
>
> I will submit a patch to correct the description in the manual.
>
> Further comments:
> The two estimates appear to use effective_cache_size differently:
> a) assumes that a table of size effective_cache_size will be 50% in
> cache
> b) assumes that effective_cache_size blocks are available, so for a
> table of size == effective_cache_size, then it will be 100% available
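
To make the difference between the two readings concrete, here is a small sketch of two "fraction of the table assumed cached" curves. Only the endpoints come from the quoted description: under (a) a table exactly the size of effective_cache_size counts as 50% cached, under (b) as 100%. The curve shapes away from that point, and the function names, are assumptions for illustration and are not taken from the planner source.

```python
def cached_fraction_a(table_pages, ecs_pages):
    """Hypothetical curve for (a): a table of size ecs_pages is 50% cached."""
    if table_pages <= 0 or ecs_pages <= 0:
        return 0.0
    rel = table_pages / ecs_pages
    if rel >= 1.0:
        return 0.5 / rel            # larger tables: cached share keeps falling
    return 1.0 - 0.5 * rel * rel    # smaller tables: approaches fully cached


def cached_fraction_b(table_pages, ecs_pages):
    """Hypothetical curve for (b): ecs_pages blocks are simply available."""
    if table_pages <= 0 or ecs_pages <= 0:
        return 0.0
    return min(1.0, ecs_pages / table_pages)


if __name__ == "__main__":
    ecs = 1000
    for table in (250, 1000, 4000):
        print(table, cached_fraction_a(table, ecs), cached_fraction_b(table, ecs))
```

Whatever the exact formulas are, the same setting implies a different cache hit assumption in each estimate, so a value tuned to make one estimate realistic will skew the other.
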
>
> IMHO the GUC should be renamed "estimated_cached_blocks", with the old
> name deprecated to force people to re-read the manual description of
> what effective_cache_size means and then set accordingly.....all of that
> in 8.0....
>
> --
> Best Regards, Simon Riggs
>
>

In response to

Re: [PATCHES] ARC Memory Usage analysis at 2004-10-26 08:49:15 from Simon Riggs

Responses

Re: [PERFORM] [PATCHES] ARC Memory Usage analysis at 2004-10-27 00:39:59 from Josh Berkus
Re: [PATCHES] [HACKERS] ARC Memory Usage analysis at 2004-11-01 01:11:45 from Gaetano Mendola
