Re: Very high effective_cache_size == worse performance?

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: David Kerr <dmk(at)mr-paradox(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very high effective_cache_size == worse performance?
Date: 2010-04-20 17:44:18
Message-ID: h2w603c8f071004201044tbc8ce8fenb846b7643a2b9376@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Apr 20, 2010 at 1:39 PM, David Kerr <dmk(at)mr-paradox(dot)net> wrote:
> Howdy all,
>
> I've got a huge server running just postgres. It's got 48 cores and 256GB of ram. Redhat 5.4, Postgres 8.3.9.
> 64bit OS. No users currently.
>
> I've got a J2EE app that loads data into the DB, it's got logic behind it so it's not a simple bulk load, so
> i don't think we can use copy.
>
> Based on the tuning guides, it set my effective_cache_size to 128GB (1/2 the available memory) on the box.
>
> When I ran my load, it took aproximately 15 hours to do load 20 million records. I thought this was odd because
> on a much smaller machine I was able to do that same amount of records in 6 hours.
>
> My initial thought was hardware issues so we got sar, vmstat, etc all running on the box and they didn't give
> any indication that we had resource issues.
>
> So I decided to just make the 2 PG config files look the same. (the only change was dropping effective_cache_size
> from 128GB to 2GB).
>
> Now the large box performs the same as the smaller box. (which is fine).
>
> incidentally, both tests were starting from a blank database.
>
> Is this expected?

Lowering effective_cache_size tends to discourage the planner from
using a nested-loop-with-inner-indexscan plan - that's it.

What may be happening is that you may be loading data into some tables
and then running a query against those tables before the autovacuum
daemon has a chance to analyze them. I suspect that if you enable
some logging you'll find that one of those queries is really, really
slow, and that (by happy coincidence) discouraging it from using the
index it thinks it should use happens to produce a better plan. What
you should probably do is, for each table that you bulk load and then
query, insert a manual ANALYZE between the two.

...Robert

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Kerr 2010-04-20 18:03:51 Re: Very high effective_cache_size == worse performance?
Previous Message Joshua D. Drake 2010-04-20 17:41:36 Re: Very high effective_cache_size == worse performance?