Re: Very high effective_cache_size == worse performance?

From: David Kerr <dmk(at)mr-paradox(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very high effective_cache_size == worse performance?
Date: 2010-04-20 18:03:51
Message-ID: 20100420180351.GA51940@mr-paradox.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Apr 20, 2010 at 01:44:18PM -0400, Robert Haas wrote:
- On Tue, Apr 20, 2010 at 1:39 PM, David Kerr <dmk(at)mr-paradox(dot)net> wrote:
- > My initial thought was hardware issues so we got sar, vmstat, etc all running on the box and they didn't give
- > any indication that we had resource issues.
- >
- > So I decided to just make the 2 PG config files look the same. (the only change was dropping effective_cache_size
- > from 128GB to 2GB).
- >
- > Now the large box performs the same as the smaller box. (which is fine).
- >
- > incidentally, both tests were starting from a blank database.
- >
- > Is this expected?
-
- Lowering effective_cache_size tends to discourage the planner from
- using a nested-loop-with-inner-indexscan plan - that's it.
-
- What may be happening is that you may be loading data into some tables
- and then running a query against those tables before the autovacuum
- daemon has a chance to analyze them. I suspect that if you enable
- some logging you'll find that one of those queries is really, really
- slow, and that (by happy coincidence) discouraging it from using the
- index it thinks it should use happens to produce a better plan. What
- you should probably do is, for each table that you bulk load and then
- query, insert a manual ANALYZE between the two.
-
- ...Robert
-

that thought occured to me while I was testing this. I ran a vacuumdb -z
on my database during the load and it didn't impact performance at all.

Incidentally the code is written to work like this :

while (read X lines in file){
Process those lines.
write lines to DB.
}

So i would generally expect to get the benefits of the updated staticis
once the loop ended. no? (would prepared statements affect that possibly?)

Also, while I was debugging the problem, I did load a 2nd file into the DB
ontop of one that had been loaded. So the statistics almost certinaly should
have been decent at that point.

I did turn on log_min_duration_statement but that caused performance to be unbearable,
but i could turn it on again if it would help.

Dave

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Nikolas Everett 2010-04-20 18:12:15 Re: Very high effective_cache_size == worse performance?
Previous Message Robert Haas 2010-04-20 17:44:18 Re: Very high effective_cache_size == worse performance?