Re: Memory usage - indexes

From: Greg Smith <greg(at)2ndquadrant(dot)com>
To: Tobias Brox <tobixen(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Memory usage - indexes
Date: 2010-09-24 22:00:41
Message-ID: 4C9D1F89.6070006@2ndquadrant.com
Lists: pgsql-performance

Tobias Brox wrote:
>> I have easily measurable improvements on client systems increasing
>> shared_buffers into the 4GB - 8GB range. Popular indexes move into there,
>> stay there, and only get written out at checkpoint time rather than all the
>> time.
>>
>
> Ours is at 12 GB, out of 70 GB total RAM.
>

Get a snapshot of what's in there using pg_buffercache, and then reduce
that setting to at or under 8GB. Everyone I've seen test in this area
says that performance starts to drop badly with shared_buffers greater
than somewhere between 8GB and 10GB, so 12GB is well into the range where
it's already degraded.
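For example, with the pg_buffercache contrib module installed in your
database, a summary query along these lines (adapted from the docs, adjust
the LIMIT to taste) shows which relations are occupying the cache:

  SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
    INNER JOIN pg_class c
      ON b.relfilenode = c.relfilenode
     AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                               WHERE datname = current_database()))
  GROUP BY c.relname
  ORDER BY 2 DESC
  LIMIT 10;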

> Huh ... does it mean Windows have better OS cache handling than Linux?
> To me it sounds insane to run a database under a buggy GUI ... but I
> suppose I should keep that to myself :-)
>

No, Windows has slow shared memory performance when it's used the way
PostgreSQL uses it, so there you push toward the OS cache instead as the
next best thing.

> Linear relationship between the time it takes to do index lookups vs
> the fatness of the index? That's not what I guessed in the first
> place ... but I suppose you're right.
>

If you're scanning 10% of a 10GB index, you can bet that's going to take
longer than scanning 10% of a 5GB index. So unless the bigger index
significantly improves the selectivity of the query--so that you are,
say, only scanning 2% of the 10GB index because indexing on two columns
let you eliminate many more candidate rows--you might as well use a
slimmer one instead.
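You can check how fat your indexes actually are straight from the
catalog; something like this lists the largest ones:

  SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
  FROM pg_class
  WHERE relkind = 'i'
  ORDER BY pg_relation_size(oid) DESC
  LIMIT 10;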

Overindexing--building indexes with more columns than are actually
selective--is a very popular source of PostgreSQL slowdowns. It's easy to
say "oh, I look this data up using columns a,b,c, so let's put an index
on (a,b,c)". But if an index on a alone is 1% selective, that composite
index is probably the wrong choice; just index a instead, so that you
have one lean, easy-to-maintain index there that's more likely to be in
RAM at all times. Let the CPU chew on filtering out which of that 1% of
matches also satisfy the (b,c) criteria instead.
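To make that concrete, a sketch using a hypothetical table t with columns
a, b, c:

  -- one lean index on the genuinely selective column
  CREATE INDEX t_a_idx ON t (a);

  -- the planner scans t_a_idx for a = 42, then filters the
  -- b and c conditions against the rows it fetched:
  SELECT * FROM t WHERE a = 42 AND b = 'x' AND c < 10;

That beats maintaining a three-column index on (a,b,c) that is several
times bigger while being no more selective in practice.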

Obviously that's rough guidance--you need to simulate to know for sure.
Ever drop an index in a transaction block just to see how a query plan
changes when it's not there anymore, then roll back so it never really
went away? Great fun for this sort of experiment; try it sometime.
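A minimal sketch of that experiment (index and query names here are
hypothetical):

  BEGIN;
  DROP INDEX t_a_b_c_idx;  -- gone, but only inside this transaction
  EXPLAIN ANALYZE SELECT * FROM t WHERE a = 42;
  ROLLBACK;                -- index is back as if nothing happened

Just be aware that DROP INDEX grabs an exclusive lock on the table until
you roll back, so don't leave that transaction open on a busy production
system.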

> Sometimes it takes more work to implement work already done by others
> than to reimplement the logics ... but anyway, I will have a look
> before I make more snapshot tables ;-)
>

You will be surprised at how exactly you are reimplementing that
particular project.

--
Greg Smith, 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services and Support www.2ndQuadrant.us
Author, "PostgreSQL 9.0 High Performance" Pre-ordering at:
https://www.packtpub.com/postgresql-9-0-high-performance/book
