Skip site navigation (1) Skip section navigation (2)

Re: Choice of bitmap scan over index scan

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Mathieu De Zutter" <mathieu(at)dezutter(dot)org>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Choice of bitmap scan over index scan
Date: 2010-01-11 15:11:12
Message-ID: 4B4AEB30020000250002E16C@gw.wicourts.gov (view raw or flat)
Thread:
Lists: pgsql-performance
Mathieu De Zutter <mathieu(at)dezutter(dot)org> wrote:
 
> So if this query usually does *not* hit the cache, it will be
> probably faster if I leave it like that? While testing a query I
> execute it that much that it's always getting into the cache.
> However, since other applications run on the same server, I think
> that infrequently used data gets flushed after a while, even if
> the DB could fit in the RAM.
 
You definitely were hitting the cache almost exclusively in the
EXPLAIN ANALYZE results you sent.  If that's not typically what
happens, we'd be able to better analyze the situation with an
EXPLAIN ANALYZE of a more typical run.  That said, if you are doing
physical reads, reading backwards on the index is going to degrade
pretty quickly if you're using a normal rotating magnetic medium,
because the blocks are arranged on the disk in a format to support
fast reads in a forward direction.  Given that and other factors,
the bitmap scan will probably be much faster if you do wind up going
to the disk most of the time.
 
On the other hand, there's no reason to lie to the optimizer about
how much memory is on the machine.  You can't expect it to make sane
choices on the basis of misleading assumptions.  For starters, try
setting effective_cache_size to at least 1GB.  That doesn't reserve
any space, it just tells the optimizer what it can assume about how
much data can be cached, and a large setting will tend to encourage
more indexed access.
 
Given that when you focus on one piece of the database, the caching
effects are pretty dramatic, you really should reduce
random_page_cost to 2, even with the in-and-out-of-cache scenario
you describe.  These aren't "magic bullets" that solve all
performance problems, but you would be giving the optimizer a
fighting chance at costing plans in a way that the one with the
lowest calculated cost is actually the one which will run the
fastest.
 
Also, if the pressure on RAM is that high on this machine, it would
probably be cost-effective to add another 2GB of RAM, so that you
could be a bit more generous in your allocation of RAM to the
database.  It might make your problems queries an order of magnitude
or more faster with very little effort.  If a quick google search is
any indication, you can get that much RAM for less than $50 these
days, if you've got a slot open.
 
-Kevin

In response to

pgsql-performance by date

Next:From: Bob DusekDate: 2010-01-11 16:42:03
Subject: Re: performance config help
Previous:From: A. KretschmerDate: 2010-01-11 14:07:53
Subject: Re: performance config help

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group