
Re: reducing random_page_cost from 4 to 2 to force index scan

From: Jim Nasby <jim(at)nasby(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jesper Krogh <jesper(at)krogh(dot)cc>, pgsql-performance(at)postgresql(dot)org
Subject: Re: reducing random_page_cost from 4 to 2 to force index scan
Date: 2011-05-17 19:19:02
Message-ID: B91EFACD-9560-45CB-8614-120D57B5A2F0@nasby.net
Lists: pgsql-performance
On May 16, 2011, at 10:46 AM, Tom Lane wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Mon, May 16, 2011 at 12:49 AM, Jesper Krogh <jesper(at)krogh(dot)cc> wrote:
>>> Ok, it may not work as well with indexes, since having 1% in cache may very
>>> well mean that 90% of all requested blocks are there; for tables it should
>>> be more straightforward.
> 
>> Tables can have hot spots, too.  Consider a table that holds calendar
>> reservations.  Reservations can be inserted, updated, deleted.  But
>> typically, the most recent data will be what is most actively
>> modified, and the older data will be relatively more (though not
>> completely) static, and less frequently accessed.  Such examples are
>> common in many real-world applications.
> 
> Yes.  I'm not convinced that measuring the fraction of a table or index
> that's in cache is really going to help us much.  Historical cache hit
> rates might be useful, but only to the extent that the incoming query
> has a similar access pattern to those in the (recent?) past.  It's not
> an easy problem.
> 
> I almost wonder if we should not try to measure this at all, but instead
> let the DBA set a per-table or per-index number to use, analogous to the
> override we added recently for column n-distinct statistics ...

I think the challenge there would be how to define the scope of the hot-spot. Is it the last X pages? Last X serial values? Something like correlation?
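
For what it's worth, something along these lines gives a rough picture of how much of one relation is currently cached, though pg_buffercache only sees shared_buffers, not the OS page cache, and 'reservations' is just borrowed from Robert's calendar example:

    -- Requires the pg_buffercache contrib module.
    -- Rough fraction of one table's blocks sitting in shared_buffers.
    SELECT c.relname,
           count(b.bufferid) AS buffered_blocks,
           pg_relation_size(c.oid::regclass)
             / current_setting('block_size')::int AS total_blocks,
           round(100.0 * count(b.bufferid) /
                 greatest(pg_relation_size(c.oid::regclass)
                          / current_setting('block_size')::int, 1), 1) AS pct_cached
    FROM pg_class c
    LEFT JOIN pg_buffercache b
           ON b.relfilenode = pg_relation_filenode(c.oid::regclass)
          AND b.reldatabase = (SELECT oid FROM pg_database
                               WHERE datname = current_database())
    WHERE c.relname = 'reservations'   -- hypothetical table name
    GROUP BY c.relname, c.oid;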

Hmm... it would be interesting if we had average relation access times for each stats bucket on a per-column basis; that would give the planner a better idea of how much I/O overhead there would be for a given WHERE clause.
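
For reference, the per-column override Tom mentions, and the closest per-object knobs we have today, look something like this (a sketch; the table, column and tablespace names are made up, and there is no per-relation cache-fraction setting yet):

    -- The existing per-column statistics override Tom is referring to:
    ALTER TABLE reservations ALTER COLUMN customer_id SET (n_distinct = 50000);

    -- Closest current analogue for I/O costs: per-tablespace overrides
    -- (there is no per-table or per-index page-cost setting):
    ALTER TABLESPACE fast_ssd SET (random_page_cost = 2.0, seq_page_cost = 1.0);

    -- And the blunt instrument from the subject line, scoped to one session:
    SET random_page_cost = 2;
    EXPLAIN SELECT * FROM reservations WHERE reserved_on >= current_date;
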
--
Jim C. Nasby, Database Architect                   jim(at)nasby(dot)net
512.569.9461 (cell)                         http://jim.nasby.net


