Re: I/O on select count(*)

From: Decibel! <decibel(at)decibel(dot)org>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: I/O on select count(*)
Date: 2008-05-24 19:06:56
Message-ID: E18CFDAC-00FC-4107-ADD2-53AC09041BEE@decibel.org
Lists: pgsql-performance

On May 18, 2008, at 1:28 AM, Greg Smith wrote:
> I just collected all the good internals information included in
> this thread and popped it onto
> http://wiki.postgresql.org/wiki/Hint_Bits where I'll continue to hack
> away at the text until it's
> readable. Thanks to everyone who answered my questions here,
> that's good progress toward clearing up a very underdocumented area.
>
> I note a couple of potential TODO items not on the official list
> yet that came up during this discussion:
>
> -Smooth latency spikes when switching commit log pages by
> preallocating cleared pages before they are needed
>
> -Improve bulk loading by setting "frozen" hint bits for tuple
> inserts which occur within the same database transaction as the
> creation of the table into which they're being inserted
>
> Did I miss anything? I think everything brought up falls either
> into one of those two or the existing "Consider having the
> background writer update the transaction status hint bits..." TODO.

-Evaluate impact of improved caching of CLOG per Greenplum:

Per Luke Lonergan:
I'll find out if we can extract our code that did the work. It was
simple but scattered in a few routines. In concept it worked like this:

1 - Ignore if hint bits are unset, use them if set. This affects
heapam and vacuum I think.
2 - implement a cache for clog lookups based on the optimistic
assumption that the data was inserted in bulk. Put the cache one
call away from heapgetnext()

I forget the details of (2). As I recall, if we fall off of the
assumption, the penalty for long scans gets large-ish (maybe 2X), but
since when do people full table scan when their updates/inserts are
so scattered across TIDs? It's an obvious big win for DW work.
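
To make the idea in (1) and (2) concrete, here is a rough sketch in Python. It is not the Greenplum code, which is not shown in this thread. It assumes the simplest possible cache (a single remembered XID and its status), and it reads (1) as "trust a hint that is already set, never write one during the scan". The names ClogLookupCache, count_committed and fake_clog_lookup are made up for illustration.

```python
from typing import Callable, Iterable, Optional, Tuple

COMMITTED, ABORTED, IN_PROGRESS = "committed", "aborted", "in_progress"


class ClogLookupCache:
    """One-entry cache in front of a (hypothetical) commit-log lookup.

    Assumption: bulk-loaded rows share one inserting XID, so consecutive
    tuples in a scan usually ask about the same transaction.
    """

    def __init__(self, clog_lookup: Callable[[int], str]) -> None:
        self._clog_lookup = clog_lookup
        self._last_xid: Optional[int] = None
        self._last_status: Optional[str] = None
        self.hits = 0
        self.misses = 0

    def status(self, xid: int) -> str:
        if xid == self._last_xid:
            self.hits += 1
            return self._last_status
        self.misses += 1
        status = self._clog_lookup(xid)
        # Only final states are safe to remember; an in-progress
        # transaction can still commit or abort.
        if status != IN_PROGRESS:
            self._last_xid = xid
            self._last_status = status
        return status


def count_committed(tuples: Iterable[Tuple[int, Optional[str]]],
                    cache: ClogLookupCache) -> int:
    """Count tuples whose inserting transaction committed.

    Each tuple is (xmin, hint). A hint of None stands for "hint bits
    unset": the scan asks the cache and does not write the hint back.
    """
    count = 0
    for xmin, hint in tuples:
        status = hint if hint is not None else cache.status(xmin)
        if status == COMMITTED:
            count += 1
    return count


# Stand-in for the real commit log (illustrative data only).
CLOG = {100: COMMITTED, 101: ABORTED, 102: COMMITTED}


def fake_clog_lookup(xid: int) -> str:
    return CLOG.get(xid, IN_PROGRESS)


# Bulk load: every row was inserted by XID 100.
bulk_cache = ClogLookupCache(fake_clog_lookup)
bulk_count = count_committed([(100, None)] * 1000, bulk_cache)

# Scattered writes: neighbouring rows alternate between two XIDs.
scattered_cache = ClogLookupCache(fake_clog_lookup)
scattered_count = count_committed([(100, None), (102, None)] * 500,
                                  scattered_cache)

print(bulk_cache.misses, bulk_cache.hits)
print(scattered_cache.misses, scattered_cache.hits)
```

In the bulk case only the first tuple reaches the commit log. In the alternating case every tuple does, on top of paying for the cache check, which is the "fall off of the assumption" penalty described above. The 2X figure is a recollection from the quoted text, not something this toy measures.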

--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828
