Re: I/O on select count(*)

From: Matthew Wakeling <matthew(at)flymine(dot)org>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: I/O on select count(*)
Date: 2008-05-15 12:54:13
Message-ID: Pine.LNX.4.64.0805151337460.16756@aragorn.flymine.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, 15 May 2008, Luke Lonergan wrote:
> BTW ­ we¹ve removed HINT bit checking in Greenplum DB and improved the
> visibility caching which was enough to provide performance at the same level
> as with the HINT bit optimization, but avoids this whole ³write the data,
> write it to the log also, then write it again just for good measure²
> behavior.

This sounds like a good option. I believe I suggested this a few months
ago, however it was rejected because in the worst case (when the hints are
not cached), if you're doing an index scan, you can do twice the number of
seeks as before.

http://archives.postgresql.org/pgsql-performance/2007-12/msg00217.php

The hint data will be four bits per tuple plus overheads, so it could be
made very compact, and therefore likely to stay in the cache fairly well.
Each tuple fetched would have to be spaced really far apart in the
database table in order to exhibit the worst case, because fetching a page
of hint cache will cause 64kB or so of disc to appear in the disc's
read-ahead buffer, which will be equivalent to 128MB worth of database
table (assuming eight tuples per block and no overhead). As soon as you
access another tuple in the same 128MB bracket, you'll hit the disc
read-ahead buffer for the hints.

On balance, to me it still seems like a good option.

Matthew

--
Those who do not understand Unix are condemned to reinvent it, poorly.
-- Henry Spencer

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jan de Visser 2008-05-15 13:15:40 Re: I/O on select count(*)
Previous Message Heikki Linnakangas 2008-05-15 12:52:31 Re: I/O on select count(*)