Skip site navigation (1) Skip section navigation (2)

Re: I/O on select count(*)

From: James Mansion <james(at)mansionfamily(dot)plus(dot)com>
To: Alvaro Herrera <alvherre(at)commandprompt(dot)com>
Cc: Greg Smith <gsmith(at)gregsmith(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: I/O on select count(*)
Date: 2008-05-15 22:11:52
Message-ID: 482CB528.9000600@mansionfamily.plus.com (view raw or flat)
Thread:
Lists: pgsql-performance
Alvaro Herrera wrote:
> Hint bits are used to mark tuples as created and/or deleted by
> transactions that are know committed or aborted.  To determine the
> visibility of a tuple without such bits set, you need to consult pg_clog
> and possibly pg_subtrans, so it is an expensive check.  On the other
>   
So, how come there is this outstanding work to do, which will inevitably 
be done, and it
hasn't been done until it is 'just too late' to avoid getting in the way 
of the query?

The OP didn't suggest that he had just loaded the data.

Also - is it the case that this only affects the case where updated 
pages were spilled
during the transaction that changed them?  ie, if we commit a 
transaction and there
are changed rows still in the cache since their pages are not evicted 
yet, are the hint
bits set immediately so that page is written just once?  Seems this 
would be common
in most OLTP systems.

Heikki points out that the list might get big and need to be abandoned, 
but then you
fall back to scheduling a clog scan that can apply the bits, which does 
what you have
now, though hopefully in a way that fills slack disk IO rather than 
waiting for the
read.

Matthew says: 'it would be a list of changes since the last checkpoint' 
but I don't
see why you can't start writing hints to in-memory pages as soon as the 
transaction
ends.  You might fall behind, but I doubt it with modern CPU speeds.

I can't see why Pavan's suggestion to try to update as many of the bits 
as possible
when a dirty page is evicted would be contentious.

I do think this is something of interest to users, not just developers, 
since it
may influence the way updates are processed where it is reasonable to do
so in 'bite sized chunks' as a multipart workflow.



In response to

Responses

pgsql-performance by date

Next:From: Gregory StarkDate: 2008-05-15 22:30:41
Subject: Re: I/O on select count(*)
Previous:From: Robert LorDate: 2008-05-15 21:23:10
Subject: Re: I/O on select count(*)

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group