Re: I/O on select count(*)

From: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
To: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "James Mansion" <james(at)mansionfamily(dot)plus(dot)com>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: I/O on select count(*)
Date: 2008-05-15 22:38:27
Message-ID: 482C753E.EE98.0025.0@wicourts.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

>>> On Thu, May 15, 2008 at 5:11 PM, in message
<482CB528(dot)9000600(at)mansionfamily(dot)plus(dot)com>, James Mansion
<james(at)mansionfamily(dot)plus(dot)com> wrote:
> Alvaro Herrera wrote:
>> Hint bits are used to mark tuples as created and/or deleted by
>> transactions that are know committed or aborted. To determine the
>> visibility of a tuple without such bits set, you need to consult
pg_clog
>> and possibly pg_subtrans, so it is an expensive check. On the
other
>>
> So, how come there is this outstanding work to do, which will
inevitably
> be done, and it
> hasn't been done until it is 'just too late' to avoid getting in the
way
> of the query?

There has been discussion from time to time about setting the hint
bits for tuple inserts which occur within the same database
transaction as the creation of the table into which they're being
inserted. That would allow people to cover many of the bulk load
situations. I don't see it on the task list. (I would also argue
that there is little information lost, even from a forensic
perspective, to writing such rows as "frozen".) Is this idea done,
dead, or is someone working on it?

If we could set hint bits on dirty buffer pages after the commit, we'd
cover the OLTP situation. In many situations, there is a bigger OS
cache than PostgreSQL shared memory, and an attempt to set the bits
soon after the commit would coalesce the two writes into one physical
write using RAM-based access, which would be almost as good. I don't
know if it's feasible to try to do that after the pages have moved
from the PostgreSQL cache to the OS cache, but it would likely be a
performance win.

If we are going to burden any requester process with the job of
setting the hint bits, it would typically be better to burden the one
doing the data modification rather than some random thread later
trying to read data from the table. Of course, getting work off the
requester processes onto some background worker process is generally
even better.

-Kevin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message kevin kempter 2008-05-16 06:31:08 Join runs for > 10 hours and then fills up >1.3TB of disk space
Previous Message Gregory Stark 2008-05-15 22:30:41 Re: I/O on select count(*)