Re: Hint Bits and Write I/O

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hint Bits and Write I/O
Date: 2008-05-27 23:32:45
Message-ID: 15901.1211931165@sss.pgh.pa.us
Lists: pgsql-hackers pgsql-patches

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> My proposal is to have this as a two-stage process. When we set the hint
> on a tuple in a clean buffer we mark it BM_DIRTY_HINTONLY, if not
> already dirty. If we set a hint on a buffer that is BM_DIRTY_HINTONLY
> then we mark it BM_DIRTY.

I wonder if it is worth actually counting the number of newly set hint
bits, rather than just having a counter that saturates at two. We could
steal a byte from usage_count without making the buffer headers bigger.
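
A minimal sketch of what such a counter could look like, assuming
usage_count is a 16-bit field that can be narrowed to one byte. The
struct and function names here are hypothetical and simplified; this is
not the actual buffer header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified buffer header (NOT the real BufferDesc).
 * The two one-byte fields occupy the space a single 16-bit
 * usage_count would have taken, so the header does not grow.
 */
typedef struct SketchBufferDesc
{
    uint16_t flags;        /* buffer state flags */
    uint8_t  usage_count;  /* clock-sweep usage counter, narrowed to a byte */
    uint8_t  hint_count;   /* hint bits set since last write; saturates */
} SketchBufferDesc;

/* Count one newly set hint bit, saturating at 255 instead of wrapping. */
static void
sketch_count_hint(SketchBufferDesc *buf)
{
    assert(buf != NULL);
    if (buf->hint_count < UINT8_MAX)
        buf->hint_count++;
}
```

A one-byte counter distinguishes "a couple of hints" from "hundreds of
hints", which a counter that saturates at two cannot do.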

> If the bgwriter has time, it will write out BM_DIRTY_HINTONLY buffers,
> though on a consistently busy server this should not occur.

What do you mean by "if it has time"? How would it know that?

> This won't change the behaviour of first-read-after-copy. To improve
> that behaviour, I suggest that we only move from BM_DIRTY_HINTONLY to
> BM_DIRTY when we are setting the hint for a new xid. If we are just
> setting the same xid over-and-over again then we should avoid setting
> the page dirty. So when data has been loaded via COPY, we will just
> check the status of the xid once, then scan the whole page using the
> single-item transaction cache.

This doesn't make any sense to me. What is a "new xid"? And what is
"setting the same xid over and over"? If a page is full of occurrences
of the same xid, that doesn't really mean that it's less useful to
correctly hint each occurrence.

The whole proposal seems a bit overly complicated. What we talked about
at PGCon was simply not setting the dirty bit when setting a hint bit.
There's a certain amount of self-optimization there: if a page
continually receives hint bit updates, that also means it is getting
pinned and hence its usage_count stays high, thus it will tend to stay
in shared buffers until something happens to make it really dirty.
(Although that argument might not hold water for a bulk seqscan: you'll
have hinted all the tuples and then very possibly throw the page away
immediately. So counting the hints and eventually deciding we did
enough to justify dirtying the page might be worth doing.)
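
As a rough sketch of the "count the hints and eventually dirty the page"
idea: hint-bit updates do not dirty the buffer by themselves, but once
enough of them accumulate the buffer is marked dirty. Everything named
below (the struct, the flag value, the threshold of 16) is a
hypothetical placeholder, not existing backend code, and the right
threshold would have to come from testing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BM_DIRTY        (1 << 0)  /* hypothetical flag value */
#define SKETCH_HINT_THRESHOLD  16        /* hypothetical tuning knob */

/* Hypothetical, simplified buffer header (NOT the real BufferDesc). */
typedef struct HintBuf
{
    uint16_t flags;
    uint8_t  usage_count;
    uint8_t  hint_count;   /* hint bits set since the last write */
} HintBuf;

/*
 * Record one newly set hint bit.  The buffer is only marked dirty once
 * enough hints have accumulated to justify the write.  Returns true if
 * the buffer is dirty on exit.
 */
static bool
hintbuf_note_hint(HintBuf *buf)
{
    assert(buf != NULL);

    /* Already really dirty: it will be written anyway, nothing to decide. */
    if (buf->flags & SKETCH_BM_DIRTY)
        return true;

    if (buf->hint_count < UINT8_MAX)
        buf->hint_count++;

    if (buf->hint_count >= SKETCH_HINT_THRESHOLD)
        buf->flags |= SKETCH_BM_DIRTY;

    return (buf->flags & SKETCH_BM_DIRTY) != 0;
}

/* After the page has been written out, start counting from zero again. */
static void
hintbuf_written(HintBuf *buf)
{
    assert(buf != NULL);
    buf->flags &= (uint16_t) ~SKETCH_BM_DIRTY;
    buf->hint_count = 0;
}
```

With this shape, a bulk seqscan that hints every tuple on a page would
cross the threshold and get its work written out, while a page that
picks up one or two hints could be thrown away without any write.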

regards, tom lane
