Re: Hint Bits and Write I/O

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hint Bits and Write I/O
Date: 2008-05-27 22:10:42
Message-ID: 1211926242.4489.301.camel@ebony.site
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-05-27 at 23:28 +0200, Florian G. Pflug wrote:
> Simon Riggs wrote:
> > After some discussions at PGCon, I'd like to make some proposals for
> > hint bit setting with the aim to reduce write overhead.
> >
> > Currently, when we see an un-hinted row we set the bit, if possible,
> > and then dirty the block.
> >
> > If we were to set the bit but *not* dirty the block we may be able to
> > find a reduction in I/O. In many cases this would make no difference at
> > all, since we often set hints on an already dirty block. In other cases,
> > particularly random INSERTs, UPDATEs and DELETEs against large tables
> > this would reduce I/O, though possibly increase accesses to clog.
>
> Hm, but the I/O overhead of hint-bit setting occurs only once, while the
> pressure on the clog is increased until we set the hint bit. It looks
> as if not writing the hint-bit update to disk results in worse throughput
> unless there are many updates and only very few selects. But not too
> many updates either, because if a page gets hit by tuple updates faster
> than the bgwriter writes it out, you won't waste any I/O on
> hint-bit-only writes either. That might turn out to be a pretty slim
> window which actually shows substantial I/O savings...
>
> > My proposal is to have this as a two-stage process. When we set the hint
> > on a tuple in a clean buffer we mark it BM_DIRTY_HINTONLY, if not
> > already dirty. If we set a hint on a buffer that is BM_DIRTY_HINTONLY
> > then we mark it BM_DIRTY.
> >
> > The objective of this is to remove effects of single index accesses.
>
> So effectively, only the first hint-bit update hitting a previously clean
> buffer gets treated specially - the second hint-bit update flags the
> buffer as dirty, just as it does now? That sounds a bit strange - why is
> it exactly the *second* write that triggers the dirtying? Or did I
> misunderstand what you wrote?

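Hmm, I think the question is: how many hint bits need to be set on a
clean buffer before we mark it dirty? Call that number N.

Should it be 1, as it is now? Should it be never? Never is a long time.
As N increases, clog accesses increase, because a hint that was never
written out has to be looked up in clog again the next time the block is
read in. So it would seem there is likely to be an optimal value for N.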
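
To make the role of N concrete, here is a minimal sketch of the counting
rule in C. It is not a patch and does not use the real buffer manager
structures: SketchBuffer, sketch_note_hint_set() and
sketch_buffer_cleaned() are hypothetical names invented for illustration,
and a per-buffer counter is an assumption on my part (with N=2 the
counter collapses to the BM_DIRTY_HINTONLY flag described above).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer header; NOT PostgreSQL's BufferDesc. */
typedef struct SketchBuffer
{
    bool dirty;      /* plays the role of BM_DIRTY */
    int  hint_sets;  /* hint-bit sets seen since the buffer was last clean */
} SketchBuffer;

/*
 * Record one hint-bit set on a buffer. The caller is assumed to have
 * already written the hint into the in-memory tuple; this only decides
 * whether the buffer becomes dirty. Returns true if this call dirtied it.
 */
static bool
sketch_note_hint_set(SketchBuffer *buf, int threshold_n)
{
    assert(threshold_n >= 1);

    if (buf->dirty)
        return false;       /* already dirty: the hint rides along for free */

    buf->hint_sets++;
    if (buf->hint_sets >= threshold_n)
    {
        buf->dirty = true;  /* N reached: schedule the block for write */
        return true;
    }
    return false;           /* hint-only state, like BM_DIRTY_HINTONLY */
}

/* Reset the bookkeeping once the buffer has been written out or evicted. */
static void
sketch_buffer_cleaned(SketchBuffer *buf)
{
    buf->dirty = false;
    buf->hint_sets = 0;
}
```

With threshold_n = 1 this behaves as the code does today (the first hint
dirties the block); with threshold_n = 2 the first hint leaves the block
clean and the second one dirties it.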

Each buffer read into shared_buffers will stay there for a certain
period of time. During that time, how many hint bits will be set on
otherwise clean blocks? We can draw that as a frequency distribution of
the number of hint-bit set operations before the block leaves
shared_buffers. In a small database, the % of blocks with exactly 1
hint-bit set operation is very low, since we expect the blocks to stay
in cache for long periods. In a large database, that % increases
dramatically, since the cache churns more quickly and the frequency of
access to each block *may* be lower. If we dirty only when the number of
hint-bit set operations is >= 2 then we will remove a large proportion
of I/O from random selects/updates.

Remember that we are still setting the hint bit on the tuples in
buffers, just not setting BM_DIRTY as quickly. So if just a single hint
bit has been set on a buffer, then however many times that buffer is
accessed while it stays in cache we perform no additional I/O and no
additional clog accesses.

So, based on all of the above:
* For large databases, N=2 seems appropriate.
* For small databases, N=1 seems appropriate.

Perhaps we can vary this according to the size of database/table?
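
One way that could look, purely as a sketch: pick N from the size of the
table relative to the cache. The function name
sketch_choose_hint_threshold() and the cutoff used here (table no larger
than the cache gets N=1, anything bigger gets N=2) are assumptions made
up for illustration. Nothing above establishes where the cutoff should
actually be.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical policy: choose the hint-set threshold N for a table.
 * Both sizes are in blocks. The "fits in cache" cutoff is an assumed
 * placeholder, not a measured or recommended value.
 */
static int
sketch_choose_hint_threshold(uint64_t table_blocks, uint64_t cache_blocks)
{
    assert(cache_blocks > 0);

    /* Small table: blocks stay cached for long periods, so dirty at once. */
    if (table_blocks <= cache_blocks)
        return 1;

    /* Large table: cache churns, so wait for a second hint before dirtying. */
    return 2;
}
```

For example, with 16384 cache blocks a 1000-block table would get N=1
and a 1000000-block table would get N=2.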

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
