Re: WIP patch for hint bit i/o mitigation

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Merlin Moncure'" <mmoncure(at)gmail(dot)com>
Cc: "'Atri Sharma'" <atri(dot)jiit(at)gmail(dot)com>, "'PostgreSQL-development'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP patch for hint bit i/o mitigation
Date: 2012-11-15 16:25:05
Message-ID: 00eb01cdc34d$c608b550$521a1ff0$@kapila@huawei.com
Lists: pgsql-hackers

On Thursday, November 15, 2012 9:27 PM Merlin Moncure wrote:
> On Thu, Nov 15, 2012 at 4:39 AM, Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
> wrote:
> >> In each visibility function (except HeapTupleSatisfiesVacuum()), an
> >> additional check has been added to see whether the commit status of
> >> the Xmin or Xmax of a tuple can be retrieved from the cache.
> >
> > 1. From your explanation and code, it is quite clear that it will
> > certainly give performance benefits in the scenarios you mentioned.
> >
> > I can validate the performance numbers once again and do the code
> > review for this patch during CF-3.
> >
> > However, I am not sure whether the use case is a sufficient one.
> >
> > So I would like to ask for the opinion of other people as well.
>
> sure. I'd like to note though that hint bit i/o is a somewhat common
> complaint. it tends to most affect OLAP style workloads. in
> pathological workloads, it can really burn you -- it's not fun when
> you are i/o starved via sequential scan. This can still happen when
> sweeping dead records (which this patch doesn't deal with, though
> maybe it should).
>
> > 2. After this patch, the tuple hint bit is not set by SELECT
> > operations that run after the data has been populated by one
> > transaction.
> >
> > This appears to be good, as it will save many operations (page
> > dirtying followed by a flush, clog inquiry).
>
> Technically it does not save clog fetch as transam.c has a very
> similar cache mechanism. However, it does save a page write i/o and a
> lock on the page header, as well as a couple of other minor things.
> In the best case, the page write is completely masked as the page gets
> dirty for other reasons. I think this is going to become more
> important in checksum enabled scenarios.
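
To illustrate the kind of cache being referred to here, below is a rough
sketch in Python rather than PostgreSQL code. It assumes a one-entry cache
that remembers the status of the most recently looked-up transaction, and
that only final states are remembered. The names XidStatusCache, lookup and
lookups are hypothetical, and the status strings are placeholders.

```python
class XidStatusCache:
    """One-entry cache of the last transaction status that was looked up.

    Illustrative only: `lookup` stands in for the expensive clog inquiry.
    """

    FINAL_STATES = ("committed", "aborted")

    def __init__(self, lookup):
        self._lookup = lookup      # callable: xid -> status string
        self._cached_xid = None
        self._cached_status = None
        self.lookups = 0           # how many times the slow path was taken

    def status(self, xid):
        # Fast path: same xid as last time, answer from the cache.
        if xid == self._cached_xid:
            return self._cached_status

        # Slow path: ask the underlying store.
        self.lookups += 1
        result = self._lookup(xid)

        # Assumption: only states that can never change are cached, so an
        # in-progress transaction is always looked up again.
        if result in self.FINAL_STATES:
            self._cached_xid = xid
            self._cached_status = result
        return result
```

When a scan visits many tuples written by the same transaction, only the
first call takes the slow path. That is why a cache of this kind saves the
repeated status lookups, while it does nothing about the page write itself.
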
>
> > Though it is not apparent, we should see whether this can cause
> > any other impact:
> >
> > a. For example, VACUUM may now need to set the hint bit, which may
> > cause more I/O during VACUUM.
>
> IMNSHO, deferring non-critical i/o from a foreground process to a
> background process is generally good.

Yes, with regard to deferring, you are right.
However, in this case, when the foreground process has to mark a page dirty
because of a hint bit, the page may already have been dirty, so there is no
extra I/O; when VACUUM does it, the page may not be dirty.
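
To make this accounting concrete, here is a minimal sketch in Python (not
PostgreSQL code). It assumes the only cost of interest is whether setting the
hint bit is what turns a clean page into a dirty one. Page, set_hint_bit and
extra_writes are hypothetical names made up for illustration.

```python
class Page:
    """Toy model of a buffer page: just a dirty flag and a hint flag."""

    def __init__(self, dirty=False):
        self.dirty = dirty
        self.hinted = False


def set_hint_bit(page):
    """Set the hint bit and report whether this call newly dirtied the page.

    A newly dirtied page means one page write that would not otherwise
    have happened.
    """
    newly_dirty = not page.dirty
    page.hinted = True
    page.dirty = True
    return newly_dirty


def extra_writes(pages):
    """Count the page writes caused purely by setting hint bits."""
    return sum(1 for page in pages if set_hint_bit(page))


# Foreground case: pages were already dirtied by the operation itself.
foreground_pages = [Page(dirty=True) for _ in range(10)]
# Deferred case: VACUUM finds the same pages clean later on.
vacuum_pages = [Page(dirty=False) for _ in range(10)]
```

With these inputs, extra_writes(foreground_pages) counts no additional writes,
while extra_writes(vacuum_pages) counts one per page. How often pages are
really still dirty in the foreground case is exactly the open question.
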

Also, because of the points below, doing it in VACUUM may cost more:
a. VACUUM has a ring buffer of fixed size; if there are many such pages,
VACUUM itself has to write out a page in order to replace an existing page
in the ring.
b. Since people sometimes want VACUUM to run when the system is not busy,
the chances of generating more overall I/O in the system can be higher.
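
Point (a) can be sketched as a small simulation in Python (not PostgreSQL
code). It assumes a ring of ring_size slots that is reused in order, and that
a slot holding a dirty page must be written out by VACUUM before it can be
reused. The function name and both parameters are hypothetical.

```python
def vacuum_ring_writes(page_is_dirtied, ring_size):
    """Count the writes VACUUM performs itself while cycling through a ring.

    page_is_dirtied: iterable of booleans, one per page scanned, True if
                     VACUUM dirties that page (for example by setting a
                     hint bit).
    ring_size:       number of buffer slots in the ring (assumed fixed).
    """
    slot_dirty = [False] * ring_size   # dirty flag of each slot's occupant
    writes = 0
    for i, dirtied in enumerate(page_is_dirtied):
        slot = i % ring_size
        if slot_dirty[slot]:
            # The previous occupant is dirty and has to be flushed before
            # the slot can take the next page.
            writes += 1
        slot_dirty[slot] = dirtied
    return writes
```

Under these assumptions, once the number of dirtied pages exceeds the ring
size, every further page costs VACUUM a write of its own, whereas pages that
stay clean can be replaced for free.
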

Why can't we avoid setting the hint bit in VACUUM?
Is it because it has to be done somewhere, so that the hint bits
are properly set?
Or maybe I am missing something trivial?

Though the VACUUM case I am talking about occurs very rarely in practice, the
point came to my mind, so I thought of sharing it with you.

> VACUUM has nice features like
> i/o throttling and in place cancel so latent management of internal
> page optimization flags really belong there ideally. Also, the longer
> you defer such I/O the more opportunity there is for it to be masked
> off by some other page dirtying operation (again, this is more
> important in the face of having to log hint bit changes).
>
> There could be some good rebuttal analysis though.

With Regards,
Amit Kapila.
