On Wed, Jun 29, 2011 at 9:09 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On Sat, Apr 2, 2011 at 8:40 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Merlin Moncure <mmoncure(at)gmail(dot)com> writes:
>>> Moving TransactionIdInProgress below TransactionIdDidCommit can help
>>> in one sense: TransactionIdDidCommit grabs the XidStatus but discards
>>> the knowledge if the transaction is known aborted.
>> Doesn't the single-entry cache help with that?
> Yes, it does.
Correct. That line of analysis was refuted by Tom, both in terms of
benefit and on structural grounds.
> I'm a little worried by some of the statements around this patch. It
> would be good to get a single clear analysis, then build the patch
> from that.
> * TransactionIdDidCommit grabs XidStatus and then keeps it - if the
> xact is known aborted or known committed.
> * In the top of the patch you mention that "It's tempting to try and
> expand the simple last transaction status in
> transam.c but this check happens too late in the critical visibility
> routines to provide much benefit. For the fastest possible cache
> performance with the least amount of impact on scan heavy cases, you have
> to maintain the cache here if you want to avoid dirtying the page
> based on interaction with the cache."
> We call TransactionIdIsKnownCompleted() before we touch shared memory
> in TransactionIdIsInProgress(), which seems early enough for me.
> Maybe I've misunderstood your comments.
This is correct. If you are scanning tuples all stamped with the same
transaction, the transam.c cache will prevent some of the worst
problems. However, going through all the mechanics even in the
transam.c-covered case is still fairly expensive: you have to call
several non-inlined functions, including XLogNeedsFlush(), over and
over. This is more expensive than it looks, and a transam.c cache
implementation is hamstrung out of the gate if you are trying to avoid
i/o...it isn't really all that much cheaper than jumping out to the
slru, and it will penalize repetitive scans (I can prove this with
benchmarks). Bear in mind I started out with the intent of working in
transam.c, and dropped it when it turned out not to work.
> Also, the reason we set the hint bits is (in my understanding) so that
> other users will avoid having to do clog lookups. If we only cache xid
> status locally, we would actually generate more clog lookups, not
> less. I realise that reduces I/O, so it seems to be a straight
> tradeoff between I/O and clog access.
That is not correct. Any cache 'miss' on a page continues to fall
through to SetHintBitsCache(), which dirties the page as it always has.
This is a key innovation of the patch, IMNSHO: your worst case is the
old behavior. The maximum number of times the fault to clog can
happen for a page is once; it's impossible to get even one more clog
access than you did with stock postgres.
> For me, it makes sense for us to set hint bits on already-dirty pages
> when they are written out by the bgwriter. At the moment we do very
> little to amortise the I/O that must already happen.
I could agree with this. However, if the bgwriter is forcing out pages
before the hint bits can be set, you have a problem. Also, adding more
work to the bgwriter may cause other secondary performance issues.