Re: Process local hint bit cache

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Process local hint bit cache
Date: 2011-03-31 22:38:33
Message-ID: AANLkTimpUbs=bCHGN9oCV9oug55b5DUw4DaCuAXwXivM@mail.gmail.com
Lists: pgsql-hackers

On Thu, Mar 31, 2011 at 5:33 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Wed, Mar 30, 2011 at 2:35 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Wed, Mar 30, 2011 at 11:23 AM, Heikki Linnakangas
>> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> On 30.03.2011 18:02, Robert Haas wrote:
>>>>
>>>> On Wed, Mar 30, 2011 at 10:40 AM, Greg Stark<gsstark(at)mit(dot)edu>  wrote:
>>>>>
>>>>> But one way or another the hint bits have to get set sometime. The
>>>>> sooner that happens the less clog i/o has to happen in the meantime.
>>>>
>>>> I talked about this with Merlin a bit yesterday.  I think that his
>>>> thought is that most transactions will access a small enough number of
>>>> distinct CLOG pages, and the cache accesses might be fast enough, that
>>>> we really wouldn't need to get the hint bits set, or at least that
>>>> vacuum/freeze time would be soon enough.  I'm not sure if that's
>>>> actually true, though - I think the overhead of the cache might be
>>>> higher than he's imagining.  However, there's a sure-fire way to find
>>>> out... code it up and see how it plays.
>>>
>>> I did a little experiment: I hacked SetHintBits() to do nothing, so that
>>> hint bits are never set. I then created a table with 100000 rows in it:
>>>
>>> postgres=# \d foo
>>>      Table "public.foo"
>>>  Column |  Type   | Modifiers
>>> --------+---------+-----------
>>>  a      | integer |
>>>
>>> postgres=# INSERT INTO foo SELECT generate_series(1, 100000);
>>> INSERT 0 100000
>>>
>>> And ran queries on the table:
>>>
>>> postgres=# do $$
>>> declare
>>>  i int4;
>>> begin
>>>  loop
>>>    perform COUNT(*) FROM foo;
>>>  end loop;
>>> end;
>>> $$;
>>>
>>> This test case is designed to exercise the visibility tests as much as
>>> possible. However, all the tuples are loaded in one transaction, so the
>>> one-element cache in TransactionLogFetch is 100% effective.
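
(for reference, the one-element cache being credited here is just a pair of
statics in transam.c -- roughly this, from memory, with the special-xid
handling trimmed:

static TransactionId cachedFetchXid = InvalidTransactionId;
static XidStatus cachedFetchXidStatus;

static XidStatus
TransactionLogFetch(TransactionId transactionId)
{
    XidStatus   xidstatus;
    XLogRecPtr  xidlsn;

    /* hit: same xid as the last lookup, no clog access at all */
    if (TransactionIdEquals(transactionId, cachedFetchXid))
        return cachedFetchXidStatus;

    /* miss: go to clog, possibly all the way to disk */
    xidstatus = TransactionIdGetStatus(transactionId, &xidlsn);

    /* only remember resolved xids; an in-progress answer can change */
    if (xidstatus != TRANSACTION_STATUS_IN_PROGRESS)
    {
        cachedFetchXid = transactionId;
        cachedFetchXidStatus = xidstatus;
    }
    return xidstatus;
}

since every xmin in the table is the same here, every lookup hits that cache
and the clog fetch itself is nearly free in this test)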
>>>
>>> I ran oprofile on that. It shows that about 15% of the time is spent in
>>> HeapTupleSatisfiesMVCC and its subroutines. 6.6% is spent in
>>> HeapTupleSatisfiesMVCC itself. Here's the breakdown of that:
>>>
>>> $ opreport  -c --global-percent
>>>
>>> CPU: Intel Architectural Perfmon, speed 2266 MHz (estimated)
>>> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
>>> mask of 0x00 (No unit mask) count 100000
>>> samples  %        app name                 symbol name
>>> ...
>>> -------------------------------------------------------------------------------
>>>   2143      0.4419  postgres                 postgres heapgettup_pagemode
>>>  73277     15.1091  postgres                 postgres heapgetpage
>>>  31858      6.5688  postgres                 postgres HeapTupleSatisfiesMVCC
>>>  31858      6.5688  postgres                 postgres HeapTupleSatisfiesMVCC [self]
>>>  12809      2.6411  postgres                 postgres TransactionIdIsInProgress
>>>  12091      2.4931  postgres                 postgres XidInMVCCSnapshot
>>>   7150      1.4743  postgres                 postgres TransactionIdIsCurrentTransactionId
>>>   7056      1.4549  postgres                 postgres TransactionIdDidCommit
>>>   1839      0.3792  postgres                 postgres TransactionIdPrecedes
>>>   1467      0.3025  postgres                 postgres SetHintBits
>>>   1155      0.2382  postgres                 postgres TransactionLogFetch
>>> -------------------------------------------------------------------------------
>>> ...
>>>
>>> I then ran the same test again with an unpatched version, to set the hint
>>> bits. After the hint bits were set, I ran oprofile again:
>>>
>>> -------------------------------------------------------------------------------
>>>  275       0.4986  postgres                 heapgettup_pagemode
>>>  4459      8.0851  postgres                 heapgetpage
>>> 3005      5.4487  postgres                 HeapTupleSatisfiesMVCC
>>>  3005      5.4487  postgres                 HeapTupleSatisfiesMVCC [self]
>>>  1620      2.9374  postgres                 XidInMVCCSnapshot
>>>  110       0.1995  postgres                 TransactionIdPrecedes
>>> -------------------------------------------------------------------------------
>>>
>>> So with hint bits set, HeapTupleSatisfiesMVCC accounts for 8% of the total
>>> CPU time, and without hint bits, 15%.
>>>
>>> Even if clog access were free, hint bits would still give a significant
>>> speedup thanks to skipping all the other overhead like
>>> TransactionIdIsInProgress() and TransactionIdIsCurrentTransactionId().
>>> Speeding up clog access is important; when the one-element cache isn't
>>> saving you, clog access becomes a dominant factor. But you have to address
>>> that other overhead too before you can get rid of hint bits.
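
(to put code next to those numbers: the xmin half of HeapTupleSatisfiesMVCC
is shaped roughly like this -- simplified, with the xmax and subtransaction
handling left out -- which is exactly the call chain showing up in the
profile above:

if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
{
    if (tuple->t_infomask & HEAP_XMIN_INVALID)
        return false;
    if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple)))
    {
        /* inserted by our own transaction: compare command ids, etc. */
    }
    else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
        return false;
    else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
        SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                    HeapTupleHeaderGetXmin(tuple));
    else
    {
        /* it must have aborted or crashed */
        SetHintBits(tuple, buffer, HEAP_XMIN_INVALID, InvalidTransactionId);
        return false;
    }
}
/* once HEAP_XMIN_COMMITTED is set, none of the above runs */

so a set hint bit doesn't just skip the clog lookup, it skips the whole chain)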
>>
>> Here is a patch demonstrating the caching action, but without the
>> cache table, which isn't done yet -- it only works using the very last
>> transaction id fetched.  I used macros so I could keep the changes
>> quite localized.
>
> While playing with the data structure to implement an xid cache inside
> TransactionLogFetch, I learned a nasty lesson.
> HeapTupleSatisfiesMVCC is super sensitive to any changes, and even
> jumping through a couple of calls to TransactionLogFetch was
> measurable in a couple of performance test cases I came up with.  I
> was haunted by Tom's warning to beware of unexpected performance
> downsides.
>
> In short, regardless of implementation, I unfortunately don't think
> any caching at or under the clog abstraction is going to work without
> impacting at least some cases.
>
> OTOH, maybe some micro-optimization of HeapTupleSatisfiesMVCC could
> provide the same benefit I'm going after here.  If you save off the
> last xid that came back committed in a static variable, you can
> check that at the same level as if it were a hint bit.  From my
> testing, this is well and truly 'free' in the sense that it can
> defer/eliminate clog checks and hint bit i/o.
>
> The patch attached gives all the advantages I was after in the

hm, this isn't quite right -- if you get a 'last xid cache hit', the
tuple's hint bits should still be set if they are not already (but the
page should not be dirtied).
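
in code terms the idea is roughly this (just a sketch, with invented names --
not what the attached patch actually does):

static TransactionId last_committed_xid = InvalidTransactionId;

/* right where the hint bit would otherwise be tested;
 * xmin here means HeapTupleHeaderGetXmin(tuple) */
if (TransactionIdEquals(xmin, last_committed_xid))
{
    /*
     * Known committed: set the hint bit so later visits to this tuple
     * take the fast path, but don't call SetBufferCommitInfoNeedsSave()
     * the way SetHintBits() would, so the page is not dirtied just for
     * the hint.
     */
    tuple->t_infomask |= HEAP_XMIN_COMMITTED;
}
else
{
    /* ... the existing IsCurrentTransactionId / IsInProgress checks ... */
    if (TransactionIdDidCommit(xmin))
    {
        last_committed_xid = xmin;      /* remember it for next time */
        SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED, xmin);
    }
}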

working on expanding the cache to more than one xid.
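
probably something along the lines of a tiny direct-mapped array keyed on
the low bits of the xid, storing only xids known to be committed so a hit
can be trusted the same way as a hint bit (again just a sketch, names
invented, nothing final):

#define COMMITTED_XID_CACHE_SIZE 64     /* any small power of two */
static TransactionId committed_xid_cache[COMMITTED_XID_CACHE_SIZE];

/* a hit means "definitely committed"; a miss just means "ask clog" */
static bool
committed_xid_cache_lookup(TransactionId xid)
{
    return TransactionIdEquals(
        committed_xid_cache[xid % COMMITTED_XID_CACHE_SIZE], xid);
}

static void
committed_xid_cache_store(TransactionId xid)
{
    committed_xid_cache[xid % COMMITTED_XID_CACHE_SIZE] = xid;
}

(the zero-initialized slots are InvalidTransactionId, which never matches a
normal xid, so empty slots can't produce false hits)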

merlin
