Re: Process local hint bit cache

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Process local hint bit cache
Date: 2011-03-30 16:23:42
Message-ID: 4D93590E.1030905@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 30.03.2011 18:02, Robert Haas wrote:
> On Wed, Mar 30, 2011 at 10:40 AM, Greg Stark<gsstark(at)mit(dot)edu> wrote:
>> But one way or another the hint bits have to get set sometime. The
>> sooner that happens the less clog i/o has to happen in the meantime.
>
> I talked about this with Merlin a bit yesterday. I think that his
> thought is that most transactions will access a small enough number of
> distinct CLOG pages, and the cache accesses might be fast enough, that
> we really wouldn't need to get the hint bits set, or at least that
> vacuum/freeze time would be soon enough. I'm not sure if that's
> actually true, though - I think the overhead of the cache might be
> higher than he's imagining. However, there's a sure-fire way to find
> out... code it up and see how it plays.

I did a little experiment: I hacked SetHintBits() to do nothing, so that
hint bits are never set. I then created a table with 100000 rows in it:

postgres=# \d foo
Table "public.foo"
Column | Type | Modifiers
--------+---------+-----------
a | integer |

postgres=# INSERT INTO foo SELECT generate_series(1, 100000);
INSERT 0 100000

And ran queries on the table:

postgres=# do $$
declare
i int4;
begin
loop
perform COUNT(*) FROM foo;
end loop;
end;
$$;

This test case is designed to exercise the visibility tests as much as
possible. However, all the tuples are loaded in one transaction, so the
one-element cache in TransactionLogFetch is 100% effective.

I ran oprofile on that. It shows that about 15% of the time is spent in
HeapTupleSatisfiesMVCC and its subroutines. 6.6% is spent in
HeapTupleSatisfiesMVCC itself. Here's the breakdown of that:

$ opreport -c --global-percent

CPU: Intel Architectural Perfmon, speed 2266 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
unit mask of 0x00 (No unit mask) count 100000
samples % app name symbol name
...
-------------------------------------------------------------------------------
2143 0.4419 postgres postgres
heapgettup_pagemode
73277 15.1091 postgres postgres
heapgetpage
31858 6.5688 postgres postgres
HeapTupleSatisfiesMVCC
31858 6.5688 postgres postgres
HeapTupleSatisfiesMVCC [self]
12809 2.6411 postgres postgres
TransactionIdIsInProgress
12091 2.4931 postgres postgres
XidInMVCCSnapshot
7150 1.4743 postgres postgres
TransactionIdIsCurrentTransactionId
7056 1.4549 postgres postgres
TransactionIdDidCommit
1839 0.3792 postgres postgres
TransactionIdPrecedes
1467 0.3025 postgres postgres
SetHintBits
1155 0.2382 postgres postgres
TransactionLogFetch
-------------------------------------------------------------------------------
...

I then ran the same test again with an unpatched version, to set the
hint bits. After the hint bits were set, I ran oprofile again:

-------------------------------------------------------------------------------
275 0.4986 postgres heapgettup_pagemode
4459 8.0851 postgres heapgetpage
3005 5.4487 postgres HeapTupleSatisfiesMVCC
3005 5.4487 postgres HeapTupleSatisfiesMVCC [self]
1620 2.9374 postgres XidInMVCCSnapshot
110 0.1995 postgres TransactionIdPrecedes
-------------------------------------------------------------------------------

So with hint bits set, HeapTupleSatisfiesMVCC accounts for 8% of the
total CPU time, and without hint bits, 15%.

Even if clog access was free, hint bits still give a significant speedup
thanks to skipping all the other overhead like
TransactionIdIsInProgress() and TransactionIdIsCurrentTransactionId().
Speeding up clog access is important; when the one-element cache isn't
saving you the clog access becomes a dominant factor. But you have to
address that other overhead too before you can get rid of hint bits.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Adrian Klaver 2011-03-30 16:26:25 Re: Date conversion using day of week
Previous Message Steve Crawford 2011-03-30 16:15:30 Re: Date conversion using day of week