faster version of AllocSetFreeIndex for x86 architecture

From: Atsushi Ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Subject: faster version of AllocSetFreeIndex for x86 architecture
Date: 2009-06-02 14:53:36
Message-ID: 4A253CF0.1000702@hi-ho.ne.jp
Lists: pgsql-hackers


Hi,
I made a faster version of AllocSetFreeIndex for the x86 architecture.

The attached files are the benchmark programs and the patch file.

alloc_test.pl: benchmark script
alloc_test.c: benchmark program
aset_free_index.patch: patch for utils/mmgr/aset.c

This benchmark compares the original function with the faster version.
To try the benchmark, just run alloc_test.pl. The script compiles
alloc_test.c and runs the benchmark.
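
For readers without the attachments, the following is a minimal sketch of
the kind of comparison involved. It is not the attached patch or
alloc_test.c. It assumes the free-list index is the number of shifts needed
to reduce (size - 1) >> ALLOC_MINBITS to zero, with ALLOC_MINBITS = 3, and
it assumes a GCC-compatible compiler for __builtin_clz. The function names
free_index_loop and free_index_bitscan are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: the smallest chunk is 1 << 3 = 8 bytes. */
#define ALLOC_MINBITS 3

/* Loop-based index: shift right until nothing is left, counting shifts. */
static int
free_index_loop(size_t size)
{
    int idx = 0;

    if (size > 0)
    {
        size = (size - 1) >> ALLOC_MINBITS;
        while (size != 0)
        {
            idx++;
            size >>= 1;
        }
    }
    return idx;
}

/*
 * Bit-scan index (hypothetical variant): locate the highest set bit with a
 * compiler builtin instead of looping. Assumes the shifted size fits in an
 * unsigned int, which holds for the small request sizes benchmarked here.
 */
static int
free_index_bitscan(size_t size)
{
    unsigned int t;

    if (size <= ((size_t) 1 << ALLOC_MINBITS))
        return 0;               /* also keeps __builtin_clz away from 0 */

    t = (unsigned int) ((size - 1) >> ALLOC_MINBITS);

    /* (bit width) - (leading zeros) = position of highest set bit + 1 */
    return (int) (sizeof(unsigned int) * CHAR_BIT) - __builtin_clz(t);
}
```

Both functions return the same index for every request size in the
benchmark table (4 to 1024 bytes). The bit-scan variant does a fixed
amount of work per call, while the loop runs once per bit of the shifted
size.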

Results of benchmark script:
Xeon(Core architecture), RedHat EL4, gcc 3.4.6
bytes   :     4     8    16    32    64   128   256   512  1024   mix
original: 0.780 0.780 0.820 0.870 0.930 0.970 1.030 1.080 1.130 0.950
patched : 0.380 0.170 0.170 0.170 0.170 0.180 0.170 0.180 0.180 0.280

Core2, Windows XP, gcc 3.4.4 (cygwin)
bytes   :     4     8    16    32    64   128   256   512  1024   mix
original: 0.249 0.249 0.515 0.452 0.577 0.671 0.796 0.890 0.999 1.577
patched : 0.358 0.218 0.202 0.218 0.218 0.218 0.202 0.218 0.218 0.218

Xeon(Pentium4 architecture), RedHat EL4, gcc 3.4.6
bytes   :     4     8    16    32    64   128   256   512  1024   mix
original: 0.510 0.520 0.620 0.860 0.970 1.260 1.150 1.220 1.290 0.860
patched : 0.620 0.530 0.530 0.540 0.540 0.530 0.540 0.530 0.530 0.490

I also measured the effect of the patch with oprofile:
- test program: pgbench -c 1 -t 50000 (fsync=off)

original:
CPU: P4 / Xeon with 2 hyper-threads, speed 2793.55 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events
with a unit mask of 0x01 (mandatory) count 100000
samples  %        symbol name
66854    6.6725   AllocSetAlloc
47679    4.7587   base_yyparse
29058    2.9002   hash_search_with_hash_value
22053    2.2011   SearchCatCache
19264    1.9227   MemoryContextAllocZeroAligned
16223    1.6192   base_yylex
13819    1.3792   ScanKeywordLookup
13305    1.3279   expression_tree_walker
12144    1.2121   LWLockAcquire
11850    1.1827   XLogInsert
11817    1.1794   AllocSetFree

patched:
CPU: P4 / Xeon with 2 hyper-threads, speed 2793.55 MHz (estimated)
Counted GLOBAL_POWER_EVENTS events
with a unit mask of 0x01 (mandatory) count 100000
samples  %        symbol name
47610    4.9333   AllocSetAlloc
47441    4.9158   base_yyparse
28243    2.9265   hash_search_with_hash_value
22197    2.3000   SearchCatCache
18984    1.9671   MemoryContextAllocZeroAligned
15747    1.6317   base_yylex
13368    1.3852   ScanKeywordLookup
12889    1.3356   expression_tree_walker
12092    1.2530   LWLockAcquire
12078    1.2515   XLogInsert
(skip)
6248     0.6474   AllocSetFree

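I think this patch improves AllocSetAlloc/AllocSetFree performance: in the
profiles above, AllocSetAlloc drops from 6.67% to 4.93% of samples and
AllocSetFree from 1.18% to 0.65%.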

Best regards,

---
Atsushi Ogawa
a_ogawa(at)hi-ho(dot)ne(dot)jp

Attachment Content-Type Size
alloc_test.pl text/plain 729 bytes
alloc_test.c text/plain 1.6 KB
aset_free_index.patch text/plain 968 bytes
