Re: tweaking MemSet() performance - 7.4.5

From: mcolosimo(at)mitre(dot)org
To: Manfred Spraul <manfred(at)colorfullife(dot)com>, Marc Colosimo <mcolosimo(at)mitre(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: tweaking MemSet() performance - 7.4.5
Date: 2004-09-20 22:20:12
Message-ID: 22741566.1095718812574.JavaMail.ipsadmin@emiisrv
Lists: pgsql-hackers

>Marc Colosimo wrote:
>
>> Oops, I used the same setting as in the old hacking message (-O2, gcc
>> 3.3). If I understand what you are saying, then it turns out yes, PG's
>> MemSet is faster for smaller blocksizes (see below, between 32 and
>> 64). I just replaced the whole MemSet with memset and it is not very
>> low when I profile.
>
>Could you check what the OS-X memset function does internally?
>One trick to speed up memset is to bypass the cache and bulk-write
>directly from write buffers to main memory. i386 cpus support that and
>in microbenchmarks it's 3 times faster (or something like that).
>Unfortunately it's a loss in real-world tests: Typically a structure is
>initialized with memset and then immediately accessed. If the memset
>bypasses the cache then the following access will cause a cache line
>miss, which can be so slow that using the faster memset can result in a
>net performance loss.
>

Could you suggest some structs to test? If I understand you correctly, I would write a loop that clears a structure and then immediately reads from it.
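
A minimal sketch of such a set-then-read loop, under stated assumptions: TestRec, word_zero, set_then_read and time_variant are hypothetical names made up for illustration, and word_zero is only a simplified word-at-a-time loop, not PostgreSQL's actual MemSet macro. The struct is eight longs, so it is 32 bytes on a 32-bit build and 64 bytes on a typical 64-bit build, which is roughly the size range discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical test record: eight longs (32 or 64 bytes). */
typedef struct TestRec
{
    long id;
    long count;
    long payload[6];
} TestRec;

/* Simplified word-at-a-time zeroing loop (assumption: NOT the real MemSet). */
static void word_zero(void *start, size_t len)
{
    long *p = (long *) start;
    long *stop = (long *) ((char *) start + len);

    assert(len % sizeof(long) == 0);    /* only whole words are handled */
    while (p < stop)
        *p++ = 0;
}

/*
 * Clear the record, then touch it right away, as real code usually does.
 * The running sum goes through a volatile so the reads are not discarded;
 * an optimizer may still simplify the clearing step, so inspect the
 * generated code before trusting any timing.
 */
static long set_then_read(int use_memset, long iters)
{
    TestRec rec;
    volatile long sink = 0;
    long i;

    for (i = 0; i < iters; i++)
    {
        if (use_memset)
            memset(&rec, 0, sizeof(rec));
        else
            word_zero(&rec, sizeof(rec));
        rec.id = i;                         /* write after the clear */
        sink += rec.id + rec.payload[5];    /* read first and last word */
    }
    return sink;
}

/* CPU seconds spent in one variant, measured with clock(). */
static double time_variant(int use_memset, long iters)
{
    clock_t t0 = clock();

    (void) set_then_read(use_memset, iters);
    return (double) (clock() - t0) / CLOCKS_PER_SEC;
}
```

Calling time_variant(1, n) and time_variant(0, n) with the same n gives a rough comparison. A cache-bypassing memset, if the platform has one, would be expected to show its cost in the read that follows the clear rather than in the clear itself.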

>> I could squeeze more out of it if I spent more time trying to
>> understand it (change MEMSET_LOOP_LIMIT to 32 and then add memset
>> after that?). I'm now working on understanding Spin Locks and
>> friends. Putting in a sync call (in s_lock.h) is really a time killer
>> and bad for performance (it takes up 35 cycles).
>>
>That's the price you pay for weakly ordered memory access.
>Linux on ppc uses eieio, on ppc64 lwsync is used. Could you check if
>they are faster?
>

I found the reason "sync" was added <http://archives.postgresql.org/pgsql-bugs/2002-09/msg00239.php>, but it is odd that it works. It is interesting that syncing one processor prevents the other from doing something. What type of shared memory is being used on OS X? I'm also confused about the two types of semaphores, SysV and POSIX <http://archives.postgresql.org/pgsql-patches/2001-01/msg00052.php>. It seems POSIX is the way to go on OS X.
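
On why a barrier on one processor matters to the other: in general the barrier does not stop the other CPU from doing anything. It orders the issuing CPU's own stores, so the writes made while holding the lock become visible before the store that releases the lock. Whether that is exactly the failure described in the linked report is an assumption. A minimal sketch of the idea using C11 atomics follows; it is not the inline assembly in s_lock.h, and SketchLock and the sketch_lock_* functions are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical test-and-set spinlock; not PostgreSQL's slock_t. */
typedef struct SketchLock
{
    atomic_flag held;
} SketchLock;

/* Put the lock into the unlocked state. */
static void sketch_lock_init(SketchLock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

/* Try once; returns 1 if the lock was obtained, 0 if it was already held. */
static int sketch_lock_try(SketchLock *lock)
{
    /* Acquire ordering: later loads and stores cannot move above this. */
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

/* Spin until the lock is obtained. */
static void sketch_lock_acquire(SketchLock *lock)
{
    while (!sketch_lock_try(lock))
        ;                       /* busy-wait */
}

/*
 * Release ordering: every store made inside the critical section is
 * visible to other CPUs before the lock is seen as free.  On a weakly
 * ordered CPU the compiler has to emit some barrier instruction here;
 * which one it picks is up to the compiler and target.
 */
static void sketch_lock_release(SketchLock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

Without the release ordering, another CPU could see the lock as free while the data it protects still looks stale, which is the kind of problem a barrier in the unlock path guards against.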

Marc
