Re: inline newNode()

From: Neil Conway <neilc(at)samurai(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: inline newNode()
Date: 2002-10-10 06:51:16
Message-ID: 87ptuin5wb.fsf@mailbox.samurai.com
Lists: pgsql-hackers pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Remember, MemSet was invented only to prevent function call overhead,
> and on my BSD/OS system, len >= 256 is faster with the libc
> memset().

Yes, I remember finding that when testing MemSet() versus memset() for
various values of MEMSET_LOOP_LIMIT earlier.

> What really surprised me is that MemSet won on Sparc, where they have an
> assembler language version that looks very similar to the MemSet
> loop.

Well, I'd assume any C library / compiler of half-decent quality on
any platform would provide assembly-optimized versions of common
stdlib functions like memset().

While playing around with memset() on my machine (P4 running Linux,
glibc 2.2.5, GCC 3.2.1pre3), I found the following interesting
result. I used this simple benchmark (the same one I posted for the
earlier MemSet() thread on -hackers):

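#include <string.h>
#include "postgres.h"

/* BUFFER_SIZE is supplied on the command line, e.g. -DBUFFER_SIZE=256 */
#undef MEMSET_LOOP_LIMIT
#define MEMSET_LOOP_LIMIT BUFFER_SIZE

int
main(void)
{
    char        buffer[BUFFER_SIZE];
    long long   i;

    for (i = 0; i < 99000000; i++)
    {
        /* replace with MemSet() or __builtin_memset() for the other runs */
        memset(buffer, 0, sizeof(buffer));
    }

    return 0;
}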

Compiled with '-DBUFFER_SIZE=256 -O2', I get the following results in
seconds:

MemSet(): ~9.6
memset(): ~19.5
__builtin_memset(): ~10.00

So it seems there is a reasonably optimized version of memset()
provided by glibc/GCC (not sure which :-) ); it's just a matter of
persuading the compiler to let us use it. It's still depressing that
it doesn't beat MemSet(), but perhaps __builtin_memset() has better
average-case performance over a wider range of memory sizes?[1]
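
For readers without the PostgreSQL headers at hand, here is a minimal
sketch of the kind of word-at-a-time zeroing loop that MemSet() stands
for in the timings above. It is not the actual macro from c.h: the
names zero_words and LOOP_LIMIT are made up for illustration, and the
threshold value is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold, standing in for MEMSET_LOOP_LIMIT. */
#define LOOP_LIMIT 1024

/*
 * Zero "len" bytes at "start".
 *
 * Takes the inline word loop only when the pointer is long-aligned, the
 * length is a multiple of sizeof(long), and the length is at most
 * LOOP_LIMIT. Every other case falls back to the library memset().
 *
 * Assumption: on the word-loop path the memory may legitimately be
 * written as long (for example malloc'd memory or a long array).
 */
static inline void
zero_words(void *start, size_t len)
{
    assert(start != NULL);

    if (((uintptr_t) start % sizeof(long)) == 0 &&
        (len % sizeof(long)) == 0 &&
        len <= LOOP_LIMIT)
    {
        long       *p = (long *) start;
        long       *stop = (long *) ((char *) start + len);

        /* no function call: one store per word */
        while (p < stop)
            *p++ = 0;
    }
    else
        memset(start, 0, len);
}
```

The point of such a loop is only to avoid the call overhead for small,
aligned requests; past the threshold the library routine takes over.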

BTW, regarding the newNode() stuff: so is it agreed that Bruce's patch
is a performance win without too high of a code bloat / uglification
penalty? If so, is it 7.3 or 7.4 material?

Cheers,

Neil

[1] Not that I really buy that -- for one thing, if the length is
constant, as it is in this case, the compiler can substitute an
optimized version of the function for the appropriate memory size. I'm
having a little difficulty explaining GCC/glibc's poor performance...
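
To make the constant-length case concrete, here is a small sketch of
the two situations. The names BUF_BYTES, zero_fixed, clear_fixed and
clear_variable are hypothetical; the only assumption about the
toolchain is that GCC-compatible compilers provide __builtin_memset().
Whether the constant-length call is actually expanded inline depends
on the compiler and its flags.

```c
#include <string.h>

/* Hypothetical size, mirroring -DBUFFER_SIZE=256 in the benchmark. */
#define BUF_BYTES 256

/* Use the builtin where the compiler offers it, else plain memset(). */
#if defined(__GNUC__)
#define zero_fixed(p)   __builtin_memset((p), 0, BUF_BYTES)
#else
#define zero_fixed(p)   memset((p), 0, BUF_BYTES)
#endif

/* Length is a compile-time constant: the compiler is free to expand it. */
static void
clear_fixed(char *buf)
{
    zero_fixed(buf);
}

/* Length is known only at run time: a general-purpose routine is needed. */
static void
clear_variable(char *buf, size_t len)
{
    memset(buf, 0, len);
}
```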

--
Neil Conway <neilc(at)samurai(dot)com> || PGP Key ID: DB3C29FC
