
Re: inline newNode()

From: Neil Conway <neilc(at)samurai(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: inline newNode()
Date: 2002-10-10 06:51:16
Message-ID: 87ptuin5wb.fsf@mailbox.samurai.com
Lists: pgsql-hackers, pgsql-patches

Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> Remember, MemSet was invented only to prevent function call overhead,
> and on my BSD/OS system, len >= 256 is faster with the libc
> memset(). 

Yes, I remember finding that when testing MemSet() versus memset() for
various values of MEMSET_LOOP_LIMIT earlier.
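
For readers without the source at hand, here is a minimal sketch of the idea being discussed: zero small, aligned buffers with an inline word loop, and hand everything else to libc. It is not the actual MemSet() macro. The names sketch_zero and SKETCH_LOOP_LIMIT are hypothetical, and the value 256 is only taken from the crossover point mentioned above for BSD/OS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical threshold, standing in for MEMSET_LOOP_LIMIT. */
#define SKETCH_LOOP_LIMIT 256

/*
 * Zero len bytes starting at start.
 *
 * When the pointer and the length are both multiples of sizeof(long)
 * and the length is at most SKETCH_LOOP_LIMIT, store zeros one long at
 * a time with no function call. Otherwise fall back to libc memset().
 * The caller is assumed to pass memory that may be accessed as long.
 */
static inline void
sketch_zero(void *start, size_t len)
{
	const size_t align_mask = sizeof(long) - 1;

	if ((((uintptr_t) start) & align_mask) == 0 &&
		(len & align_mask) == 0 &&
		len <= SKETCH_LOOP_LIMIT)
	{
		long	   *p = (long *) start;
		long	   *stop = (long *) ((char *) start + len);

		while (p < stop)
			*p++ = 0;
	}
	else
		memset(start, 0, len);
}
```

With a limit like this, a 256-byte aligned buffer takes the loop, while a larger or misaligned one goes through memset().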

> What really surprised me is that MemSet won on Sparc, where they have an
> assembler language version that looks very similar to the MemSet
> loop.

Well, I'd assume any C library / compiler of half-decent quality on
any platform would provide assembly optimized versions of common
stdlib functions like memset().

While playing around with memset() on my machine (P4 running Linux,
glibc 2.2.5, GCC 3.2.1pre3), I found the following interesting
result. I used this simple benchmark (the same one I posted for the
earlier MemSet() thread on -hackers):

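#include <string.h>
#include "postgres.h"

/*
 * Set the loop limit to the buffer size, so that MemSet()'s size check
 * does not send this buffer to memset().
 */
#undef MEMSET_LOOP_LIMIT
#define MEMSET_LOOP_LIMIT BUFFER_SIZE

int
main(void)
{
	char buffer[BUFFER_SIZE];	/* size is set with -DBUFFER_SIZE=... */
	long long i;

	for (i = 0; i < 99000000; i++)
	{
		/* the call under test */
		memset(buffer, 0, sizeof(buffer));
	}

	return 0;
}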

Compiled with '-DBUFFER_SIZE=256 -O2', I get the following results in
seconds:

MemSet(): ~9.6
memset(): ~19.5
__builtin_memset(): ~10.00

So it seems there is a reasonably optimized version of memset()
provided by glibc/GCC (not sure which :-) ), it's just a matter of
persuading the compiler to let us use it. It's still depressing that
it doesn't beat MemSet(), but perhaps __builtin_memset() has better
average-case performance over a wider spectrum of memory sizes?[1]
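
For anyone wanting to repeat the comparison, the following is a rough sketch of a timing harness, not the program used for the numbers above. The names DEFINE_TIMER, KEEP, time_memset and time_builtin_memset are hypothetical. It assumes GCC or Clang, since it relies on __builtin_memset() and on an empty inline-asm statement to stop the optimizer from discarding the fill as a dead store. Note that current GCC normally expands a plain memset() call as the builtin unless -fno-builtin is given, so the two timers may well produce the same code.

```c
#include <string.h>
#include <time.h>

#ifndef BUFFER_SIZE
#define BUFFER_SIZE 256
#endif

/*
 * GCC/Clang-specific: an empty asm statement that claims to read the
 * buffer and clobber memory, so the preceding fill cannot be removed.
 */
#define KEEP(buf) __asm__ __volatile__("" : : "r"((char *) (buf)) : "memory")

/*
 * Define a function that runs FILL(buffer, 0, sizeof(buffer)) the given
 * number of times and returns the elapsed CPU time in seconds.
 */
#define DEFINE_TIMER(name, FILL) \
static double \
name(long iterations) \
{ \
	char		buffer[BUFFER_SIZE]; \
	long		i; \
	clock_t		begin = clock(); \
\
	for (i = 0; i < iterations; i++) \
	{ \
		FILL(buffer, 0, sizeof(buffer)); \
		KEEP(buffer); \
	} \
	return (double) (clock() - begin) / CLOCKS_PER_SEC; \
}

DEFINE_TIMER(time_memset, memset)
DEFINE_TIMER(time_builtin_memset, __builtin_memset)
```

Calling each timer with 99000000 iterations at -DBUFFER_SIZE=256 -O2 would mirror the run above. A third timer for MemSet() could be defined the same way once postgres.h is included.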

BTW, regarding the newNode() stuff: so is it agreed that Bruce's patch
is a performance win without too high of a code bloat / uglification
penalty? If so, is it 7.3 or 7.4 material?

Cheers,

Neil

[1] Not that I really buy that -- for one thing, if the length is
constant, as it is in this case, the compiler can substitute an
optimized version of the function for the appropriate memory size. I'm
having a little difficulty explaining GCC/glibc's poor performance...

-- 
Neil Conway <neilc(at)samurai(dot)com> || PGP Key ID: DB3C29FC

