Re: FunctionCallN improvement.

From: Darcy Buskermolen <darcy(at)wavefire(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: a_ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp>, Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FunctionCallN improvement.
Date: 2005-02-01 22:10:35
Lists: pgsql-hackers
On February 1, 2005 01:23 pm, Tom Lane wrote:
> a_ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp> writes:
> > I made the test program to measure the effect of this macro.
> Well, if we're going to be tense about this, let's actually be tense
> about it.  Your test program isn't a great model for what's going to
> happen in fmgr.c, because you've designed it so that Nargs cannot be
> known at compile time.  In the fmgr routines, Nargs is certainly a
> compile-time constant, and so implementations that can exploit that
> will have an advantage.
> Also, we can take advantage of some improvements in the MemSet macro
> family that occurred since fmgr.c was last rewritten.  I see no reason
> not to use MemSetLoop directly, since the fcinfo struct will have the
> correct size and correct alignment.
> In addition to your original macro, I tried two other variants: one
> that uses MemSetLoop with a loop length rounded to the next higher
> multiple of 4, and one that expects the argisnull settings to be written
> out directly, in the same style as is currently done in FunctionCall1
> and FunctionCall2.  (This amounts to unrolling the loop in the original
> macro; something that could be done by the compiler given a constant
> Nargs, but it seems not to be done by the compilers I tested.)
> I tested two cases: NARGS = 2, which is certainly the single most
> critical case, and NARGS = 5, which is probably the largest number
> of arguments that we really care too much about.  (You have to hand-edit
> the test program and recompile to adjust NARGS, since the point is to
> treat it as a compile-time constant.)
> Here are wall-clock timings on the architectures and compilers I have at
> hand:
> NARGS = 2
>                   MemSetLoop  OrigMacro   SetMacro    Unrolled
> i386, gcc -O2     37.655s     6.411s      7.060s      6.362s
> i386, gcc -O6     35.420s     1.129s      1.814s      0.567s
> PPC, gcc -O2      54.033s     6.754s      11.138s     6.438s
> HPPA, gcc -O2     58.82s      10.38s      9.79s       7.85s
> HPPA, cc +O2      60.39s      13.43s      8.40s       7.31s
>
> NARGS = 5
>                   MemSetLoop  OrigMacro   SetMacro    Unrolled
> i386, gcc -O2     37.566s     11.329s     7.688s      8.874s
> i386, gcc -O6     32.992s     5.928s      2.881s      0.566s
> PPC, gcc -O2      86.300s     19.048s     14.626s     8.751s
> HPPA, gcc -O2     58.28s      15.09s      13.42s      14.37s
> HPPA, cc +O2      58.23s      8.96s       12.88s      7.28s

I see similar comparative times on an UltraSparc running Solaris.

> (I used different loop counts on the different machines to get similar
> overall times for the memset case; so it's OK to compare numbers across
> a row but not down a column.)
> Based on this I think we ought to go with the "unrolled" approach, ie,
> we'll create a macro to initialize the fixed fields of fcinfo but fill
> in the arg and argisnull arrays with code like what's already in
> FunctionCall2:
> 	fcinfo.arg[0] = arg1;
> 	fcinfo.arg[1] = arg2;
> 	fcinfo.argnull[0] = false;
> 	fcinfo.argnull[1] = false;
> If anyone would like to try the results on other platforms, my test
> program is attached.
> 			regards, tom lane
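
For anyone who wants to see the shape of the "unrolled" approach outside of
fmgr.c, here is a minimal standalone sketch. It is not the actual PostgreSQL
code: the struct is a simplified stand-in for fcinfo, and SketchCallInfo,
SketchDatum, SKETCH_MAX_ARGS, SKETCH_INIT_FIXED, sketch_call2 and sketch_add
are all hypothetical names made up for illustration. The idea is only that a
macro sets the fixed fields, and the arg/argnull slots that are actually used
get written out by hand, so the rest of the arrays are never touched.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical array limit for this sketch; not PostgreSQL's real value. */
#define SKETCH_MAX_ARGS 32

typedef uintptr_t SketchDatum;

/* Simplified stand-in for the fcinfo struct discussed in the thread. */
typedef struct SketchCallInfo
{
    void       *flinfo;
    void       *context;
    void       *resultinfo;
    bool        isnull;
    short       nargs;
    SketchDatum arg[SKETCH_MAX_ARGS];
    bool        argnull[SKETCH_MAX_ARGS];
} SketchCallInfo;

/* Initialize only the fixed fields; the two arrays are left untouched. */
#define SKETCH_INIT_FIXED(fc, fl, n) \
    do { \
        (fc).flinfo = (fl); \
        (fc).context = NULL; \
        (fc).resultinfo = NULL; \
        (fc).isnull = false; \
        (fc).nargs = (n); \
    } while (0)

typedef SketchDatum (*SketchFn) (SketchCallInfo *fcinfo);

/* Two-argument call: the per-argument stores are written out explicitly. */
static SketchDatum
sketch_call2(SketchFn fn, SketchDatum arg1, SketchDatum arg2)
{
    SketchCallInfo fcinfo;

    SKETCH_INIT_FIXED(fcinfo, NULL, 2);
    fcinfo.arg[0] = arg1;
    fcinfo.arg[1] = arg2;
    fcinfo.argnull[0] = false;
    fcinfo.argnull[1] = false;

    return fn(&fcinfo);
}

/* Example callee: adds its two arguments. */
static SketchDatum
sketch_add(SketchCallInfo *fcinfo)
{
    return fcinfo->arg[0] + fcinfo->arg[1];
}
```

With these definitions, sketch_call2(sketch_add, 40, 2) returns 42. Because
the argument count is fixed in each such function, the compiler sees only a
handful of plain stores instead of a loop or a memset over the whole struct.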

Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
