Re: FunctionCallN improvement.

From: Darcy Buskermolen <darcy(at)wavefire(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: a_ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp>, Neil Conway <neilc(at)samurai(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FunctionCallN improvement.
Date: 2005-02-01 22:10:35
Message-ID: 200502011410.35048.darcy@wavefire.com
Lists: pgsql-hackers

On February 1, 2005 01:23 pm, Tom Lane wrote:
> a_ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp> writes:
> > I made the test program to measure the effect of this macro.
>
> Well, if we're going to be tense about this, let's actually be tense
> about it. Your test program isn't a great model for what's going to
> happen in fmgr.c, because you've designed it so that Nargs cannot be
> known at compile time. In the fmgr routines, Nargs is certainly a
> compile-time constant, and so implementations that can exploit that
> will have an advantage.
>
> Also, we can take advantage of some improvements in the MemSet macro
> family that occurred since fmgr.c was last rewritten. I see no reason
> not to use MemSetLoop directly, since the fcinfo struct will have the
> correct size and correct alignment.
>
> In addition to your original macro, I tried two other variants: one
> that uses MemSetLoop with a loop length rounded to the next higher
> multiple of 4, and one that expects the argisnull settings to be written
> out directly, in the same style as is currently done in FunctionCall1
> and FunctionCall2. (This amounts to unrolling the loop in the original
> macro; something that could be done by the compiler given a constant
> Nargs, but it seems not to be done by the compilers I tested.)
>
> I tested two cases: NARGS = 2, which is certainly the single most
> critical case, and NARGS = 5, which is probably the largest number
> of arguments that we really care too much about. (You have to hand-edit
> the test program and recompile to adjust NARGS, since the point is to
> treat it as a compile-time constant.)
>
> Here are wall-clock timings on the architectures and compilers I have at
> hand:
>
> NARGS = 2
>                   MemSetLoop  OrigMacro  SetMacro  Unrolled
>
> i386, gcc -O2     37.655s     6.411s     7.060s    6.362s
> i386, gcc -O6     35.420s     1.129s     1.814s    0.567s
> PPC, gcc -O2      54.033s     6.754s     11.138s   6.438s
> HPPA, gcc -O2     58.82s      10.38s     9.79s     7.85s
> HPPA, cc +O2      60.39s      13.43s     8.40s     7.31s
>
> NARGS = 5
>                   MemSetLoop  OrigMacro  SetMacro  Unrolled
>
> i386, gcc -O2     37.566s     11.329s    7.688s    8.874s
> i386, gcc -O6     32.992s     5.928s     2.881s    0.566s
> PPC, gcc -O2      86.300s     19.048s    14.626s   8.751s
> HPPA, gcc -O2     58.28s      15.09s     13.42s    14.37s
> HPPA, cc +O2      58.23s      8.96s      12.88s    7.28s

I see similar comparative times on an UltraSparc running Solaris.

>
> (I used different loop counts on the different machines to get similar
> overall times for the memset case; so it's OK to compare numbers across
> a row but not down a column.)
>
> Based on this I think we ought to go with the "unrolled" approach, ie,
> we'll create a macro to initialize the fixed fields of fcinfo but fill
> in the arg and argisnull arrays with code like what's already in
> FunctionCall2:
>
> fcinfo.arg[0] = arg1;
> fcinfo.arg[1] = arg2;
> fcinfo.argnull[0] = false;
> fcinfo.argnull[1] = false;
>
> If anyone would like to try the results on other platforms, my test
> program is attached.
>
> regards, tom lane
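
For anyone following along, here is a rough sketch of what the "unrolled" approach described above might look like. It is not the actual fmgr.c code: the struct is a simplified stand-in for the real fcinfo struct, and every name prefixed with Sketch/sketch_/SKETCH_ is hypothetical and used for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ARGS 32          /* hypothetical upper bound on arguments */

typedef uintptr_t SketchDatum;      /* hypothetical stand-in for Datum */

/* Simplified stand-in for the fcinfo struct; NOT the real definition. */
typedef struct SketchCallInfo
{
    void       *flinfo;             /* lookup info for the called function */
    void       *context;            /* optional call context */
    void       *resultinfo;         /* optional extra result info */
    bool        isnull;             /* set by the callee for a NULL result */
    short       nargs;              /* number of arguments actually passed */
    SketchDatum arg[SKETCH_MAX_ARGS];
    bool        argnull[SKETCH_MAX_ARGS];
} SketchCallInfo;

typedef SketchDatum (*SketchFunc) (SketchCallInfo *fcinfo);

/*
 * Initialize only the fixed fields.  The arg[] and argnull[] arrays are
 * deliberately left untouched, so no memset over the whole struct is needed.
 */
#define SKETCH_INIT_FCINFO(fc, finfo, n) \
    do { \
        (fc).flinfo = (finfo); \
        (fc).context = NULL; \
        (fc).resultinfo = NULL; \
        (fc).isnull = false; \
        (fc).nargs = (n); \
    } while (0)

/*
 * Two-argument call in the "unrolled" style: the argument slots are
 * written out one by one, so the argument count is a compile-time constant.
 */
static SketchDatum
sketch_call2(SketchFunc fn, SketchDatum arg1, SketchDatum arg2)
{
    SketchCallInfo fcinfo;

    SKETCH_INIT_FCINFO(fcinfo, NULL, 2);
    fcinfo.arg[0] = arg1;
    fcinfo.arg[1] = arg2;
    fcinfo.argnull[0] = false;
    fcinfo.argnull[1] = false;

    return fn(&fcinfo);
}

/* Hypothetical callee used only to exercise the sketch. */
static SketchDatum
sketch_add(SketchCallInfo *fcinfo)
{
    return fcinfo->arg[0] + fcinfo->arg[1];
}
```

Calling sketch_call2(sketch_add, 2, 3) touches only the two slots it uses. The trade-off is one hand-written block per argument count instead of a single loop.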

--
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx: 250.763.1759
http://www.wavefire.com
