Re: FunctionCallN improvement.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: a_ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp>
Cc: Neil Conway <neilc(at)samurai(dot)com>,pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: FunctionCallN improvement.
Date: 2005-02-01 21:23:56
Message-ID: 19054.1107293036@sss.pgh.pa.us
Lists: pgsql-hackers
a_ogawa <a_ogawa(at)hi-ho(dot)ne(dot)jp> writes:
> I made the test program to measure the effect of this macro. 

Well, if we're going to be tense about this, let's actually be tense
about it.  Your test program isn't a great model for what's going to
happen in fmgr.c, because you've designed it so that Nargs cannot be
known at compile time.  In the fmgr routines, Nargs is certainly a
compile-time constant, and so implementations that can exploit that
will have an advantage.
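
To illustrate the distinction, here is a minimal sketch, not the fmgr.c code. DemoCallInfo, MAX_ARGS, CLEAR_ARGNULL and clear_argnull_runtime are hypothetical names invented for this example. The macro receives the count as a literal at each call site, so the compiler sees a fixed loop bound; the function receives it as a runtime parameter, which is roughly the situation the test program models.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ARGS 32		/* hypothetical stand-in for the real argument limit */

/* Hypothetical, cut-down stand-in for the fcinfo struct. */
typedef struct DemoCallInfo
{
	short		nargs;
	bool		argnull[MAX_ARGS];
} DemoCallInfo;

/*
 * The count is a macro argument.  Each call site passes a literal, so the
 * loop bound is known at compile time and the optimizer is free to unroll.
 */
#define CLEAR_ARGNULL(fc, N) \
	do { \
		int		i_; \
		for (i_ = 0; i_ < (N); i_++) \
			(fc)->argnull[i_] = false; \
		(fc)->nargs = (N); \
	} while (0)

/*
 * The count is a runtime parameter.  Unless this is inlined into a caller
 * that passes a constant, the loop bound is unknown to the compiler.
 */
void
clear_argnull_runtime(DemoCallInfo *fc, int nargs)
{
	int			i;

	assert(nargs >= 0 && nargs <= MAX_ARGS);
	for (i = 0; i < nargs; i++)
		fc->argnull[i] = false;
	fc->nargs = (short) nargs;
}

/* Returns 1 if both paths leave identical state for a two-argument call. */
int
demo_same_result(void)
{
	DemoCallInfo a, b;
	int			i;

	for (i = 0; i < MAX_ARGS; i++)
		a.argnull[i] = b.argnull[i] = true;

	CLEAR_ARGNULL(&a, 2);
	clear_argnull_runtime(&b, 2);

	if (a.nargs != b.nargs)
		return 0;
	for (i = 0; i < MAX_ARGS; i++)
		if (a.argnull[i] != b.argnull[i])
			return 0;
	return 1;
}
```

Both paths produce the same result; the only difference is how much the compiler knows about the loop bound when it generates code.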

Also, we can take advantage of some improvements in the MemSet macro
family that occurred since fmgr.c was last rewritten.  I see no reason
not to use MemSetLoop directly, since the fcinfo struct will have the
correct size and correct alignment.

In addition to your original macro, I tried two other variants: one
that uses MemSetLoop with a loop length rounded to the next higher
multiple of 4, and one that expects the argnull settings to be written
out directly, in the same style as is currently done in FunctionCall1
and FunctionCall2.  (This amounts to unrolling the loop in the original
macro; something that could be done by the compiler given a constant
Nargs, but it seems not to be done by the compilers I tested.)
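
As a rough sketch of the rounded-length idea only: zero_words below is a hypothetical stand-in for a word-at-a-time zeroing loop, not the actual MemSetLoop macro from c.h, and ROUND_UP_4, CLEAR_ARGNULL_ROUNDED and MAX_ARGS are likewise names made up for this example. It assumes sizeof(bool) is 1 and that the flag array length is a multiple of 4, so that rounding up never writes past the end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ARGS 32		/* hypothetical; a multiple of 4, so rounding is safe */

/* Round a byte count up to the next multiple of 4. */
#define ROUND_UP_4(n) (((n) + 3) & ~3)

_Static_assert(sizeof(bool) == 1, "sketch assumes one-byte bool");

/*
 * Hypothetical stand-in for a word-at-a-time zeroing loop.  The 4-byte
 * memcpy is used so the sketch stays within the C aliasing rules;
 * optimizing compilers normally turn it into a plain word store.
 */
static inline void
zero_words(void *start, size_t len)
{
	char	   *p = start;
	char	   *stop = p + len;
	const uint32_t zero = 0;

	assert(len % sizeof(uint32_t) == 0);
	while (p < stop)
	{
		memcpy(p, &zero, sizeof(zero));
		p += sizeof(zero);
	}
}

/* Clear N flags, rounding the byte length up to a whole number of words. */
#define CLEAR_ARGNULL_ROUNDED(flags, N) \
	zero_words((flags), (size_t) ROUND_UP_4(N))

/* Returns 1 if clearing 5 flags zeroes slots 0..7 and leaves slot 8 alone. */
int
demo_rounded_clear(void)
{
	bool		flags[MAX_ARGS];
	int			i;

	for (i = 0; i < MAX_ARGS; i++)
		flags[i] = true;

	CLEAR_ARGNULL_ROUNDED(flags, 5);	/* 5 rounds up to 8 bytes */

	return !flags[0] && !flags[4] && !flags[7] && flags[8];
}
```

The rounding clears a few flags beyond Nargs, which is harmless as long as nothing reads those slots before they are set.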

I tested two cases: NARGS = 2, which is certainly the single most
critical case, and NARGS = 5, which is probably the largest number
of arguments that we really care too much about.  (You have to hand-edit
the test program and recompile to adjust NARGS, since the point is to
treat it as a compile-time constant.)

Here are wall-clock timings on the architectures and compilers I have at
hand:

NARGS = 2
                 MemSetLoop   OrigMacro   SetMacro   Unrolled
i386, gcc -O2    37.655s      6.411s      7.060s     6.362s
i386, gcc -O6    35.420s      1.129s      1.814s     0.567s
PPC, gcc -O2     54.033s      6.754s      11.138s    6.438s
HPPA, gcc -O2    58.82s       10.38s      9.79s      7.85s
HPPA, cc +O2     60.39s       13.43s      8.40s      7.31s

NARGS = 5
                 MemSetLoop   OrigMacro   SetMacro   Unrolled
i386, gcc -O2    37.566s      11.329s     7.688s     8.874s
i386, gcc -O6    32.992s      5.928s      2.881s     0.566s
PPC, gcc -O2     86.300s      19.048s     14.626s    8.751s
HPPA, gcc -O2    58.28s       15.09s      13.42s     14.37s
HPPA, cc +O2     58.23s       8.96s       12.88s     7.28s

(I used different loop counts on the different machines to get similar
overall times for the memset case; so it's OK to compare numbers across
a row but not down a column.)

Based on this I think we ought to go with the "unrolled" approach, i.e.,
we'll create a macro to initialize the fixed fields of fcinfo but fill
in the arg and argnull arrays with code like what's already in
FunctionCall2:

	fcinfo.arg[0] = arg1;
	fcinfo.arg[1] = arg2;
	fcinfo.argnull[0] = false;
	fcinfo.argnull[1] = false;
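
For concreteness, a self-contained sketch of how that might look end to end. This is an illustration under assumptions, not the patch: DemoCallInfo is a cut-down stand-in for the real struct, and DemoDatum, DemoFunc, INIT_CALL_INFO, demo_call2 and demo_add are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ARGS 32			/* hypothetical stand-in for the real limit */

typedef long DemoDatum;		/* hypothetical stand-in for Datum */

/* Hypothetical, cut-down stand-in for the fcinfo struct. */
typedef struct DemoCallInfo
{
	void	   *flinfo;		/* fixed fields, set by the macro below */
	void	   *context;
	void	   *resultinfo;
	bool		isnull;
	short		nargs;
	DemoDatum	arg[MAX_ARGS];		/* per-call fields, set by the caller */
	bool		argnull[MAX_ARGS];
} DemoCallInfo;

typedef DemoDatum (*DemoFunc) (DemoCallInfo *fcinfo);

/* Initialize only the fixed fields; arg[] and argnull[] are left alone. */
#define INIT_CALL_INFO(fc, Flinfo, Nargs) \
	do { \
		(fc).flinfo = (Flinfo); \
		(fc).context = NULL; \
		(fc).resultinfo = NULL; \
		(fc).isnull = false; \
		(fc).nargs = (Nargs); \
	} while (0)

/* Two-argument call with the array stores written out by hand. */
DemoDatum
demo_call2(DemoFunc fn, DemoDatum arg1, DemoDatum arg2)
{
	DemoCallInfo fcinfo;
	DemoDatum	result;

	INIT_CALL_INFO(fcinfo, NULL, 2);
	fcinfo.arg[0] = arg1;
	fcinfo.arg[1] = arg2;
	fcinfo.argnull[0] = false;
	fcinfo.argnull[1] = false;

	result = fn(&fcinfo);
	assert(!fcinfo.isnull);		/* this sketch does not handle NULL results */
	return result;
}

/* Example callee: adds its two arguments. */
DemoDatum
demo_add(DemoCallInfo *fcinfo)
{
	return fcinfo->arg[0] + fcinfo->arg[1];
}
```

With these definitions, demo_call2(demo_add, 2, 3) evaluates to 5. Only the slots the callee is entitled to read are ever written, so there is no loop whose bound the compiler has to reason about.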

If anyone would like to try the results on other platforms, my test
program is attached.

			regards, tom lane
