Re: [PATCHES] MemSet inline for newNode

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] MemSet inline for newNode
Date: 2002-11-10 23:07:34
Message-ID: 200211102307.gAAN7Y611251@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Further update:

I ran a test and compared the sizes of the postgres binary: 2926925
with palloc0, 2931293 without, a 0.14% decrease. Not very much,
considering the 1-14% speedup of MemSet with a constant.

Tom, I really didn't think those conditional tests would be significant,
but clearly they are --- call me surprised.

Back out palloc0()?

---------------------------------------------------------------------------

Bruce Momjian wrote:
>
> OK, my compiler, gcc 2.95 does the optimization at -O2, so I ran the
> tests here with the attached program, testing a constant of 256 and a
> variable 'j' equal to 256.
>
> I found a 1.7% slowdown with the variable compared to the constant.
> The 256 value is rather large. For a length of 32, the difference was
> 14%!
>
> So, seems I should back out the palloc0 usage and let the MemSet inline
> at each location. Does that seems like the direction to head?
>
> ---------------------------------------------------------------------------
>
> Bruce Momjian wrote:
> > Tom Lane wrote:
> > > Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > > > I have applied this patch to inline MemSet for newNode.
> > > > I will make another commit to make more general use of palloc0.
> > >
> > > Refresh my memory on why either of these was a good idea?
> >
> > You are talking about the use of palloc0 in place of palloc/MemSet(0),
> > not the use of palloc0 in newNode, right?
> >
> > > > This was already discussed on hackers.
> > >
> > > And this was not the approach agreed to, IIRC. What you've done has
> > > eliminated the possibility of optimizing away the controlling tests
> > > in MemSet.
> >
> > I see now:
> >
> > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&selm=200210120006.g9C06i502964%40candle.pha.pa.us
> >
> > I thought someone had tested that there was little/no performance
> > difference between these two statements:
> >
> > MemSet(ptr, 0, 256)
> >
> > vs.
> >
> > i = 256;
> > MemSet(ptr, 0, i)
> >
> > However, seeing no email reply, I now assume no test was done.
> >
> > Neil, can you run such a test and let us know. It has to be with a
> > compiler that optimizes out the MemSet length test when the len is a
> > constant. I can't remember what compiler version/settings that was, but
> > it may have been -O3 gcc 3.20.
> >
> > I can back out my changes, but it would be easier to see if there is a
> > benefit before doing that. Personally, I am going to be surprised if a
> > single conditional tests in MemSet (which is not in the assignment loop)
> > will make any measurable difference.
> >
> > --
> > Bruce Momjian | http://candle.pha.pa.us
> > pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> > + If your life is a hard drive, | 13 Roberts Road
> > + Christ can be your backup. | Newtown Square, Pennsylvania 19073
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
> >
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
> + If your life is a hard drive, | 13 Roberts Road
> + Christ can be your backup. | Newtown Square, Pennsylvania 19073

> #define INT_ALIGN_MASK (sizeof(int) - 1)
> #include <stdlib.h>
>
> /*
> * MemSet
> * Exactly the same as standard library function memset(), but considerably
> * faster for zeroing small word-aligned structures (such as parsetree nodes).
> * This has to be a macro because the main point is to avoid function-call
> * overhead. However, we have also found that the loop is faster than
> * native libc memset() on some platforms, even those with assembler
> * memset() functions. More research needs to be done, perhaps with
> * platform-specific MEMSET_LOOP_LIMIT values or tests in configure.
> *
> * bjm 2002-10-08
> */
> #define MemSet(start, val, len) \
> do \
> { \
> int * _start = (int *) (start); \
> int _val = (val); \
> size_t _len = (len); \
> \
> if ((((long) _start) & INT_ALIGN_MASK) == 0 && \
> (_len & INT_ALIGN_MASK) == 0 && \
> _val == 0 && \
> _len <= MEMSET_LOOP_LIMIT) \
> { \
> int * _stop = (int *) ((char *) _start + _len); \
> while (_start < _stop) \
> *_start++ = 0; \
> } \
> else \
> memset((char *) _start, _val, _len); \
> } while (0)
>
> #define MEMSET_LOOP_LIMIT 1024
>
> int
> main(int argc, char **argv)
> {
> int i,j;
> char *buf;
>
> buf = malloc(256);
>
> j = 32;
> for (i = 0; i < 1000000; i++)
> MemSet(buf, 0, j /* choose 256 or j here */);
> return 0;
> }

>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2002-11-11 02:50:03 Re: MemSet inline for newNode
Previous Message GB Clark 2002-11-10 22:35:26 Ignore prior post --> Where is src/interfaces/perl5 in beta 5?

Browse pgsql-patches by date

  From Date Subject
Next Message Neil Conway 2002-11-10 23:09:09 more SGML fixes
Previous Message Neil Conway 2002-11-10 21:16:25 minor SGML fix