Skip site navigation (1) Skip section navigation (2)

Re: [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pg1(at)thetdh(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH v4] Avoid manual shift-and-test logic in AllocSetFreeIndex
Date: 2009-07-21 16:07:06
Message-ID: 14383.1248192426@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-hackers
pg(at)thetdh(dot)com writes:
>  Normally I'd try a small lookup table (1-byte index to 1-byte value)
>  in this case. But if the bitscan instruction were even close in
>  performance, it'd be preferable, due to its more-reliable caching
>  behavior;

Well, the problem with the bitscan instruction is we won't have it at
all on a lot of configurations.  I don't think this problem is so
important as to justify *two* hacked-up code paths.

> The specific code for large-versus-small testing would be useful; did I overlook it?

Sorry, I should have provided that.  What I've tested is


#define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n

		static const char LogTable256[256] = 
			{
				0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4,
				LT(5), LT(6), LT(6), LT(7), LT(7), LT(7), LT(7),
				LT(8), LT(8), LT(8), LT(8), LT(8), LT(8), LT(8), LT(8)
			};


int
AllocSetFreeIndex_lt(size_t size)
{
	int                     idx = 0;

	if (size > (1 << ALLOC_MINBITS))
	{
		unsigned int t, tt; // temporaries

		size = (size - 1) >> ALLOC_MINBITS;

		if ((tt = (size >> 16)))
		{
			idx = (t = tt >> 8) ? 24 + LogTable256[t] : 16 + LogTable256[tt];
		}
		else 
		{
			idx = (t = size >> 8) ? 8 + LogTable256[t] : LogTable256[size];
		}

	}

	return idx;
}

int
AllocSetFreeIndex_lts(size_t size)
{
	int                     idx = 0;

	if (size > (1 << ALLOC_MINBITS))
	{
		unsigned int t; // temporaries

		size = (size - 1) >> ALLOC_MINBITS;

		t = size >> 8;
		idx = t ? 8 + LogTable256[t] : LogTable256[(unsigned int) size];
	}

	return idx;
}

plus the obvious additions to the other file for an additional test
type.

> Note that instruction alignment with respect to words is not the only
> potential instruction-alignment issue. In the past, when optimizing
> code to an extreme, I've run into cache-line issues where a small
> change that should've produced a small improvement resulted in a
> largish performance loss, without further work. Lookup tables can have
> an analogous issue; this could, in a simplistic test, explain an
> anomalous large-better-than-small result, if part of the large lookup
> table remains cached. (Do any modern CPUs attempt to address this??)

Yeah, I've seen similar things when trying to micro-optimize code.
That's why it's so critical to get results from multiple
platforms/machines before trusting the answers too much.

> This is difficult to tune in a multiplatform code base, so the numbers
> in a particular benchmark do not tell the whole tale; you'd need to
> make a judgment call, and perhaps to allow a code-configuration
> override.

Well, my judgment call is that we should go with the lookup table.
So far, the only machine where that doesn't seem to provide a nice win
is my old Mac G4 laptop, which is hardly a platform that people are
likely to be stressing about database performance on.  Jeremy's PPC
machines like it a lot more, so there's not something fundamentally
wrong with the approach for that architecture.  The other alternatives
all have significantly greater downsides.

			regards, tom lane

In response to

pgsql-hackers by date

Next:From: Alexey KlyukinDate: 2009-07-21 16:15:29
Subject: Re: errcontext support in PL/Perl
Previous:From: Alvaro HerreraDate: 2009-07-21 15:39:47
Subject: Re: errcontext support in PL/Perl

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group