Re: New to PostgreSQL, performance considerations

From: Ron <rjpeace(at)earthlink(dot)net>
To: Greg Smith <gsmith(at)gregsmith(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: New to PostgreSQL, performance considerations
Date: 2006-12-15 16:53:28
Message-ID: E1GvGJR-0006HJ-OQ@elasmtp-mealy.atl.sa.earthlink.net
Lists: pgsql-performance

At 09:50 AM 12/15/2006, Greg Smith wrote:
>On Fri, 15 Dec 2006, Merlin Moncure wrote:
>
>>The slower is probably due to the unroll loops switch which can
>>actually hurt code due to the larger footprint (less cache coherency).
>
>The cache issues are so important with current processors that I'd
>suggest throwing -Os (optimize for size) into the mix people
>test. That one may stack usefully with -O2, but probably not with
>-O3 (3 includes optimizations that increase code size).

-Os
Optimize for size. -Os enables all -O2 optimizations that do not
typically increase code size. It also performs further optimizations
designed to reduce code size.

-Os disables the following optimization flags:
-falign-functions -falign-jumps -falign-loops -falign-labels
-freorder-blocks -freorder-blocks-and-partition
-fprefetch-loop-arrays
-ftree-vect-loop-version

Hmmm. That list of disabled flags bears thought.

-falign-functions -falign-jumps -falign-loops -falign-labels

1= Most RISC CPUs' performance is very sensitive to misalignment
issues. I would not recommend turning these off.

-freorder-blocks
Reorder basic blocks in the compiled function in order to reduce
number of taken branches and improve code locality.

Enabled at levels -O2, -O3.

-freorder-blocks-and-partition
In addition to reordering basic blocks in the compiled function, in
order to reduce number of taken branches, partitions hot and cold
basic blocks into separate sections of the assembly and .o files, to
improve paging and cache locality performance.

This optimization is automatically turned off in the presence of
exception handling, for link once sections, for functions with a
user-defined section attribute and on any architecture that does not
support named sections.

2= Most RISC CPUs are cranky about branchy code and (lack of) cache
locality. Wouldn't suggest punting these either.

-fprefetch-loop-arrays
If supported by the target machine, generate instructions to prefetch
memory to improve the performance of loops that access large arrays.

This option may generate better or worse code; results are highly
dependent on the structure of loops within the source code.

3= OTOH, this one looks worth experimenting with turning off.

-ftree-vect-loop-version
Perform loop versioning when doing loop vectorization on trees. When
a loop appears to be vectorizable except that data alignment or data
dependence cannot be determined at compile time then vectorized and
non-vectorized versions of the loop are generated along with runtime
checks for alignment or dependence to control which version is
executed. This option is enabled by default except at level -Os where
it is disabled.

4= ...and this one looks like a 50/50 shot.
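
Points 1-4 amount to a small test matrix: start from -Os, put the alignment and reordering flags back, and try the prefetch and loop-versioning flags both ways. A minimal sketch of that matrix in Python follows. The flag names are the ones quoted above; the variant labels and the function name candidate_cflags are made up for illustration, and whether a given GCC version honors these -f flags when they follow -Os is an assumption to verify against that compiler's documentation.

```python
# Flags that, per the GCC documentation quoted above, -Os disables.
ALIGN = ["-falign-functions", "-falign-jumps", "-falign-loops", "-falign-labels"]
REORDER = ["-freorder-blocks", "-freorder-blocks-and-partition"]
PREFETCH = ["-fprefetch-loop-arrays"]
LOOP_VERSION = ["-ftree-vect-loop-version"]


def candidate_cflags():
    """Return (label, CFLAGS string) pairs worth benchmarking.

    Labels are hypothetical names for this sketch, not GCC terminology.
    """
    base = ["-Os"]
    keep = ALIGN + REORDER  # points 1 and 2: re-enable on top of -Os
    variants = [
        ("O2 baseline", ["-O2"]),
        ("Os as-is", base),
        ("Os + align + reorder", base + keep),
        ("Os + align + reorder + prefetch", base + keep + PREFETCH),
        ("Os + align + reorder + loop-version", base + keep + LOOP_VERSION),
        ("Os + everything back", base + keep + PREFETCH + LOOP_VERSION),
    ]
    return [(label, " ".join(flags)) for label, flags in variants]


if __name__ == "__main__":
    # Print one CFLAGS setting per line, ready to paste into a build.
    for label, cflags in candidate_cflags():
        print(f"{label:40s} CFLAGS='{cflags}'")
```

Each printed CFLAGS string would be used for one full rebuild and one benchmark run, so that differences can be attributed to a single group of flags.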

Ron Peacetree
