Re: New to PostgreSQL, performance considerations

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Michael Stone" <mstone+postgres(at)mathom(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: New to PostgreSQL, performance considerations
Date: 2006-12-11 18:30:55
Message-ID: C1A2E3DF.154B6%llonergan@greenplum.com
Lists: pgsql-performance

Michael,

On 12/11/06 9:31 AM, "Michael Stone" <mstone+postgres(at)mathom(dot)us> wrote:

> [1] I will say that I have never seen a realistic benchmark of general
> code where the compiler flags made a statistically significant
> difference in the runtime.

Here's one - I wrote a general purpose Computational Fluid Dynamics analysis
method used by hundreds of people to perform aircraft and propulsion systems
analysis. Compiler flag tuning would speed it up by factors of 2-3 or even
more on some architectures. The reason it was so effective is that the
structure of the code was designed to be general, but also to expose the
critical performance sections in a way that the compilers could use - deep
pipelining/vectorization, unrolling, etc, were carefully made easy for the
compilers to exploit in critical sections. Yes, this made the code in those
sections harder to read, but it was a common practice because it might take
weeks of runtime to get an answer and performance mattered.
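
As a rough illustration of what "made easy for the compilers" can mean, here
is a minimal sketch in C. It is not taken from the CFD code; the function
name and signature are hypothetical. The inner loop is a simple counted loop
over contiguous arrays, and `restrict` (C99) promises the compiler that the
arrays do not overlap, which is the kind of information unrolling and
vectorization passes typically need.

```c
#include <stddef.h>

/* Hypothetical kernel: y[i] = a * x[i] + y[i] over contiguous arrays.
 * 'restrict' asserts that x and y do not alias, so the compiler is free
 * to unroll and vectorize the loop when optimization flags allow it. */
void axpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Whether a given compiler actually vectorizes this depends on the target and
the flags used, which is exactly where flag tuning comes in.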

The problem I see with general purpose DBMS code the way it's structured in
pgsql (and others) is that many of the critical performance sections are
embedded in abstract interfaces that obscure them from optimization. For
example, a simple "is equal to" operation has many layers surrounding it to
ensure that UDFs can be declared and that special comparison semantics can
be accommodated. But if you're simply performing a large number of INT vs.
INT comparisons, it will be thousands of times slower than a CPU native
operation because of the function call overhead, etc. I've seen
presentations that show the IPC (instructions per cycle) of Postgres at
about 0.5, versus the 2-4 possible from the CPU.
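
A minimal sketch of the contrast, in C. This is not PostgreSQL's actual
function manager interface; `datum_t`, `eq_fn` and the function names are
hypothetical stand-ins for a generic value type and a pluggable comparison
operator. The generic path pays an indirect call per row, while the
specialized path is a plain loop the compiler can see through.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic value: an opaque machine word. */
typedef uintptr_t datum_t;

/* Hypothetical signature for a pluggable equality operator. */
typedef int (*eq_fn)(datum_t a, datum_t b);

/* Integer equality exposed through the generic interface. */
int int32_eq(datum_t a, datum_t b)
{
    return (int32_t)a == (int32_t)b;
}

/* Generic path: one indirect call per comparison. Unless the compiler
 * can prove which function 'eq' points to, it cannot inline the call. */
size_t count_equal_generic(const datum_t *vals, size_t n,
                           datum_t key, eq_fn eq)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (eq(vals[i], key))
            hits++;
    return hits;
}

/* Specialized path: the type is known at compile time, so each
 * comparison is a native integer compare inside a simple loop. */
size_t count_equal_int32(const int32_t *vals, size_t n, int32_t key)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += (vals[i] == key);
    return hits;
}
```

Both functions return the same count for the same data; the difference is
only in how much of the work is visible to the optimizer.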

Column databases like C-Store remove these abstractions at planner time to
expose native operations in large chunks to the compiler, and the IPC
reflects that - typically 1+ and as high as 2.5. If we were to redesign the
executor and planner to emulate that same structure, we could achieve similar
speedups and the compiler would matter more.
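
To make "native operations in large chunks" concrete, here is a minimal
sketch in C of a column-at-a-time filter followed by an aggregate. It is not
C-Store's implementation; the function names and the selection-vector layout
are assumptions made for illustration. Each operator runs a tight loop over
a whole chunk of one column with the type fixed in advance, instead of
dispatching per row through a generic interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Filter one column chunk: write the positions where col[i] == key into
 * 'sel' (which must have room for n entries) and return how many matched.
 * The store is unconditional and only the output cursor advances on a
 * match, which keeps the loop body free of branches. */
size_t select_eq_int32(const int32_t *col, size_t n, int32_t key,
                       uint32_t *sel)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        sel[out] = (uint32_t)i;
        out += (col[i] == key);
    }
    return out;
}

/* Aggregate a second column over the positions chosen by the filter. */
int64_t sum_selected_int32(const int32_t *col, const uint32_t *sel,
                           size_t nsel)
{
    int64_t total = 0;
    for (size_t i = 0; i < nsel; i++)
        total += col[sel[i]];
    return total;
}
```

Something like "SELECT sum(b) WHERE a = 5" would then be two calls per
chunk: select_eq_int32 on column a, then sum_selected_int32 on column b.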

- Luke
