Re: COPY FROM performance improvements

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Luke Lonergan <llonergan(at)greenplum(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alon Goldshuv <agoldshuv(at)greenplum(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: COPY FROM performance improvements
Date: 2005-08-10 08:15:00
Message-ID: 1123661700.3670.621.camel@localhost.localdomain
Lists: pgsql-patches pgsql-performance

On Tue, 2005-08-09 at 21:48 -0700, Luke Lonergan wrote:

> The key thing that is missing in this version is micro-parallelism in the
> character processing. By "inverting the loop", or putting
> the characters into a buffer on the outside, then doing fast character
> scanning inside with special "fix-up" cases, we exposed long runs of
> pipeline-able code to the compiler.
>
> I think there is another way to accomplish the same thing and still preserve
> the current structure, but it requires "strip mining" the character buffer
> into chunks that can be processed with an explicit loop to check for the
> different characters. While it may seem artificial (it is), it will provide
> the compiler with the ability to pipeline the character finding logic over
> long runs. The other necessary element is to avoid pipeline stalls
> from the "if" conditions as much as possible.

This is a key point, IMHO.

That part of the code was specifically written to take advantage of
processing pipelines in the hardware, not because the actual theoretical
algorithm for that approach was itself faster.
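
To make the "strip mining" idea quoted above concrete, here is a minimal
sketch in C. It is not the code from either patch; the function name
find_special, the strip size CHUNK, and the choice of special characters
(delimiter, newline, backslash) are assumptions made for illustration only.
The inner loop over each strip has no data-dependent branch, which is the
property that lets a compiler pipeline it; the "fix-up" pass runs only for
a strip that is known to contain a special character.

```c
#include <stddef.h>

#define CHUNK 64				/* hypothetical strip size, not a tuned value */

/*
 * Return the index of the first byte in buf[0..len) that equals delim,
 * '\n' or '\\'; return len if there is no such byte.
 */
static size_t
find_special(const char *buf, size_t len, char delim)
{
	size_t		pos = 0;

	while (pos < len)
	{
		size_t		n = (len - pos < CHUNK) ? len - pos : CHUNK;
		unsigned	hits = 0;
		size_t		i;

		/* Strip scan: comparisons are OR-ed together, no "if" per byte. */
		for (i = 0; i < n; i++)
		{
			char		c = buf[pos + i];

			hits |= (unsigned) (c == delim)
				| (unsigned) (c == '\n')
				| (unsigned) (c == '\\');
		}

		if (hits)
		{
			/* Fix-up case: find the exact position inside this strip. */
			for (i = 0; i < n; i++)
			{
				char		c = buf[pos + i];

				if (c == delim || c == '\n' || c == '\\')
					return pos + i;
			}
		}

		pos += n;
	}

	return len;
}
```

Whether this beats a plain byte-at-a-time loop depends on the compiler and
the hardware, which is why the platform matters.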

Nobody has said which compiler and hardware they have been using. Since
both Alon and Tom say their character-finding logic is faster, the
difference is likely to be down to that. Name your platforms, gentlemen,
please.

My feeling is that we may learn something here that applies more widely
across many parts of the code.

Best Regards, Simon Riggs
