Re: Help with bulk read performance

From:	Nick Matheson <Nick(dot)D(dot)Matheson(at)noaa(dot)gov>
To:	Andy Colson <andy(at)squeakycode(dot)net>
Cc:	Jim Nasby <jim(at)nasby(dot)net>, Daniel(dot)S(dot)Schaffer(at)noaa(dot)gov, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Help with bulk read performance
Date:	2010-12-14 16:07:24
Message-ID:	4D07963C.2050507@noaa.gov
Lists:	pgsql-performance

Hey all-

Glad to know you are still interested... ;)

Didn't mean to leave you hanging; the holiday and all have put some
bumps in the road.

Dan, my co-worker, might be able to post some more detailed information
here, but here is a brief summary of what I am aware of:

1. We have not tested any stored procedure/SPI based solutions to date.
2. The COPY API has been the best of the solutions explored to date.
3. We were able to get rates on the order of 35 MB/s with the original
problem this way.
4. Another variant of the problem we were working on included some
metadata fields and 300 float values. For this we tried three variants:
   a. 300 float values as columns
   b. 300 floats in a float array column
   c. 300 floats packed into a bytea column

Long story short, variants a and b largely performed the same. Variant c
was the winner and seems to have improved the throughput on two counts:

1. It reduces the data transmitted over the wire by a factor of two
(float columns and float arrays have a 2x overhead over the raw data
requirement).
2. This reduction seems to have reduced the CPU burden on the server
side, thus producing a better than expected 2x speedup.

I think the final numbers left us somewhere in the 80-90 MB/s range.
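
As a rough illustration of variant c and the 2x figure, here is a minimal
sketch in Python (not the JDBC code used for the numbers above). It assumes
4-byte (float32) values and a binary transfer in which every separately
transmitted value carries a 4-byte length word, as in PostgreSQL's binary
COPY format; neither the float width nor the transfer format is stated in
this message. The names pack_floats and unpack_floats are hypothetical.

```python
import struct

N_VALUES = 300  # floats per row, as in the variant described above


def pack_floats(values):
    """Pack values as big-endian float32 into one byte string (a bytea payload)."""
    return struct.pack(f">{len(values)}f", *values)


def unpack_floats(payload):
    """Inverse of pack_floats: recover the float32 values from the bytes."""
    count = len(payload) // 4
    return list(struct.unpack(f">{count}f", payload))


# Sample data chosen to be exactly representable in float32.
values = [i * 0.5 for i in range(N_VALUES)]
payload = pack_floats(values)

# Packed form: 4 bytes of data per value.
packed_size = len(payload)

# Assumption: sent as separate columns or array elements, each value
# carries a 4-byte length word in addition to its 4 bytes of data.
per_value_size = N_VALUES * (4 + 4)

print(packed_size, per_value_size)
```

Under those assumptions the packed payload is half the size of the
per-value form. The client has to unpack the bytes itself, so the byte
order and float width must be agreed on by writer and reader.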

Thanks again for all the input. If you have any other questions, let us
know. Also, if we get results for the stored procedure/SPI route we will
try to post them, but the improvements via standard JDBC are such that we
aren't really pressed at this point in time to get more throughput, so it
may not happen.

Cheers,

Nick
> On 12/14/2010 9:41 AM, Jim Nasby wrote:
>> On Dec 14, 2010, at 9:27 AM, Andy Colson wrote:
>>> Is this the same thing Nick is working on? How'd he get along?
>>>
>>> http://archives.postgresql.org/message-id/4CD1853F.2010800@noaa.gov
>>
>> So it is. The one I replied to stood out because no one had replied
>> to it; I didn't see the earlier email.
>> --
>> Jim C. Nasby, Database Architect jim(at)nasby(dot)net
>> 512.569.9461 (cell) http://jim.nasby.net
>>
>>
>>
>
> Oh.. I didn't even notice the date... I thought it was a new post.
>
> But still... (and I'll cc Nick on this) I'd love to hear an update on
> how this worked out.
>
> Did you get it to go fast? What'd you use? Did the project go over
> budget and did you all get fired? COME ON MAN! We need to know! :-)
>
> -Andy

In response to

Re: Help with bulk read performance at 2010-12-14 15:51:39 from Andy Colson

Responses

Re: Help with bulk read performance at 2010-12-14 16:39:38 from Jim Nasby
