
Re: Reading data in bulk - help?

From: Chris Huston <chuston(at)bangjafwac(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Reading data in bulk - help?
Date: 2003-09-10 20:59:50
Lists: pgsql-performance
On Wednesday, Sep 10, 2003, at 11:16 America/Denver, Josh Berkus wrote:

> What about explaining your disk setup?  Or mentioning postgresql.conf?  For
> somebody who wants help, you're ignoring a lot of advice and questions.
> Personally, I'm not going to be of any further help until you report back on
> the other 3 of 4 options.

EEEK! Peace. Sorry I didn't include that info in the response.

1) Memory - clumsily adjusted shared_buffers - tried three values: 64, 
128, 256 with no discernible change in performance. Also adjusted, 
clumsily, effective_cache_size to 1000, 2000, 4000 - with no 
discernible change in performance. I looked at the Admin manual and 
googled around for how to set these values and I confess I'm clueless 
here. I have no idea how many kernel disk page buffers are used, nor do 
I understand what the "shared memory buffers" are used for (although 
the postgresql.conf file hints that they're for communication between 
multiple connections). Any advice or pointers to articles/docs is 
appreciated.
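For concreteness, here's the kind of edit I was making in postgresql.conf (both settings are counted in 8 KB pages; these are just the values I tried, not recommendations):

```
# postgresql.conf - 7.x-era settings, units are 8 KB buffers/pages
shared_buffers = 256          # tried 64, 128, 256
effective_cache_size = 4000   # tried 1000, 2000, 4000
```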

2) Clustering - tried it - definite improvement - thanks for the tip
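For anyone following along, what I ran amounted to something like this (table and index names are placeholders from my schema, substitute your own):

```sql
-- Physically reorder the table on disk to match the index the bulk reads use
CLUSTER my_data_idx ON my_data;
-- The ordering is not maintained, so CLUSTER must be re-run after heavy updates
ANALYZE my_data;
```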

3) RAID - haven't tried it - but I'm guessing that the speed 
improvement from a RAID 5 may be on the order of 10x - which I can 
likely get from using something like HDF. Since the data is unlikely to 
grow beyond 10-20 gig, a fast drive on FireWire ought to give me the 
performance I need. I know experimentally that the current machine can 
sustain a 20 MB/s transfer rate, which is 20-30x the speed of these 
queries. (If there's any concern about my enthusiasm for postgres - no 
worries - I've been very happy with it on several projects - it might 
not be the right tool for this kind of job - but I haven't come to that 
conclusion yet.)

4) I'd previously commented out the output/writing steps from the app - 
to isolate read performance.

On Wednesday, Sep 10, 2003, at 05:47 America/Denver, Magnus Naeslund(w) wrote:
> How are you fetching the data?
> If you are using cursors, be sure to fetch a substantial bit at a time so
> that you're not punished by latency.
> I got a big speedup when I changed my original clueless code to fetch 64
> rows in a go instead of only one.
That's an excellent question... I hadn't thought about it. I'm using a 
JDBC connection... I have no idea (yet) how the results are moving 
between postgres and the client app. I'm testing once with the app and 
the DB on the same machine (to remove network latency) and once with 
db/app on separate machines. However, I wonder if postgres is blocking 
on network io (even if it's the loopback interface) and not on disk?!

I'll definitely look into it. Maybe I'll try a loop in psql and see 
what the performance looks like. Thanks Magnus.
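In case it helps anyone else reading this: my understanding is that the PostgreSQL JDBC driver only streams results through a cursor (rather than buffering the whole result set in memory) when autocommit is off and a fetch size is set. A sketch of what I'll try (the connection URL, credentials, and query are placeholders, not my real setup):

```java
import java.sql.*;

public class BulkRead {
    public static void main(String[] args) throws Exception {
        // Placeholder URL/credentials - substitute your own
        Connection con = DriverManager.getConnection(
            "jdbc:postgresql://localhost/mydb", "user", "pass");
        con.setAutoCommit(false);      // required for cursor-based fetching
        Statement st = con.createStatement();
        st.setFetchSize(64);           // fetch 64 rows per round trip
        ResultSet rs = st.executeQuery("SELECT * FROM my_data");
        while (rs.next()) {
            // process each row here
        }
        rs.close();
        st.close();
        con.close();
    }
}
```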

On Wednesday, Sep 10, 2003, at 07:05 America/Denver, Sean McCorkle wrote:
>     I ended up solving the problem by going "retro" and using the
>     quasi-database functions of unix and flat files:  grep, sort,
>     uniq and awk.
That's a cool KISS approach. If I end up moving out of postgres I'll 
speed test this approach against HDF. Thanks.
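For what it's worth, that flat-file approach sketches out to a few lines of shell (the file name, columns, and data here are made up for illustration - roughly a "GROUP BY column 2, ORDER BY count DESC"):

```shell
# Build a tiny sample "table": key value, one record per line
printf 'a 1\nb 2\na 1\nc 3\n' > /tmp/data.txt

# Project column 2, then group and count with the classic Unix toolkit
awk '{print $2}' /tmp/data.txt | sort | uniq -c | sort -rn
```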

This is a very helpful list,
- Chris

