Re: Very poor performance loading 100M of sql data using copy

From: John Rouillard <rouilj(at)renesys(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very poor performance loading 100M of sql data using copy
Date: 2008-04-29 15:16:22
Message-ID: 20080429151622.GP6622@renesys.com
Lists: pgsql-performance

On Mon, Apr 28, 2008 at 02:16:02PM -0400, Greg Smith wrote:
> On Mon, 28 Apr 2008, John Rouillard wrote:
>
> > 2008-04-21 11:36:43 UTC @(2761)i: LOG: checkpoints ... (27 seconds
> > apart)
> > so I changed:
> > checkpoint_segments = 30
> > checkpoint_warning = 150
>
> That's good, but you might go higher than 30 for a bulk loading operation
> like this, particularly on 8.1 where checkpoints are no fun. Using 100 is
> not unreasonable.

Ok, I can do that. I chose 30 so that the WAL segments would cover the
5 minute

checkpoint_timeout = 300

without wrapping around before the timed checkpoint that usually
occurs. Maybe I should increase both the timeout and the number of
segments?
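
Something like this is what I have in mind for the bulk load (the
exact numbers are guesses for this workload, not something I've
tested yet):

checkpoint_segments = 100   # Greg's suggestion for bulk loading
checkpoint_timeout = 900    # 15 min, so the timed checkpoint doesn't fire mid-load
checkpoint_warning = 300    # still warn if checkpoints bunch up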

> >shared_buffers = 3000
> >I don't see any indication in the docs that increasing shared memory
> >would help speed up a copy operation.
>
> The index blocks use buffer space, and what ends up happening if there's
> not enough memory is they are written out more than they need to be (and
> with your I/O hardware you need to avoid writes unless absolutely
> necessary).

I forgot to mention that the array is on a 3ware 9550SX-4LP RAID card
set up as RAID 1+0. The write cache is on and autoverify is turned
off.
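
For reference, checking/setting that on the 3ware card is something
like the following (from memory, so the tw_cli syntax may be slightly
off):

tw_cli /c0/u0 show               # unit status, cache/autoverify state
tw_cli /c0/u0 set cache=on
tw_cli /c0/u0 set autoverify=off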

> Theoretically the OS is caching around that situation but
> better to avoid it.

The system is using 6-8GB of memory for cache.

> You didn't say how much RAM you have,

16GB total, but 8GB or so is taken up with other processes.

> but you should
> start by a factor of 10 increase to 30,000 and see if that helps; if so,
> try making it large enough to use 1/4 of total server memory. 3000 is
> only giving the server 24MB of RAM to work with, and it's unfair to expect
> it to work well in that situation.

So swap the memory usage from the OS cache to the postgres processes.
Using 1/4 of RAM as a guideline, that works out to about 500,000
buffers (approx 4GB). I'll try 300000 to start (roughly 1/8 of
memory) and see what it does to the other processes on the box.
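
Spelling out the arithmetic (shared_buffers is in 8kB pages in 8.1):

16GB x 1/4 = 4GB
4GB / 8kB  = 524288 buffers, so call it 500,000
300000 buffers x 8kB = ~2.4GB to start

i.e. in postgresql.conf:

shared_buffers = 300000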

> While not relevant to this exercise you'll need to set
> effective_cache_size to a useful value one day as well.

This is a very lightly loaded database, a few queries/hour usually
scattered across the data set, so hopefully that won't be much of an
issue.
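
If I do end up setting it, my guess based on the ~6GB the OS keeps in
cache would be something like (effective_cache_size is also in 8kB
pages in 8.1, so this needs revisiting once the load settles):

effective_cache_size = 786432   # ~6GB / 8kB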

--
-- rouilj

John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell)
603-643-9300 x 111
